Anomaly Detection for troubleshooting on Cork Stopper sorting Machines

Bibliographic details
Main author: Inês Santos Pereira
Publication date: 2021
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: https://hdl.handle.net/10216/133564
Abstract: Under ever-growing competitive pressure, supplying high-quality products remains an essential factor in securing a company's long-term success. With the emergence of Industry 4.0 and data-driven approaches, quality control processes have taken a step further, enabling companies to predict product quality by continuously monitoring the manufacturing process. This work investigates whether incorrect sensor measurements produced by the NDtech quality control screening system can be inferred, thereby increasing the system's reliability. Incorrect detection of the 2,4,6-Trichloroanisole (TCA) contaminant, and the subsequent classification of the cork stoppers, leads to false positives and false negatives. The false negatives are particularly harmful to the company's business reputation. The challenges are to characterise behaviours of interest in the monitoring sensor data, optimally and automatically, and to use them to guarantee the continuous and accurate classification of high-end natural cork stoppers. This dissertation analysed and developed classification models to detect anomalies in sensor data. The main contributions were the analysis of the features that best represent the anomaly detection problem, the development and study of several machine learning strategies that will serve as the basis for future work, and the exploration of techniques for dealing with imbalanced data sets. Two approaches were compared: in the first, the models were trained on the imbalanced data; in the second, an oversampling technique, the Synthetic Minority Oversampling Technique (SMOTE), was used to augment the data. The models studied in both approaches were Decision Trees (DT), k-Nearest Neighbours (k-NN), Random Forests (RF), Logistic Regression (LR) and Support Vector Machines (SVM). For each model, two feature selection methods, PCA and ETC, were used. Two additional feature sets were chosen without resorting to feature selection techniques: all the extracted features, and only two features (the mean TCA values of the target and virtual modules). The results showed that SMOTE was an effective technique for overcoming the imbalanced data set problem, improving the performance of all models across all feature selection techniques by 16.5%. The DT models, in particular, showed the best results in this anomaly detection problem, with F1-scores of 73.10% and 91.09% using the imbalanced data and the SMOTE technique, respectively. Keywords: Anomaly Detection; Electrochemical Sensors; Predictive Quality; Sampling techniques; Machine Learning; 2,4,6-Trichloroanisole.
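The oversampling idea and the F1-score named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the dissertation's implementation: the function names `smote` and `f1_score`, the parameter `k`, and the toy two-feature points are all hypothetical, chosen only to show how SMOTE interpolates between a minority-class sample and one of its nearest minority-class neighbours, and how F1 is computed from true positives, false positives and false negatives.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points (illustrative sketch only).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Hypothetical anomalous samples described by two made-up feature values.
minority = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
synthetic = smote(minority, n_new=6, k=3)

# Hypothetical labels: 1 = anomalous measurement, 0 = normal measurement.
score = f1_score([1, 1, 0, 0], [1, 0, 0, 1])
```

In practice the synthetic points would be added to the training split only, so that the reported F1-score still reflects performance on real, imbalanced test data.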
id RCAP_fb65799335bad7c627487add5f5ef7b1
oai_identifier_str oai:repositorio-aberto.up.pt:10216/133564
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Anomaly Detection for troubleshooting on Cork Stopper sorting Machines
title Anomaly Detection for troubleshooting on Cork Stopper sorting Machines
spellingShingle Anomaly Detection for troubleshooting on Cork Stopper sorting Machines
Inês Santos Pereira
Outras ciências da engenharia e tecnologias
Other engineering and technologies
author Inês Santos Pereira
author_facet Inês Santos Pereira
author_role author
dc.contributor.author.fl_str_mv Inês Santos Pereira
dc.subject.por.fl_str_mv Outras ciências da engenharia e tecnologias
Other engineering and technologies
topic Outras ciências da engenharia e tecnologias
Other engineering and technologies
publishDate 2021
dc.date.none.fl_str_mv 2021-04-20
2021-04-20T00:00:00Z
2024-04-19T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/10216/133564
TID:202820700
url https://hdl.handle.net/10216/133564
identifier_str_mv TID:202820700
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/embargoedAccess
eu_rights_str_mv embargoedAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799135914296868864