Anomaly Detection for troubleshooting on Cork Stopper sorting Machines
Main author: | Inês Santos Pereira |
---|---|
Publication date: | 2021 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | https://hdl.handle.net/10216/133564 |
Abstract: | Under ever-growing competitive pressure, supplying high-quality products remains essential to securing a company's long-term success. With the emergence of Industry 4.0 and data-driven approaches, quality control processes have taken a step further, enabling companies to predict product quality by continuously monitoring the manufacturing process. This work investigates the possibility of inferring incorrect sensor measurements produced by the NDtech quality control screening system, thereby increasing the system's reliability. Incorrect detection of the 2,4,6-Trichloroanisole (TCA) contaminant, and the subsequent classification of the cork stoppers, leads to false positives and false negatives. The false negatives are particularly harmful to the company's business reputation. The challenge is to characterise behaviours of interest in the monitoring sensor data optimally and automatically, and to use them to guarantee the continuous and accurate classification of high-end natural cork stoppers. This dissertation analysed and developed classification models to detect anomalies in sensor data. The main contributions were the analysis of the features that best represent the anomaly detection problem, the development and study of several machine learning strategies that will be the basis for future work, and the exploration of techniques for dealing with imbalanced data sets. Two approaches were compared: in the first, the models were trained on the imbalanced data; in the second, an oversampling technique, the Synthetic Minority Oversampling Technique (SMOTE), was used to augment the data. The models studied in both approaches were Decision Trees (DT), k-Nearest Neighbours (k-NN), Random Forests (RF), Logistic Regression (LR) and Support Vector Machine (SVM). For each model, two feature selection methods, PCA and ETC, were used. Two further feature sets were built without any feature selection technique: all the extracted features, and only two features (the mean TCA values of the target and virtual modules). The results showed that SMOTE is an effective technique for overcoming the imbalanced data set problem, improving the performance of all models, for all feature selection techniques, by 16.5%. The DT models, in particular, performed well on this anomaly detection problem, with F1-scores of 73.10% and 91.09% using the imbalanced data and the SMOTE technique, respectively. Keywords: Anomaly Detection; Electrochemical Sensors; Predictive Quality; Sampling techniques; Machine Learning; 2,4,6-Trichloroanisole. |
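Illustrative note: the abstract relies on two standard ideas, SMOTE oversampling and the F1-score. The sketch below shows both in plain Python and is not the dissertation's code. The toy data, the function names `smote` and `f1_score`, and the parameters `k` and `seed` are assumptions made for illustration only; in practice a library implementation such as the `SMOTE` class in imbalanced-learn would normally be used.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Core SMOTE idea: build n_new synthetic points, each placed on the
    segment between a minority sample and one of its k nearest minority
    neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, NOT from the dissertation:
# (mean TCA of the target module, mean TCA of the virtual module)
normal = [(0.1 * i, 0.1 * i + 0.05) for i in range(20)]
anomalous = [(2.0, 0.2), (2.2, 0.1), (2.1, 0.4), (2.4, 0.3)]

# Oversample the minority (anomalous) class until both classes have equal size
new_points = smote(anomalous, n_new=len(normal) - len(anomalous))
balanced_anomalous = anomalous + new_points
```

Because every synthetic point is an interpolation between two real minority samples, the oversampled class stays inside the region the original anomalies occupy. F1 is a natural metric here because it ignores true negatives, which dominate an imbalanced data set.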
id |
RCAP_fb65799335bad7c627487add5f5ef7b1 |
---|---|
oai_identifier_str |
oai:repositorio-aberto.up.pt:10216/133564 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Anomaly Detection for troubleshooting on Cork Stopper sorting Machines |
author |
Inês Santos Pereira |
author_role |
author |
dc.subject.por.fl_str_mv |
Outras ciências da engenharia e tecnologias Other engineering and technologies |
topic |
Outras ciências da engenharia e tecnologias Other engineering and technologies |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-04-20 2021-04-20T00:00:00Z 2024-04-19T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10216/133564 TID:202820700 |
url |
https://hdl.handle.net/10216/133564 |
identifier_str_mv |
TID:202820700 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/embargoedAccess |
eu_rights_str_mv |
embargoedAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799135914296868864 |