Data mining for ranking sorghum seed lots

Bibliographic details
Main author: Rocha, Luciana Dias
Publication date: 2023
Other authors: Gadotti, Gizele Ingrid; Bernardy, Ruan; Pinheiro, Romario de Mesquita; Monteiro, Rita de Cassia Mota
Document type: Article
Language: eng
Source title: Revista Caatinga
Full text: https://periodicos.ufersa.edu.br/caatinga/article/view/11310
Abstract: The ranking of seed lots is a fundamental process for all companies in the seed industry. This work aims to demonstrate data mining methods for ranking sorghum seed lots during seed processing through the analysis of quality control data. Germination and cold tests were performed to verify the physiological quality of the lots. Seed samples from each lot were evaluated at two stages: post-cleaning and finished product (ready for marketing). After pre-processing, the data set totaled 188 rows with six attributes, encompassing 150 lots accepted for marketing, 6 rejected, and 32 intermediate lots. The classifiers used were J48, Random Forest, Classification Via Regression, Naive Bayes, Multilayer Perceptron, and IBk. The Resample filter was used to adjust the data. The k-fold cross-validation technique was used for training, with ten folds. Accuracy, Precision, Recall, F-measure, and ROC Area were used to assess the performance of the algorithms, and the results were used to determine the best machine-learning algorithm. IBk and J48 achieved the highest accuracy, with IBk presenting the best results. The Resample filter was essential for solving the data imbalance problem. Sorghum seed lots can be classified with great accuracy and precision through artificial intelligence and machine learning techniques.
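The workflow summarized in the abstract (rebalance the classes, run ten-fold cross-validation, then report accuracy, precision, recall, and F-measure) can be sketched as follows. This is only an illustrative sketch, not the authors' Weka setup: the class sizes (150, 32, 6) come from the abstract, but the two feature values per lot, the function names, and the simple uniform resampling are assumptions made for the example. A 1-nearest-neighbour rule stands in for IBk, which is Weka's k-nearest-neighbour classifier, and the folds are plain rather than stratified.

```python
import random
from collections import Counter

def make_synthetic_lots(seed=0):
    """Hypothetical lots: (germination %, cold-test %) plus a class label."""
    rng = random.Random(seed)
    # Class sizes are taken from the abstract; means and spreads are invented.
    spec = {"accepted": (150, 92.0, 88.0),
            "intermediate": (32, 82.0, 75.0),
            "rejected": (6, 68.0, 58.0)}
    rows = []
    for label, (n, germ_mu, cold_mu) in spec.items():
        for _ in range(n):
            rows.append(((rng.gauss(germ_mu, 3.0), rng.gauss(cold_mu, 4.0)), label))
    return rows

def resample_uniform(rows, seed=0):
    """Sample with replacement so every class ends up with the same row count."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in rows:
        by_class.setdefault(y, []).append((x, y))
    per_class = len(rows) // len(by_class)
    out = []
    for y in sorted(by_class):
        out.extend(rng.choices(by_class[y], k=per_class))
    rng.shuffle(out)
    return out

def predict_1nn(train, x):
    """Label of the training row closest to x (squared Euclidean distance)."""
    nearest = min(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))
    return nearest[1]

def cross_validate(rows, k=10):
    """Plain k-fold cross-validation; returns (actual, predicted) pairs."""
    pairs = []
    for fold in range(k):
        test = rows[fold::k]
        train = [r for i, r in enumerate(rows) if i % k != fold]
        pairs.extend((y, predict_1nn(train, x)) for x, y in test)
    return pairs

def report(pairs):
    """Overall accuracy and per-class (precision, recall, F-measure)."""
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    metrics = {}
    for c in sorted({a for a, _ in pairs}):
        tp = sum(a == c and p == c for a, p in pairs)
        fp = sum(a != c and p == c for a, p in pairs)
        fn = sum(a == c and p != c for a, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[c] = (precision, recall, f1)
    return accuracy, metrics

lots = make_synthetic_lots()
balanced = resample_uniform(lots)
pairs = cross_validate(balanced, k=10)
accuracy, per_class = report(pairs)
print("before:", dict(Counter(y for _, y in lots)))
print("after: ", dict(Counter(y for _, y in balanced)))
print(f"accuracy={accuracy:.3f}")
for label, (p, r, f) in per_class.items():
    print(f"{label}: precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

Because this sketch resamples before splitting, copies of the same row can land in both the training and test folds, which may inflate the scores. The abstract does not say at which point the authors applied the filter.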
id UFERSA-1_865e22f0726e554a886eb044470f8e0d
oai_identifier_str oai:ojs.periodicos.ufersa.edu.br:article/11310
network_acronym_str UFERSA-1
network_name_str Revista Caatinga
repository_id_str
dc.title.none.fl_str_mv Data mining for ranking sorghum seed lots
Mineração de dados no ranqueamento de lotes de sementes de sorgo
dc.contributor.author.fl_str_mv Rocha, Luciana Dias
Gadotti, Gizele Ingrid
Bernardy, Ruan
Pinheiro, Romario de Mesquita
Monteiro, Rita de Cassia Mota
dc.subject.por.fl_str_mv Quality
Post-harvest technology
Greenhouse gases. Soil management. Modeling. Artificial intelligence.
Artificial intelligence. Image processing. Seeds.
publishDate 2023
dc.date.none.fl_str_mv 2023-02-28
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://periodicos.ufersa.edu.br/caatinga/article/view/11310
10.1590/1983-21252023v36n224rc
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv https://periodicos.ufersa.edu.br/caatinga/article/view/11310/11156
dc.rights.driver.fl_str_mv Copyright (c) 2023 Revista Caatinga
info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal Rural do Semi-Árido
dc.source.none.fl_str_mv REVISTA CAATINGA; Vol. 36 No. 2 (2023); 471-478
Revista Caatinga; v. 36 n. 2 (2023); 471-478
1983-2125
0100-316X
reponame:Revista Caatinga
instname:Universidade Federal Rural do Semi-Árido (UFERSA)
instacron:UFERSA
repository.name.fl_str_mv Revista Caatinga - Universidade Federal Rural do Semi-Árido (UFERSA)
repository.mail.fl_str_mv patricio@ufersa.edu.br|| caatinga@ufersa.edu.br