Advanced Genetic Programming vs. State-of-the-Art AutoML in Imbalanced Binary Classification

Bibliographic details
Main author: Frank, Franz
Publication date: 2023
Other authors: Bacao, Fernando
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10362/155832
Summary: Frank, F., & Bacao, F. (2023). Advanced Genetic Programming vs. State-of-the-Art AutoML in Imbalanced Binary Classification. Emerging Science Journal, 7(4), 1349-1363. https://doi.org/10.28991/ESJ-2023-07-04-021
Data Availability Statement: Publicly available datasets were analyzed in this study. The data can be found here: https://github.com/joaopfonseca/mlresearch
Funding: This work was supported by a grant from the Portuguese Foundation for Science and Technology ("Fundação para a Ciência e a Tecnologia"), DSAIPA/DS/0116/2019, and project UIDB/04152/2020, Centro de Investigação em Gestão de Informação (MagIC).
id RCAP_380ec22ab9fc1fca48cb1c1ad165108d
oai_identifier_str oai:run.unl.pt:10362/155832
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Advanced Genetic Programming vs. State-of-the-Art AutoML in Imbalanced Binary Classification
Keywords: Genetic Programming; Automated Machine Learning; AutoML; Imbalanced Binary Classification; General
Abstract: The objective of this article is to provide a comparative analysis of two novel genetic programming (GP) techniques, differentiable Cartesian genetic programming for artificial neural networks (DCGPANN) and geometric semantic genetic programming (GSGP), with state-of-the-art automated machine learning (AutoML) tools, namely Auto-Keras, Auto-PyTorch and Auto-Sklearn. While each of these techniques was compared to several baseline algorithms upon its introduction, research still lacks direct comparisons between them, especially between the GP approaches and state-of-the-art AutoML. This study intends to fill this gap in order to analyze the true potential of GP for AutoML. The performance of the different tools is assessed by applying them to 20 benchmark datasets from the field of imbalanced binary classification, a frequent and challenging problem area. The tools are compared across four categories: average performance, maximum performance, standard deviation of performance, and generalization ability, with F1-score, G-mean, and AUC used as evaluation metrics. The analysis finds that the GP techniques, while unable to completely outperform state-of-the-art AutoML, are already a very competitive alternative. These advanced GP tools therefore offer a new and promising approach for practitioners developing machine learning (ML) models.
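Note on the evaluation metrics: the abstract names F1-score, G-mean, and AUC but this record does not define them. The following is a minimal sketch of their standard textbook definitions for binary labels (1 = minority/positive class), not the authors' evaluation code. All function names and the toy data are hypothetical and chosen only for illustration.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN): harmonic mean of precision and recall."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def g_mean(y_true, y_pred):
    """G-mean = sqrt(sensitivity * specificity)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sqrt(sensitivity * specificity)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a random positive is scored above a
    random negative, with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy imbalanced example (hypothetical data): 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.2, 0.6, 0.1, 0.3, 0.7, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f1_score(y_true, y_pred), g_mean(y_true, y_pred), auc(y_true, scores))
```

F1-score and G-mean depend on thresholded predictions, while AUC is computed from the raw scores. The pairwise AUC above is quadratic in the number of samples, which is fine for a sketch but not for large datasets.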
dc.title.none.fl_str_mv Advanced Genetic Programming vs. State-of-the-Art AutoML in Imbalanced Binary Classification
title Advanced Genetic Programming vs. State-of-the-Art AutoML in Imbalanced Binary Classification
author Frank, Franz
author_facet Frank, Franz
Bacao, Fernando
author_role author
author2 Bacao, Fernando
author2_role author
dc.contributor.none.fl_str_mv NOVA Information Management School (NOVA IMS)
Information Management Research Center (MagIC) - NOVA Information Management School
RUN
dc.contributor.author.fl_str_mv Frank, Franz
Bacao, Fernando
dc.subject.por.fl_str_mv Genetic Programming
Automated Machine Learning
AutoML
Imbalanced Binary Classification
General
publishDate 2023
dc.date.none.fl_str_mv 2023-07-25T22:15:26Z
2023-08-01
2023-08-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/155832
url http://hdl.handle.net/10362/155832
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 2610-9182
PURE: 67340501
https://doi.org/10.28991/ESJ-2023-07-04-021
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 15
application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799138148098244608