A proposal for generating gsgp-hard datasets : an introductory research towards gsgp hardness

Bibliographic details
Main author: Verschoor, Cornelis Marnik
Publication date: 2018
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10362/57053
Summary: Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
id RCAP_173ee6e925a0acebb12a2cfaa38c2778
oai_identifier_str oai:run.unl.pt:10362/57053
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
Keywords: Geometric Programming; Geometric Semantic Genetic Programming; Particle Swarm Optimization; Multi Objective Optimization; Multi Objective Particle Swarm Optimization; GSGP hardness; GP hardness
Abstract: Since its introduction, GSGP has been very successful in optimizing symbolic regression problems. GSGP has shown great convergence ability, as well as generalization ability. Almost without exception, GSGP outperforms GP on real-life applications. These results give the impression that almost all symbolic regression problems are easy for GSGP to solve. However, little to no research has been done on GSGP hardness, while GP hardness has already been researched quite extensively. The lack of real-life applications on which GSGP has more difficulty converging than GP creates a paradox: if there are no, or very few, problems that are GSGP-hard, no research can be done on GSGP hardness. In this paper we propose an algorithm that generates and evolves datasets on which GP outperforms GSGP, under the condition that the GP model remains as accurate as possible. The algorithm can also be altered so that it produces datasets on which GSGP outperforms GP. This allows GP-hard datasets to be compared with GSGP-hard datasets. The algorithm has been shown to produce favorable datasets for both GP and GSGP using multiple settings for the number of instances and the number of variables. Therefore, the algorithm proposed in this paper breaks the aforementioned paradox by producing GSGP-hard datasets, thus allowing GSGP hardness to be researched effectively for the first time. Furthermore, tuning the algorithm led to some early observations about the relation between dataset composition and GP/GSGP performance.
GSGP has difficulty converging when using only 1 dependent variable and 1 independent variable, while it is easy to produce datasets on which GP heavily outperforms GSGP with the same settings. GSGP performs better when more independent variables are added. Furthermore, a large dataset range has been shown to be beneficial for GP convergence, while a small range is beneficial for GSGP convergence.
Dissertation presented as partial requirement for obtaining the Master's degree in Advanced Analytics
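The abstract describes evolving datasets on which GP outperforms GSGP while the GP model stays as accurate as possible, and the keywords mention multi-objective optimization. The record does not give the thesis's actual objective functions, so the following is only a minimal sketch of how such a two-objective selection could be expressed. The names `dataset_objectives`, `dominates` and `pareto_front`, and the choice of the two objectives, are assumptions made for illustration and are not taken from the thesis.

```python
from typing import List, Tuple

# Both objectives are minimised (assumed formulation, not from the thesis).
Objectives = Tuple[float, float]


def dataset_objectives(gp_error: float, gsgp_error: float) -> Objectives:
    """Score one candidate dataset from the errors the two learners reach on it.

    Objective 1: the GP error itself, so the GP model stays accurate.
    Objective 2: GP error minus GSGP error, which is negative when GP wins.
    """
    return (gp_error, gp_error - gsgp_error)


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def pareto_front(scores: List[Objectives]) -> List[int]:
    """Indices of the candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (gp_error, gsgp_error) pairs for four candidate datasets.
candidates = [(0.10, 0.90), (0.20, 0.95), (0.05, 0.30), (0.50, 0.40)]
scores = [dataset_objectives(gp, gsgp) for gp, gsgp in candidates]
front = pareto_front(scores)
```

Swapping the two arguments of `dataset_objectives` would favor datasets on which GSGP outperforms GP, mirroring the alteration mentioned in the abstract.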
author Verschoor, Cornelis Marnik
author_facet Verschoor, Cornelis Marnik
author_role author
dc.contributor.none.fl_str_mv Vanneschi, Leonardo
RUN
dc.contributor.author.fl_str_mv Verschoor, Cornelis Marnik
dc.subject.por.fl_str_mv Geometric Programming
Geometric Semantic Genetic Programming
Particle Swarm Optimization
Multi Objective Optimization
Multi Objective Particle Swarm Optimization
GSGP hardness
GP hardness
publishDate 2018
dc.date.none.fl_str_mv 2018-11-29
2018-11-29T00:00:00Z
2019-01-10T19:45:29Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/57053
TID:202137333
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799137951893946368