A proposal for generating gsgp-hard datasets : an introductory research towards gsgp hardness
Main author: | Verschoor, Cornelis Marnik |
---|---|
Publication date: | 2018 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10362/57053 |
Abstract: | Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics |
id |
RCAP_173ee6e925a0acebb12a2cfaa38c2778 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/57053 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
A proposal for generating gsgp-hard datasets : an introductory research towards gsgp hardness
Geometric Programming; Geometric Semantic Genetic Programming; Particle Swarm Optimization; Multi Objective Optimization; Multi Objective Particle Swarm Optimization; GSGP hardness; GP hardness
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
Since its introduction, GSGP has been very successful in optimizing symbolic regression problems. GSGP has shown great convergence ability as well as generalization ability. Almost without exception, GSGP outperforms GP on real-life applications. These results give the impression that almost all symbolic regression problems are easy for GSGP to solve. However, little to no research has been done on GSGP hardness, while GP hardness has already been researched quite extensively. The lack of real-life applications on which GSGP has more difficulty converging than GP creates a paradox: if there are no, or very few, problems that are GSGP-hard, no research can be done on GSGP hardness.
In this paper we propose an algorithm that generates and evolves datasets on which GP outperforms GSGP, under the condition that the GP model remains as accurate as possible. The algorithm can also be altered so that it produces datasets on which GSGP outperforms GP. This allows GP-hard datasets to be compared with GSGP-hard datasets. The algorithm has been shown to be able to produce favorable datasets for both GP and GSGP using multiple settings for the number of instances and the number of variables. Therefore, the algorithm proposed in this paper breaks the paradox mentioned above by producing GSGP-hard datasets, thus allowing GSGP hardness to be effectively researched for the first time.
Furthermore, tuning the algorithm led to some early observations about the relation between dataset composition and GP/GSGP performance. GSGP has difficulty converging when using only 1 dependent variable and 1 independent variable, while it is easy to produce datasets on which GP heavily outperforms GSGP with the same settings. GSGP performs better when more independent variables are added. Furthermore, a large range of the dataset has been shown to be beneficial for GP convergence, while a small range is beneficial for GSGP convergence.
Dissertation presented as partial requirement for obtaining the Master’s degree in Advanced Analytics
Vanneschi, Leonardo
RUN
Verschoor, Cornelis Marnik
2019-01-10T19:45:29Z
2018-11-29
2018-11-29T00:00:00Z
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/masterThesis
application/pdf
http://hdl.handle.net/10362/57053
TID:202137333
eng
info:eu-repo/semantics/openAccess
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
2024-03-11T04:27:29Z
oai:run.unl.pt:10362/57053
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
opendoar:7160
2024-03-20T03:33:01.155090
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
false |
dc.title.none.fl_str_mv |
A proposal for generating gsgp-hard datasets : an introductory research towards gsgp hardness |
title |
A proposal for generating gsgp-hard datasets : an introductory research towards gsgp hardness |
spellingShingle |
A proposal for generating gsgp-hard datasets : an introductory research towards gsgp hardness Verschoor, Cornelis Marnik Geometric Programming Geometric Semantic Genetic Programming Particle Swarm Optimization Multi Objective Optimization Multi Objective Particle Swarm Optimization GSGP hardness GP hardness |
title_short |
A proposal for generating gsgp-hard datasets : an introductory research towards gsgp hardness |
title_full |
A proposal for generating gsgp-hard datasets : an introductory research towards gsgp hardness |
title_fullStr |
A proposal for generating gsgp-hard datasets : an introductory research towards gsgp hardness |
title_full_unstemmed |
A proposal for generating gsgp-hard datasets : an introductory research towards gsgp hardness |
title_sort |
A proposal for generating gsgp-hard datasets : an introductory research towards gsgp hardness |
author |
Verschoor, Cornelis Marnik |
author_facet |
Verschoor, Cornelis Marnik |
author_role |
author |
dc.contributor.none.fl_str_mv |
Vanneschi, Leonardo RUN |
dc.contributor.author.fl_str_mv |
Verschoor, Cornelis Marnik |
dc.subject.por.fl_str_mv |
Geometric Programming; Geometric Semantic Genetic Programming; Particle Swarm Optimization; Multi Objective Optimization; Multi Objective Particle Swarm Optimization; GSGP hardness; GP hardness |
topic |
Geometric Programming; Geometric Semantic Genetic Programming; Particle Swarm Optimization; Multi Objective Optimization; Multi Objective Particle Swarm Optimization; GSGP hardness; GP hardness |
description |
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics |
publishDate |
2018 |
dc.date.none.fl_str_mv |
2018-11-29 2018-11-29T00:00:00Z 2019-01-10T19:45:29Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/57053 TID:202137333 |
url |
http://hdl.handle.net/10362/57053 |
identifier_str_mv |
TID:202137333 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799137951893946368 |