Development of estimation of distribution algorithms for linear genetic programming
Main author: | Sotto, Leo Francoso Dal Piccol [UNIFESP] |
---|---|
Publication date: | 2020 |
Document type: | Thesis |
Language: | eng |
Source title: | Repositório Institucional da UNIFESP |
Full text: | https://sucupira.capes.gov.br/sucupira/public/consultas/coleta/trabalhoConclusao/viewTrabalhoConclusao.jsf?popup=true&id_trabalho=10889466 https://hdl.handle.net/11600/64929 |
Abstract: | Linear Genetic Programming (LGP) is a Genetic Programming (GP) variant that has been successfully applied in various domains, such as regression, classification, and navigation. Unlike traditional GP, which represents programs as trees, LGP uses lists of instructions, which causes the data flow to be represented as a Directed Acyclic Graph (DAG) and introduces features such as non-effective code and code reuse. As in other Evolutionary Algorithms (EAs), LGP's stochastic search process neither has the knowledge to produce good-quality solutions nor is able to avoid poor-quality programs, which reduces its efficacy. Furthermore, the recombination operators of EAs often ignore the correlation between different positions in the genotype. To deal with these issues in EAs, researchers proposed Estimation of Distribution Algorithms (EDAs), which use a probability model to sample promising solutions instead of applying recombination operators. The first goal of this PhD thesis is to propose EDAs for LGP that can make use of the LGP representation features and model dependencies between variables. Two approaches are explored: 1) adapting a Stochastic Context-Free Grammar (SCFG) to sample sequences of instructions instead of derivation trees and integrating it into LGP via hybrid versions that combine the grammar with traditional LGP genetic operators; 2) creating an intermediate integer vector that represents a sequence of instructions and using it to build a Bayesian Network. The resulting techniques are validated on regression and classification problems, and can outperform LGP when the hybrid version is considered. The thesis also addresses challenges in designing EDAs for the LGP representation. 
Given that the LGP representation features play an important role in how new methods should be designed, research was also conducted on the role of non-effective instructions in LGP and on the impact of the DAG representation compared to the standard tree representation, in order to better understand how the technique works and thus improve the design of new methods based on it. The conclusions are that non-effective instructions are an important component of LGP programs, although their benefits depend on how the algorithm is used, and that DAGs present a great advantage over trees for solving certain classes of problems, especially the design of digital circuits. |
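
The abstract's notions of non-effective code and code reuse can be illustrated with a minimal sketch. It is not taken from the thesis: the instruction format `(destination, operator, source 1, source 2)`, the function name `effective_instructions`, and the sample program are all assumptions made for illustration, and the sketch assumes a program without branching instructions. It walks the instruction list backwards and keeps only the instructions whose destination register is still needed to compute the output.

```python
from typing import List, Set, Tuple

# Hypothetical instruction format (an assumption, not the thesis's encoding):
# (destination register, operator name, source register 1, source register 2)
Instruction = Tuple[int, str, int, int]


def effective_instructions(program: List[Instruction],
                           output_registers: Set[int]) -> List[int]:
    """Return the indices of instructions that influence the output registers.

    Assumes a straight-line program (no conditional branches).
    """
    needed = set(output_registers)  # registers whose value is still required
    effective = []
    # Scan from the last instruction to the first.
    for index in range(len(program) - 1, -1, -1):
        dest, _op, src1, src2 = program[index]
        if dest in needed:
            effective.append(index)
            # The destination is overwritten here, so earlier writes to it
            # no longer matter unless it is also read as a source.
            needed.discard(dest)
            needed.update((src1, src2))
    effective.reverse()
    return effective


program = [
    (0, "add", 1, 2),  # r0 = r1 + r2
    (3, "mul", 0, 0),  # r3 = r0 * r0  (r0 is reused as both operands)
    (4, "sub", 1, 1),  # r4 = r1 - r1  (r4 is never read: non-effective)
    (0, "add", 3, 1),  # r0 = r3 + r1  (r0 is the output register)
]

print(effective_instructions(program, {0}))  # indices of effective instructions
```

With register 0 as the output, instruction 2 is reported as non-effective because nothing reads `r4` afterwards, while instruction 1 reads `r0` twice, which is the kind of value reuse that makes the data flow a DAG rather than a tree.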
id |
UFSP_475f13b058f23619864b462f89812cec |
---|---|
oai_identifier_str |
oai:repositorio.unifesp.br/:11600/64929 |
network_acronym_str |
UFSP |
network_name_str |
Repositório Institucional da UNIFESP |
repository_id_str |
3465 |
dc.title.none.fl_str_mv |
Development of estimation of distribution algorithms for linear genetic programming |
title |
Development of estimation of distribution algorithms for linear genetic programming |
author |
Sotto, Leo Francoso Dal Piccol [UNIFESP] |
author_facet |
Sotto, Leo Francoso Dal Piccol [UNIFESP] |
author_role |
author |
dc.contributor.none.fl_str_mv |
Basgalupp, Marcio Porto [UNIFESP]; Universidade Federal de São Paulo |
dc.contributor.author.fl_str_mv |
Sotto, Leo Francoso Dal Piccol [UNIFESP] |
dc.subject.por.fl_str_mv |
Linear Genetic Programming; Estimation of Distribution Algorithms; Regression; Classification; Digital Circuits |
topic |
Linear Genetic Programming; Estimation of Distribution Algorithms; Regression; Classification; Digital Circuits |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-08-18 2022-07-25T14:20:52Z 2022-07-25T14:20:52Z |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://sucupira.capes.gov.br/sucupira/public/consultas/coleta/trabalhoConclusao/viewTrabalhoConclusao.jsf?popup=true&id_trabalho=10889466 LEO FRANCOSO DAL PICCOL SOTTO.pdf https://hdl.handle.net/11600/64929 |
url |
https://sucupira.capes.gov.br/sucupira/public/consultas/coleta/trabalhoConclusao/viewTrabalhoConclusao.jsf?popup=true&id_trabalho=10889466 https://hdl.handle.net/11600/64929 |
identifier_str_mv |
LEO FRANCOSO DAL PICCOL SOTTO.pdf |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
178 p. application/pdf |
dc.publisher.none.fl_str_mv |
Universidade Federal de São Paulo (UNIFESP) |
publisher.none.fl_str_mv |
Universidade Federal de São Paulo (UNIFESP) |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UNIFESP instname:Universidade Federal de São Paulo (UNIFESP) instacron:UNIFESP |
instname_str |
Universidade Federal de São Paulo (UNIFESP) |
instacron_str |
UNIFESP |
institution |
UNIFESP |
reponame_str |
Repositório Institucional da UNIFESP |
collection |
Repositório Institucional da UNIFESP |
repository.name.fl_str_mv |
Repositório Institucional da UNIFESP - Universidade Federal de São Paulo (UNIFESP) |
repository.mail.fl_str_mv |
biblioteca.csp@unifesp.br |
_version_ |
1814268339455787008 |