Development of estimation of distribution algorithms for linear genetic programming

Sotto, Leo Francoso Dal Piccol [UNIFESP]


Bibliographic details
Main author:	Sotto, Leo Francoso Dal Piccol [UNIFESP]
Publication date:	2020
Document type:	Thesis
Language:	eng
Source title:	Repositório Institucional da UNIFESP
Full text:	https://sucupira.capes.gov.br/sucupira/public/consultas/coleta/trabalhoConclusao/viewTrabalhoConclusao.jsf?popup=true&id_trabalho=10889466 https://hdl.handle.net/11600/64929
Abstract:	Linear Genetic Programming (LGP) is a Genetic Programming (GP) variant that has been successfully applied in various domains, such as regression, classification, and navigation. Unlike traditional GP, which represents programs as trees, LGP uses lists of instructions, which causes the data flow to be represented as a Directed Acyclic Graph (DAG) and introduces features such as non-effective code and code reuse. As in other Evolutionary Algorithms (EAs), LGP’s stochastic search process neither has the knowledge to produce good-quality solutions nor is able to avoid poor-quality programs, which reduces its efficacy. Furthermore, its recombination operators often ignore the correlation between different positions in the genotype. To deal with these issues in EAs, researchers proposed Estimation of Distribution Algorithms (EDAs), which use a probability model to sample promising solutions instead of applying recombination operators. The first goal of this PhD thesis is to propose EDAs for LGP that can make use of the LGP representation features and model dependencies between variables. Two ways of doing so are explored: 1) adapting a Stochastic Context-free Grammar (SCFG) to sample sequences of instructions instead of derivation trees and integrating it into LGP via hybrid versions that combine the use of the grammar with the application of traditional LGP genetic operators; 2) creating an intermediate integer vector that represents a sequence of instructions and using it to build a Bayesian Network. The resulting techniques are validated on regression and classification problems, and can outperform LGP when the hybrid versions are considered. The thesis also addresses challenges in designing EDAs for the LGP representation.
Given that the LGP representation features play an important role in how new methods should be designed, research was also conducted on the role of non-effective instructions in LGP and on the impact of the DAG representation compared to the standard tree representation, in order to better understand how the technique works and thus improve the design of new methods based on it. The conclusions are that non-effective instructions are an important component of LGP programs, although their benefits depend on how the algorithm is used, and that DAGs present a great advantage over trees for solving certain classes of problems, especially the design of digital circuits.
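
The notion of non-effective code mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the `Instr` type, the register numbering, and the choice of register 0 as the program output are assumptions made only for this example. The function walks the instruction list backwards and marks an instruction as effective when its destination register is still needed to compute the output.

```python
from typing import List, NamedTuple, Set


class Instr(NamedTuple):
    """Hypothetical register instruction: r[dest] = r[src1] op r[src2]."""
    dest: int
    op: str
    src1: int
    src2: int


def effective_mask(program: List[Instr], output_regs: Set[int]) -> List[bool]:
    """Return one flag per instruction: True if it influences an output register."""
    needed = set(output_regs)          # registers whose values still matter
    mask = [False] * len(program)
    # Scan from the last instruction to the first.
    for i in range(len(program) - 1, -1, -1):
        ins = program[i]
        if ins.dest in needed:
            mask[i] = True
            # The destination is overwritten here, so earlier writes to it
            # only matter if it is also read as a source operand.
            needed.discard(ins.dest)
            needed.update((ins.src1, ins.src2))
    return mask


# Example program (assumed convention: r0 holds the final output).
program = [
    Instr(2, "+", 0, 1),  # r2 = r0 + r1
    Instr(3, "*", 0, 0),  # r3 = r0 * r0  (r3 is never read afterwards)
    Instr(2, "*", 2, 2),  # r2 = r2 * r2  (reuses the earlier r2)
    Instr(0, "-", 2, 1),  # r0 = r2 - r1
]

mask = effective_mask(program, {0})
```

Under these assumptions the second instruction is flagged as non-effective, because nothing that reaches register 0 ever reads register 3. The effective instructions and the registers they read form the data-flow DAG the abstract refers to, and the double read of `r2` in the third instruction is a small case of code reuse.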

Item metadata

id	UFSP_475f13b058f23619864b462f89812cec
oai_identifier_str	oai:repositorio.unifesp.br/:11600/64929
network_acronym_str	UFSP
network_name_str	Repositório Institucional da UNIFESP
repository_id_str	3465
dc.title.none.fl_str_mv	Development of estimation of distribution algorithms for linear genetic programming
title	Development of estimation of distribution algorithms for linear genetic programming
title_short	Development of estimation of distribution algorithms for linear genetic programming
title_full	Development of estimation of distribution algorithms for linear genetic programming
title_fullStr	Development of estimation of distribution algorithms for linear genetic programming
title_full_unstemmed	Development of estimation of distribution algorithms for linear genetic programming
title_sort	Development of estimation of distribution algorithms for linear genetic programming
author	Sotto, Leo Francoso Dal Piccol [UNIFESP]
author_facet	Sotto, Leo Francoso Dal Piccol [UNIFESP]
author_role	author
dc.contributor.none.fl_str_mv	Basgalupp, Marcio Porto [UNIFESP] Universidade Federal de São Paulo
dc.contributor.author.fl_str_mv	Sotto, Leo Francoso Dal Piccol [UNIFESP]
dc.subject.por.fl_str_mv	Linear Genetic Programming; Estimation Of Distribution Algorithms; Regression; Classification; Digital Circuits
topic	Linear Genetic Programming; Estimation Of Distribution Algorithms; Regression; Classification; Digital Circuits
publishDate	2020
dc.date.none.fl_str_mv	2020-08-18 2022-07-25T14:20:52Z 2022-07-25T14:20:52Z
dc.type.driver.fl_str_mv	info:eu-repo/semantics/doctoralThesis
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
format	doctoralThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://sucupira.capes.gov.br/sucupira/public/consultas/coleta/trabalhoConclusao/viewTrabalhoConclusao.jsf?popup=true&id_trabalho=10889466 LEO FRANCOSO DAL PICCOL SOTTO.pdf https://hdl.handle.net/11600/64929
url	https://sucupira.capes.gov.br/sucupira/public/consultas/coleta/trabalhoConclusao/viewTrabalhoConclusao.jsf?popup=true&id_trabalho=10889466 https://hdl.handle.net/11600/64929
identifier_str_mv	LEO FRANCOSO DAL PICCOL SOTTO.pdf
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	178 p. application/pdf
dc.publisher.none.fl_str_mv	Universidade Federal de São Paulo (UNIFESP)
publisher.none.fl_str_mv	Universidade Federal de São Paulo (UNIFESP)
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UNIFESP instname:Universidade Federal de São Paulo (UNIFESP) instacron:UNIFESP
instname_str	Universidade Federal de São Paulo (UNIFESP)
instacron_str	UNIFESP
institution	UNIFESP
reponame_str	Repositório Institucional da UNIFESP
collection	Repositório Institucional da UNIFESP
repository.name.fl_str_mv	Repositório Institucional da UNIFESP - Universidade Federal de São Paulo (UNIFESP)
repository.mail.fl_str_mv	biblioteca.csp@unifesp.br
_version_	1814268339455787008
