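Algoritmo Q-learning como estratégia de exploração e/ou explotação para metaheurísticas GRASP e algoritmo genético (Q-learning algorithm as an exploration and/or exploitation strategy for the GRASP and genetic algorithm metaheuristics)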

Lima Júnior, Francisco Chagas de

Bibliographic details
Main author:	Lima Júnior, Francisco Chagas de
Publication date:	2009
Document type:	Thesis
Language:	Portuguese (por)
Source title:	Repositório Institucional da UFRN
Full text:	https://repositorio.ufrn.br/jspui/handle/123456789/15129
Abstract:	Optimization techniques known as metaheuristics have been successful in solving many problems classified as NP-hard. These methods use non-deterministic approaches that reach very good solutions but do not guarantee finding the global optimum. Beyond the inherent difficulties related to the complexity of these optimization problems, metaheuristics also face the exploration/exploitation dilemma, which consists of choosing between a greedy search and a wider exploration of the solution space. One way to guide such algorithms toward better solutions is to supply them with more knowledge of the problem through an intelligent agent that is able to recognize promising regions and to identify when the search direction should be diversified. This work therefore proposes using a Reinforcement Learning technique, the Q-learning algorithm, as an exploration/exploitation strategy for the GRASP (Greedy Randomized Adaptive Search Procedure) and Genetic Algorithm metaheuristics. The GRASP metaheuristic uses Q-learning instead of the traditional greedy-random algorithm in the construction phase. This replacement aims to improve the quality of the initial solutions used in the local search phase of GRASP, and it also gives the metaheuristic an adaptive memory mechanism that allows good previous decisions to be reused and bad decisions to be avoided. In the Genetic Algorithm, Q-learning was used to generate an initial population of high fitness and, after a given number of generations, whenever the population's diversity rate falls below a certain limit L, also to supply one of the parents used by the crossover operator. Another significant change in the hybrid genetic algorithm is a mutually cooperative interaction between the genetic operators and the Q-learning algorithm: Q-learning receives an additional update to the matrix of Q-values based on the current best solution of the Genetic Algorithm. The computational experiments presented in this thesis compare the results obtained with traditional implementations of the GRASP metaheuristic and the Genetic Algorithm with those obtained using the proposed hybrid methods. Both algorithms were applied successfully to the symmetric Traveling Salesman Problem, which was modeled as a Markov decision process.
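
Illustrative sketch (not taken from the thesis): the abstract describes Q-learning replacing the greedy-random construction phase of GRASP on a symmetric Traveling Salesman Problem modeled as a Markov decision process. The Python below is one minimal way such a construction could look. The state (current city), action (next unvisited city), reward (negative edge length), function names, and parameter values are all assumptions made for this example, and the thesis may define them differently. The local search phase of GRASP is omitted.

```python
import random


def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def build_tour(q, dist, start, epsilon, alpha, gamma, rng, learn=True):
    """Build one tour city by city with an epsilon-greedy policy over Q.

    Assumed MDP for this sketch: state = current city, action = next
    unvisited city, reward = negative distance travelled.
    """
    tour = [start]
    unvisited = set(range(len(dist))) - {start}
    while unvisited:
        s = tour[-1]
        candidates = sorted(unvisited)
        if rng.random() < epsilon:
            a = rng.choice(candidates)                  # explore
        else:
            a = max(candidates, key=lambda c: q[s][c])  # exploit
        unvisited.remove(a)
        if learn:
            reward = -dist[s][a]
            if not unvisited:
                reward -= dist[a][start]  # last move also closes the tour
            # Best Q-value still available from the next city (0 at the end).
            future = max((q[a][c] for c in unvisited), default=0.0)
            q[s][a] += alpha * (reward + gamma * future - q[s][a])
        tour.append(a)
    return tour


def q_learning_construction(dist, episodes=500, epsilon=0.2,
                            alpha=0.1, gamma=0.9, seed=0):
    """Train a Q-table on random-start episodes, then return a greedy tour."""
    rng = random.Random(seed)
    n = len(dist)
    q = [[0.0] * n for _ in range(n)]
    for _ in range(episodes):
        build_tour(q, dist, rng.randrange(n), epsilon, alpha, gamma, rng)
    # epsilon = 0 makes the final pass purely greedy over the learned Q-values.
    tour = build_tour(q, dist, 0, 0.0, alpha, gamma, rng, learn=False)
    return tour, q


if __name__ == "__main__":
    r = 2 ** 0.5  # diagonal of the unit square
    square = [[0, 1, r, 1], [1, 0, 1, r], [r, 1, 0, 1], [1, r, 1, 0]]
    tour, _ = q_learning_construction(square)
    print(tour, tour_length(tour, square))
```

Because the Q-table persists across episodes, it plays the role of the adaptive memory mentioned in the abstract: decisions that led to short tours keep higher Q-values and are preferred in later constructions. In a full GRASP, the returned tour would be handed to a local search procedure.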

Item metadata

id	UFRN_a120e35c7e116f7b2b27d1fab26a03b1
oai_identifier_str	oai:https://repositorio.ufrn.br:123456789/15129
network_acronym_str	UFRN
network_name_str	Repositório Institucional da UFRN
repository_id_str
dc.title.por.fl_str_mv	Algoritmo Q-learning como estratégia de exploração e/ou explotação para metaheurísticas GRASP e algoritmo genético
title	Algoritmo Q-learning como estratégia de exploração e/ou explotação para metaheurísticas GRASP e algoritmo genético
author	Lima Júnior, Francisco Chagas de
author_facet	Lima Júnior, Francisco Chagas de
author_role	author
dc.contributor.authorID.por.fl_str_mv
dc.contributor.advisorID.por.fl_str_mv
dc.contributor.advisorLattes.por.fl_str_mv	http://lattes.cnpq.br/7325007451912598
dc.contributor.advisor-co1ID.por.fl_str_mv
dc.contributor.referees1.pt_BR.fl_str_mv	Aloise, Dario José
dc.contributor.referees1ID.por.fl_str_mv
dc.contributor.referees1Lattes.por.fl_str_mv	http://lattes.cnpq.br/7266011798625538
dc.contributor.referees2.pt_BR.fl_str_mv	Viana, Gerardo Valdisio Rodrigues
dc.contributor.referees2ID.por.fl_str_mv
dc.contributor.referees2Lattes.por.fl_str_mv	http://lattes.cnpq.br/6262051397848744
dc.contributor.author.fl_str_mv	Lima Júnior, Francisco Chagas de
dc.contributor.advisor-co1.fl_str_mv	Dória Neto, Adrião Duarte
dc.contributor.advisor-co1Lattes.fl_str_mv	http://lattes.cnpq.br/1987295209521433
dc.contributor.advisor1.fl_str_mv	Melo, Jorge Dantas de
contributor_str_mv	Dória Neto, Adrião Duarte; Melo, Jorge Dantas de
dc.subject.por.fl_str_mv	Metaheurística GRASP; Algoritmos genéticos; Algoritmo Q-learning; Problema do caixeiro viajante
topic	Metaheurística GRASP; Algoritmos genéticos; Algoritmo Q-learning; Problema do caixeiro viajante; GRASP metaheuristic; Genetic algorithm; Q-learning algorithm; Travelling salesman problem; CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA
dc.subject.eng.fl_str_mv	GRASP metaheuristic; Genetic algorithm; Q-learning algorithm; Travelling salesman problem
dc.subject.cnpq.fl_str_mv	CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA
publishDate	2009
dc.date.available.fl_str_mv	2009-06-09 2014-12-17T14:54:52Z
dc.date.issued.fl_str_mv	2009-03-20
dc.date.accessioned.fl_str_mv	2014-12-17T14:54:52Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/doctoralThesis
format	doctoralThesis
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	LIMA JÚNIOR, Francisco Chagas de. Algoritmo Q-learning como estratégia de exploração e/ou explotação para metaheurísticas GRASP e algoritmo genético. 2009. 140 f. Tese (Doutorado em Automação e Sistemas; Engenharia de Computação; Telecomunicações) - Universidade Federal do Rio Grande do Norte, Natal, 2009.
dc.identifier.uri.fl_str_mv	https://repositorio.ufrn.br/jspui/handle/123456789/15129
identifier_str_mv	LIMA JÚNIOR, Francisco Chagas de. Algoritmo Q-learning como estratégia de exploração e/ou explotação para metaheurísticas GRASP e algoritmo genético. 2009. 140 f. Tese (Doutorado em Automação e Sistemas; Engenharia de Computação; Telecomunicações) - Universidade Federal do Rio Grande do Norte, Natal, 2009.
url	https://repositorio.ufrn.br/jspui/handle/123456789/15129
dc.language.iso.fl_str_mv	por
language	por
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Universidade Federal do Rio Grande do Norte
dc.publisher.program.fl_str_mv	Programa de Pós-Graduação em Engenharia Elétrica
dc.publisher.initials.fl_str_mv	UFRN
dc.publisher.country.fl_str_mv	BR
dc.publisher.department.fl_str_mv	Automação e Sistemas; Engenharia de Computação; Telecomunicações
publisher.none.fl_str_mv	Universidade Federal do Rio Grande do Norte
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFRN instname:Universidade Federal do Rio Grande do Norte (UFRN) instacron:UFRN
instname_str	Universidade Federal do Rio Grande do Norte (UFRN)
instacron_str	UFRN
institution	UFRN
reponame_str	Repositório Institucional da UFRN
collection	Repositório Institucional da UFRN
bitstream.url.fl_str_mv	https://repositorio.ufrn.br/bitstream/123456789/15129/1/FranciscoCLJ.pdf https://repositorio.ufrn.br/bitstream/123456789/15129/6/FranciscoCLJ.pdf.txt https://repositorio.ufrn.br/bitstream/123456789/15129/7/FranciscoCLJ.pdf.jpg
bitstream.checksum.fl_str_mv	b3894e0c93f85d3cf920c7015daef964 951ccd7eae55a9893dbec1ec8411eccc 2a9562f7700ea7f4ca5cfc43dbda5323
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5
repository.name.fl_str_mv	Repositório Institucional da UFRN - Universidade Federal do Rio Grande do Norte (UFRN)
repository.mail.fl_str_mv
_version_	1802117585195499520
