BTT-Go: um agente jogador de Go com busca Monte-Carlo aprimorada com tabela de transposição e modelo Bradley-Terry
Main author: | Vieira Júnior, Eldane |
---|---|
Publication date: | 2014 |
Document type: | Master's thesis |
Language: | por |
Source title: | Repositório Institucional da UFU |
Full text: | https://repositorio.ufu.br/handle/123456789/12556 https://doi.org/10.14393/ufu.di.2014.165 |
Abstract: | The game of Go is currently one of the greatest challenges in Artificial Intelligence, since it has a set of characteristics that prevents the successful application of techniques that have been very successful in other games. Among these characteristics is its high level of complexity, which rules out techniques that require exhaustive exploration of the search state space. This thesis describes the development of a Go-playing agent named BTT-Go. The agent was built from another one named Fuego, which uses one of the few techniques that have improved automatic Go players: the Monte-Carlo Tree Search algorithm. Fuego relies on essentially supervised learning, since its search method is based exclusively on Monte-Carlo simulations, heuristic board evaluations and a database of game openings (opening book). The objective of this thesis is therefore to produce an agent that remains competitive while using much less supervision than Fuego. To achieve this objective, BTT-Go was developed in three versions. In the first, the agent uses a Transposition Table, which is a repository of previously processed data. This makes it possible to reduce the supervision that comes from simulations by reducing the simulations themselves and, in some situations, the agent uses the data from the table instead of the Fuego prior-knowledge evaluation. The second version of BTT-Go applies, in the final stage of the Monte-Carlo search algorithm, a Bayesian technique inspired by the Bradley-Terry model. This technique predicts the best move through a board evaluation, which considers a set of features that describe how good a move is. In this stage, Fuego uses policies to indicate which move should be played. 
The third version of BTT-Go combines the first and second versions so that the techniques work together without any loss. Once the three versions were complete, experiments were performed on different board sizes (9x9, 13x13 and 19x19). These experiments showed that the Transposition Table reduced the agent's supervision. Although its winning rate dropped slightly on large boards (13x13 and 19x19) compared with Fuego, BTT-Go remained a competitive player. The technique inspired by the Bradley-Terry model increased the agent's competitiveness on large boards (13x13 and 19x19), and in some situations it was better than Fuego. Therefore, the development of BTT-Go reduced supervision through the use of the Transposition Table and of the Bayesian technique inspired by the Bradley-Terry model, and also increased the acuity of move generation during the search process. |
id |
UFU_aae6333c511eca8e76e7047d1f03c515 |
---|---|
oai_identifier_str |
oai:repositorio.ufu.br:123456789/12556 |
network_acronym_str |
UFU |
network_name_str |
Repositório Institucional da UFU |
repository_id_str |
|
spelling |
BTT-Go: um agente jogador de Go com busca Monte-Carlo aprimorada com tabela de transposição e modelo Bradley-Terry; Go; Busca monte-carlo; Modelo bradley-terry; Simulações monte-carlo; Jogos eletrônicos; Jogos por computador; Monte Carlo, Método de; Monte-carlo tree search; Bradley-terry model; Monte-carlo simulations; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior; Master's degree in Computer Science; Universidade Federal de Uberlândia; BR; Programa de Pós-graduação em Ciência da Computação; Ciências Exatas e da Terra; UFU; Julia, Rita Maria da Silva; http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4788590Z8; Amo, Sandra Aparecida de; http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4791545U6; Carvalho, Andre Carlos Ponce de Leon Ferreira de; http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4788511Y6; Vieira Júnior, Eldane; 2016-06-22T18:32:29Z; 2014-09-03; 2016-06-22T18:32:29Z; 2014-02-27; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; application/pdf; VIEIRA JÚNIOR, Eldane. BTT-Go: um agente jogador de Go com busca Monte-Carlo aprimorada com tabela de transposição e modelo Bradley-Terry. 2014. 91 f. Dissertação (Mestrado em Ciências Exatas e da Terra) - Universidade Federal de Uberlândia, Uberlândia, 2014. DOI https://doi.org/10.14393/ufu.di.2014.165; https://repositorio.ufu.br/handle/123456789/12556; https://doi.org/10.14393/ufu.di.2014.165; por; info:eu-repo/semantics/openAccess; reponame:Repositório Institucional da UFU; instname:Universidade Federal de Uberlândia (UFU); instacron:UFU; 2021-08-02T12:56:05Z; oai:repositorio.ufu.br:123456789/12556; Repositório Institucional; ONG; http://repositorio.ufu.br/oai/request; diinf@dirbi.ufu.br; opendoar:2021-08-02T12:56:05; Repositório Institucional da UFU - Universidade Federal de Uberlândia (UFU); false |
dc.title.none.fl_str_mv |
BTT-Go: um agente jogador de Go com busca Monte-Carlo aprimorada com tabela de transposição e modelo Bradley-Terry |
author |
Vieira Júnior, Eldane |
author_facet |
Vieira Júnior, Eldane |
author_role |
author |
dc.contributor.none.fl_str_mv |
Julia, Rita Maria da Silva http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4788590Z8 Amo, Sandra Aparecida de http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4791545U6 Carvalho, Andre Carlos Ponce de Leon Ferreira de http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4788511Y6 |
dc.contributor.author.fl_str_mv |
Vieira Júnior, Eldane |
dc.subject.por.fl_str_mv |
Go Busca monte-carlo Modelo bradley-terry Simulações monte-carlo Jogos eletrônicos Jogos por computador Monte Carlo, Método de Monte-carlo tree search Bradley-terry model Monte-carlo simulations CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
topic |
Go Busca monte-carlo Modelo bradley-terry Simulações monte-carlo Jogos eletrônicos Jogos por computador Monte Carlo, Método de Monte-carlo tree search Bradley-terry model Monte-carlo simulations CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
publishDate |
2014 |
dc.date.none.fl_str_mv |
2014-09-03 2014-02-27 2016-06-22T18:32:29Z 2016-06-22T18:32:29Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
VIEIRA JÚNIOR, Eldane. BTT-Go: um agente jogador de Go com busca Monte-Carlo aprimorada com tabela de transposição e modelo Bradley-Terry. 2014. 91 f. Dissertação (Mestrado em Ciências Exatas e da Terra) - Universidade Federal de Uberlândia, Uberlândia, 2014. DOI https://doi.org/10.14393/ufu.di.2014.165 https://repositorio.ufu.br/handle/123456789/12556 https://doi.org/10.14393/ufu.di.2014.165 |
identifier_str_mv |
VIEIRA JÚNIOR, Eldane. BTT-Go: um agente jogador de Go com busca Monte-Carlo aprimorada com tabela de transposição e modelo Bradley-Terry. 2014. 91 f. Dissertação (Mestrado em Ciências Exatas e da Terra) - Universidade Federal de Uberlândia, Uberlândia, 2014. DOI https://doi.org/10.14393/ufu.di.2014.165 |
url |
https://repositorio.ufu.br/handle/123456789/12556 https://doi.org/10.14393/ufu.di.2014.165 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf application/pdf |
dc.publisher.none.fl_str_mv |
Universidade Federal de Uberlândia BR Programa de Pós-graduação em Ciência da Computação Ciências Exatas e da Terra UFU |
publisher.none.fl_str_mv |
Universidade Federal de Uberlândia BR Programa de Pós-graduação em Ciência da Computação Ciências Exatas e da Terra UFU |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFU instname:Universidade Federal de Uberlândia (UFU) instacron:UFU |
instname_str |
Universidade Federal de Uberlândia (UFU) |
instacron_str |
UFU |
institution |
UFU |
reponame_str |
Repositório Institucional da UFU |
collection |
Repositório Institucional da UFU |
repository.name.fl_str_mv |
Repositório Institucional da UFU - Universidade Federal de Uberlândia (UFU) |
repository.mail.fl_str_mv |
diinf@dirbi.ufu.br |
_version_ |
1813711440725409792 |