Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos (Reinforcement learning using Q-Learning and artificial neural networks in electronic games)

Mota, Ícaro da Costa

Bibliographic details
Main author:	Mota, Ícaro da Costa
Publication date:	2018
Document type:	Undergraduate final project (Trabalho de conclusão de curso)
Language:	por
Source title:	Biblioteca Digital de Monografias da UnB
Full text:	http://bdm.unb.br/handle/10483/21323
Abstract:	Undergraduate final project (Trabalho de Conclusão de Curso), Universidade de Brasília, Faculdade de Tecnologia, Curso de Graduação em Engenharia de Controle e Automação, 2018.

Item metadata

id	UNB-2_26ad268a39ae680f713d21c216d5813f
oai_identifier_str	oai:bdm.unb.br:10483/21323
network_acronym_str	UNB-2
network_name_str	Biblioteca Digital de Monografias da UnB
repository_id_str	11571
spelling	Mota, Ícaro da Costa. Advisor: Lamar, Marcus Vinicius. MOTA, Ícaro da Costa. Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos. 2018. 61 f., il. Trabalho de Conclusão de Curso (Bacharelado em Engenharia Mecatrônica), Universidade de Brasília, Brasília, 2018. http://bdm.unb.br/handle/10483/21323. Undergraduate final project (Trabalho de Conclusão de Curso), Universidade de Brasília, Faculdade de Tecnologia, Curso de Graduação em Engenharia de Controle e Automação, 2018. Abstract: In reinforcement learning, an agent must learn from its experiences while interacting with its environment. This work proposes a deep learning system based on the Deep Q-Learning algorithm to teach a generic agent to play distinct electronic games, using artificial neural networks to estimate the value of executing an action in the agent's current state. The work was developed using ROS to manage communication between the systems. The techniques were applied to the games Enduro, Ms. Pacman, Breakout, and Pong, emulated by the OpenAI Gym toolkit, which was developed specifically to support reinforcement learning work. The agent showed evidence of learning in Ms. Pacman, but the state representation was insufficient in Breakout and Pong, leaving the agent unable to select the best action in its current state. In Enduro, the agent did not interact enough with its environment to obtain rewards and learn to maximize them. Subject: Inteligência artificial. Submitted by Luanna Maia (luanna@bce.unb.br) on 2019-02-06T10:35:59Z; approved for entry into archive on 2019-02-06T10:36:11Z (GMT); made available in DSpace on 2019-02-06T10:36:11Z (GMT). Date submitted: 2018-07. info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/bachelorThesis; info:eu-repo/semantics/openAccess. Language: por. reponame:Biblioteca Digital de Monografias da UnB instname:Universidade de Brasília (UnB) instacron:UNB. Record updated 2020-11-25 15:08:40.657. oai:bdm.unb.br:10483/21323. opendoar:11571.
dc.title.pt_BR.fl_str_mv	Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos
title	Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos
spellingShingle	Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos Mota, Ícaro da Costa Inteligência artificial
title_short	Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos
title_full	Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos
title_fullStr	Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos
title_full_unstemmed	Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos
title_sort	Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos
author	Mota, Ícaro da Costa
author_facet	Mota, Ícaro da Costa
author_role	author
dc.contributor.author.fl_str_mv	Mota, Ícaro da Costa
dc.contributor.advisor1.fl_str_mv	Lamar, Marcus Vinicius
contributor_str_mv	Lamar, Marcus Vinicius
dc.subject.keyword.pt_BR.fl_str_mv	Inteligência artificial (Artificial intelligence)
topic	Inteligência artificial (Artificial intelligence)
description	Undergraduate final project (Trabalho de Conclusão de Curso), Universidade de Brasília, Faculdade de Tecnologia, Curso de Graduação em Engenharia de Controle e Automação, 2018.
publishDate	2018
dc.date.submitted.none.fl_str_mv	2018-07
dc.date.accessioned.fl_str_mv	2019-02-06T10:36:11Z
dc.date.available.fl_str_mv	2019-02-06T10:36:11Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/bachelorThesis
format	bachelorThesis
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	MOTA, Ícaro da Costa. Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos. 2018. 61 f., il. Trabalho de Conclusão de Curso (Bacharelado em Engenharia Mecatrônica)—Universidade de Brasília, Brasília, 2018.
dc.identifier.uri.fl_str_mv	http://bdm.unb.br/handle/10483/21323
identifier_str_mv	MOTA, Ícaro da Costa. Aprendizagem por reforço utilizando Q-Learning e redes neurais artificiais em jogos eletrônicos. 2018. 61 f., il. Trabalho de Conclusão de Curso (Bacharelado em Engenharia Mecatrônica)—Universidade de Brasília, Brasília, 2018.
url	http://bdm.unb.br/handle/10483/21323
dc.language.iso.fl_str_mv	por
language	por
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.source.none.fl_str_mv	reponame:Biblioteca Digital de Monografias da UnB instname:Universidade de Brasília (UnB) instacron:UNB
instname_str	Universidade de Brasília (UnB)
instacron_str	UNB
institution	UNB
reponame_str	Biblioteca Digital de Monografias da UnB
collection	Biblioteca Digital de Monografias da UnB
bitstream.url.fl_str_mv	http://bdm.unb.br/xmlui/bitstream/10483/21323/5/license.txt http://bdm.unb.br/xmlui/bitstream/10483/21323/2/license_url http://bdm.unb.br/xmlui/bitstream/10483/21323/3/license_text http://bdm.unb.br/xmlui/bitstream/10483/21323/4/license_rdf http://bdm.unb.br/xmlui/bitstream/10483/21323/1/2018_IcaroDaCostaMota_tcc.pdf
bitstream.checksum.fl_str_mv	21554873e56ad8ddc69c092699b98f95 4afdbb8c545fd630ea7db775da747b2f d41d8cd98f00b204e9800998ecf8427e d41d8cd98f00b204e9800998ecf8427e 5da94bbd74a2645ec5ed697f77510a2c
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Biblioteca Digital de Monografias da UnB - Universidade de Brasília (UnB)
repository.mail.fl_str_mv	bdm@bce.unb.br||patricia@bce.unb.br
_version_	1801493075806126080
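
The abstract in this record says the agent uses Deep Q-Learning, with a neural network estimating the value of taking an action in a given state. The record does not describe the network or its training setup, so the block below is only a minimal sketch of the underlying Q-learning update, with a small lookup table standing in for the neural network. All names (`td_target`, `epsilon_greedy`, `q_update`) and all numeric values are hypothetical and chosen for illustration; they are not taken from the thesis.

```python
import random
from typing import List


def td_target(reward: float, next_q: List[float], gamma: float, done: bool) -> float:
    """Q-learning target: r + gamma * max_a' Q(s', a'), or just r when the episode ends."""
    return reward if done else reward + gamma * max(next_q)


def epsilon_greedy(q_values: List[float], epsilon: float, rng: random.Random) -> int:
    """Pick a random action with probability epsilon, otherwise the highest-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def q_update(q_table: List[List[float]], state: int, action: int, reward: float,
             next_state: int, done: bool, alpha: float = 0.1, gamma: float = 0.99) -> float:
    """Move Q(state, action) a fraction alpha toward the TD target and return the new value."""
    target = td_target(reward, q_table[next_state], gamma, done)
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]


# Toy example (assumed values): two states, two actions.
q_table = [[0.0, 0.0], [0.0, 2.0]]
new_value = q_update(q_table, state=0, action=1, reward=1.0,
                     next_state=1, done=False, alpha=0.5, gamma=0.5)
```

In a deep variant, the table lookup `q_table[state]` would be replaced by a forward pass of a network over the state representation, and the update by a gradient step on the squared difference between the prediction and `td_target`.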
