Aplicação de programação genética na análise de sentimentos

Bibliographic details
Main author: Bordin Junior, Airton
Publication date: 2018
Document type: Master's thesis
Language: por
Source title: Repositório Institucional da UFG
dARK ID: ark:/38995/001300000b84q
Full text: http://repositorio.bc.ufg.br/tede/handle/tede/9211
Abstract: The Web is commonly used as a platform for debates, opinions, and evaluations. The availability of these data has allowed the field of Sentiment Analysis (SA) to develop, extracting information and knowledge that can be used in different applications. Among the challenges of SA is building classifiers with good efficacy. Typically, classification models are built from specific heuristics that are manually defined and not adaptable to different contexts. This work therefore proposes the automated generation of hybrid SA classifiers, combining Machine Learning (ML) techniques and lexical dictionaries, using Genetic Programming (GP). The approach is expected to reduce the cost of generating classifiers and to increase predictive power for each domain analyzed. The goal is for these classifiers to be competitive with the classical ML algorithms used in SA, generalizable, adaptable to the context, and able to determine the relevance of each lexical dictionary to the target domain. A further aim is to allow other ML techniques to be aggregated to create even more effective hybrid solutions. The proposal was validated on the SemEval 2014 benchmark. The results show that the GP approach is promising: the generated models are competitive with, and sometimes better than, those reported in other work. Combining the classifiers in an ensemble proved effective in increasing the predictive power of the system, obtaining better results than the techniques used individually. Finally, we highlight that the models can be customized to the context at hand and that users can transfer their knowledge through the functions used by GP.
id UFG-2_7c7ddcc7dc6dc9ab23bf8f8b45cbe5a8
oai_identifier_str oai:repositorio.bc.ufg.br:tede/9211
network_acronym_str UFG-2
network_name_str Repositório Institucional da UFG
repository_id_str
spelling Submitted by Ana Caroline Costa (ana_caroline212@hotmail.com) on 2019-01-08T17:57:49Z. No. of bitstreams: 2. Dissertação - Airton Bordin Junior - 2018.pdf: 1915483 bytes, checksum: ce3cc567ea43be5719b609ec785f5200 (MD5); license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5)
Approved for entry into archive by Luciana Ferreira (lucgeral@gmail.com) on 2019-01-09T11:18:52Z (GMT)
Made available in DSpace on 2019-01-09T11:18:52Z (GMT). Previous issue date: 2018-12-14
Sponsorship: Conselho Nacional de Pesquisa e Desenvolvimento Científico e Tecnológico - CNPq
Bitstreams (name, MIME type, size in bytes), in the order of the URLs and checksums listed under bitstream.url.fl_str_mv and bitstream.checksum.fl_str_mv:
LICENSE: license.txt, text/plain; charset=utf-8, 2165
CC-LICENSE: license_url, text/plain; charset=utf-8, 49
license_text, text/html; charset=utf-8, 0
license_rdf, application/rdf+xml; charset=utf-8, 0
ORIGINAL: Dissertação - Airton Bordin Junior - 2018.pdf, application/pdf, 1915483
dc.title.eng.fl_str_mv Aplicação de programação genética na análise de sentimentos
dc.title.alternative.eng.fl_str_mv Applying genetic programming to sentiment analysis
title Aplicação de programação genética na análise de sentimentos
spellingShingle Aplicação de programação genética na análise de sentimentos
Bordin Junior, Airton
Análise de sentimentos
Mineração de opiniões
Programação genética
Classificadores
Sentiment analysis
Opinion mining
Genetic programming
Classifiers
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
title_short Aplicação de programação genética na análise de sentimentos
title_full Aplicação de programação genética na análise de sentimentos
title_fullStr Aplicação de programação genética na análise de sentimentos
title_full_unstemmed Aplicação de programação genética na análise de sentimentos
title_sort Aplicação de programação genética na análise de sentimentos
author Bordin Junior, Airton
author_facet Bordin Junior, Airton
author_role author
dc.contributor.advisor1.fl_str_mv Silva, Nádia Félix Felipe da
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/7864834001694765
dc.contributor.advisor-co1.fl_str_mv Camilo Junior, Celso Gonçalves
dc.contributor.advisor-co1Lattes.fl_str_mv http://lattes.cnpq.br/6776569904919279
dc.contributor.referee1.fl_str_mv Silva, Nádia Félix Felipe da
dc.contributor.referee2.fl_str_mv Camilo Junior, Celso Gonçalves
dc.contributor.referee3.fl_str_mv Rosa, Thierson Couto
dc.contributor.referee4.fl_str_mv Covões, Thiago Ferreira
dc.contributor.referee5.fl_str_mv Fernandes, Deborah Silva Alves
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/5718967602727513
dc.contributor.author.fl_str_mv Bordin Junior, Airton
contributor_str_mv Silva, Nádia Félix Felipe da
Camilo Junior, Celso Gonçalves
Silva, Nádia Félix Felipe da
Camilo Junior, Celso Gonçalves
Rosa, Thierson Couto
Covões, Thiago Ferreira
Fernandes, Deborah Silva Alves
dc.subject.por.fl_str_mv Análise de sentimentos
Mineração de opiniões
Programação genética
Classificadores
topic Análise de sentimentos
Mineração de opiniões
Programação genética
Classificadores
Sentiment analysis
Opinion mining
Genetic programming
Classifiers
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.eng.fl_str_mv Sentiment analysis
Opinion mining
Genetic programming
Classifiers
dc.subject.cnpq.fl_str_mv CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
publishDate 2018
dc.date.issued.fl_str_mv 2018-12-14
dc.date.accessioned.fl_str_mv 2019-01-09T11:18:52Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv BORDIN JUNIOR, A. Aplicação de programação genética na análise de sentimentos. 2018. 142 f. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Goiás, Goiânia, 2018.
dc.identifier.uri.fl_str_mv http://repositorio.bc.ufg.br/tede/handle/tede/9211
dc.identifier.dark.fl_str_mv ark:/38995/001300000b84q
identifier_str_mv BORDIN JUNIOR, A. Aplicação de programação genética na análise de sentimentos. 2018. 142 f. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Goiás, Goiânia, 2018.
ark:/38995/001300000b84q
url http://repositorio.bc.ufg.br/tede/handle/tede/9211
dc.language.iso.fl_str_mv por
language por
dc.relation.program.fl_str_mv -3303550325223384799
dc.relation.confidence.fl_str_mv 600
600
600
600
dc.relation.department.fl_str_mv -7712266734633644768
dc.relation.cnpq.fl_str_mv 3671711205811204509
dc.relation.sponsorship.fl_str_mv -2555911436985713659
dc.rights.driver.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Goiás
dc.publisher.program.fl_str_mv Programa de Pós-graduação em Ciência da Computação (INF)
dc.publisher.initials.fl_str_mv UFG
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv Instituto de Informática - INF (RG)
publisher.none.fl_str_mv Universidade Federal de Goiás
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFG
instname:Universidade Federal de Goiás (UFG)
instacron:UFG
instname_str Universidade Federal de Goiás (UFG)
instacron_str UFG
institution UFG
reponame_str Repositório Institucional da UFG
collection Repositório Institucional da UFG
bitstream.url.fl_str_mv http://repositorio.bc.ufg.br/tede/bitstreams/f11ef502-b860-434d-8b3a-47ac3c23de30/download
http://repositorio.bc.ufg.br/tede/bitstreams/77dc6309-5b43-4736-9c6c-5a20eedc8271/download
http://repositorio.bc.ufg.br/tede/bitstreams/92bd49b7-1ec7-4a54-967d-2189cee9bafb/download
http://repositorio.bc.ufg.br/tede/bitstreams/d544cef1-cb91-42c5-839a-9c473ecdf2bc/download
http://repositorio.bc.ufg.br/tede/bitstreams/445fab3c-9d3a-42ff-9547-443b597f4dd7/download
bitstream.checksum.fl_str_mv bd3efa91386c1718a7f26a329fdcb468
4afdbb8c545fd630ea7db775da747b2f
d41d8cd98f00b204e9800998ecf8427e
d41d8cd98f00b204e9800998ecf8427e
ce3cc567ea43be5719b609ec785f5200
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
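The MD5 checksums listed above can be used to check that a downloaded bitstream arrived intact. The following is a minimal sketch, assuming the file has already been saved locally; the helper names `md5_of_bytes` and `md5_of_file` are hypothetical and not part of the repository's software. It also shows that `d41d8cd98f00b204e9800998ecf8427e`, which appears twice in the checksum list, is the MD5 digest of empty input, consistent with the 0-byte size reported for `license_rdf`.

```python
import hashlib

# MD5 of empty input; matches the checksum listed for the 0-byte bitstreams.
EMPTY_MD5 = "d41d8cd98f00b204e9800998ecf8427e"


def md5_of_bytes(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of an in-memory byte string."""
    return hashlib.md5(data).hexdigest()


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks.

    `path` is a hypothetical local path to a downloaded bitstream.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large PDFs need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # An empty payload hashes to the value listed for the empty bitstreams.
    print(md5_of_bytes(b"") == EMPTY_MD5)
```

To verify the dissertation PDF, one would compare `md5_of_file(<local path>)` against `ce3cc567ea43be5719b609ec785f5200`. MD5 is suitable here only as an integrity check against accidental corruption, not as protection against deliberate tampering.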
repository.name.fl_str_mv Repositório Institucional da UFG - Universidade Federal de Goiás (UFG)
repository.mail.fl_str_mv tasesdissertacoes.bc@ufg.br
_version_ 1815172619740119040