Avaliação de características para extração automática de aspectos

Bibliographic details
Main author: LIMA, Roberto Márcio Mota de
Publication date: 2018
Document type: Master's thesis
Language: Portuguese (por)
Source: Biblioteca Digital de Teses e Dissertações da UFRPE
Full text: http://www.tede2.ufrpe.br:8080/tede2/handle/tede2/7855
Abstract: The growing use of the Internet and of online interactions such as chats, forum participation, e-commerce transactions, and reviews of products and services has created an increasing need to extract, transform, and analyze vast amounts of data, combining text mining with other processes applied to content obtained directly from the Web. Large companies across many sectors are among the main sources of demand for data mining, and opinion mining in particular. These organizations are increasingly interested in what their actual or potential customers say about them. From restaurants to hotels, from smartphones to cameras, reviews are scattered across the Internet, and they matter to companies precisely because they are written not by the companies themselves but by people who use, or will use, their services. People tend to share opinions and practical experiences about products or services they have used, and this feedback is valuable to both businesses and consumers. However, the hundreds or even thousands of customer reviews are difficult for humans to analyze by hand. It is therefore necessary to provide coherent information and concise summaries of such reviews. Aspect-Based Sentiment Analysis is a recent trend with much left to explore; results in the literature show it to be a more refined and targeted opinion extraction technique than approaches based purely on lexicons and rules. This MSc thesis set out to research, implement, and evaluate a new method for extracting opinion terms from restaurant reviews, selecting some of the best features described in the state of the art. A Conditional Random Field (CRF) was used as the classification model, given its high efficiency as a conditional classifier. The results showed good performance compared with the main related work in the area, most notably the high coverage (recall) achieved by the developed method.
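The abstract describes the approach only at a high level, so the following is a minimal sketch of how token-level features for a CRF sequence labeler are commonly prepared, not the thesis's actual feature set. The function names (`token_features`, `bio_to_spans`, `recall`), the specific features, the B/I/O label scheme, and the example sentence are all assumptions made for illustration. In practice the feature dictionaries would be handed to a CRF toolkit for training; that step is omitted here.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end_exclusive) token offsets


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Hypothetical features for token i (illustrative, not from the thesis)."""
    word = tokens[i]
    feats: Dict[str, object] = {
        "lower": word.lower(),
        "suffix3": word[-3:].lower(),
        "is_title": word.istitle(),
        "is_digit": word.isdigit(),
        "BOS": i == 0,                # beginning of sentence
        "EOS": i == len(tokens) - 1,  # end of sentence
    }
    # Context window of one token on each side.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    return feats


def bio_to_spans(labels: List[str]) -> List[Span]:
    """Convert a B/I/O label sequence into term spans."""
    spans: List[Span] = []
    start = None
    for i, lab in enumerate(labels):
        if lab == "B" or (lab == "I" and start is None):
            if start is not None:
                spans.append((start, i))  # close the previous term
            start = i
        elif lab == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(labels)))
    return spans


def recall(gold: List[Span], pred: List[Span]) -> float:
    """Fraction of gold spans that were predicted exactly."""
    gold_set, pred_set = set(gold), set(pred)
    return len(gold_set & pred_set) / len(gold_set) if gold_set else 0.0


# Made-up example: gold terms are "grilled salmon" and "service".
tokens = "The grilled salmon was great but the service was slow".split()
gold_labels = ["O", "B", "I", "O", "O", "O", "O", "B", "O", "O"]
pred_labels = ["O", "O", "B", "O", "O", "O", "O", "B", "O", "O"]

features = [token_features(tokens, i) for i in range(len(tokens))]
print(features[1])
print(recall(bio_to_spans(gold_labels), bio_to_spans(pred_labels)))
```

Exact-match span recall is used here because the abstract highlights coverage; a full evaluation would normally report precision and F1 alongside it.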
id URPE_d7671b6ecd972f2adbbc421b4213e0b5
oai_identifier_str oai:tede2:tede2/7855
network_acronym_str URPE
network_name_str Biblioteca Digital de Teses e Dissertações da UFRPE
provenance Submitted by Mario BC (mario@bc.ufrpe.br) on 2019-02-19T14:05:05Z. No. of bitstreams: 1. Roberto Marcio Mota de Lima.pdf: 1603006 bytes, checksum: 1e248f012ff7a85608a30132a1bd4b6c (MD5). Made available in DSpace on 2019-02-19T14:05:05Z (GMT). Previous issue date: 2018-08-31.
dc.title.por.fl_str_mv Avaliação de características para extração automática de aspectos
dc.contributor.advisor1.fl_str_mv MELLO, Rafael Ferreira Leite de
dc.contributor.advisor-co1.fl_str_mv LIMA, Rinaldo José de
dc.contributor.referee1.fl_str_mv LIMA, Rinaldo José de
dc.contributor.referee2.fl_str_mv CORRÊA, Renato Fernandes
dc.contributor.referee3.fl_str_mv LINS, Rafael Dueire
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/6177392887180542
dc.contributor.author.fl_str_mv LIMA, Roberto Márcio Mota de
dc.subject.por.fl_str_mv Data analysis
Opinion extraction
CRF model
dc.subject.cnpq.fl_str_mv CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.date.issued.fl_str_mv 2018-08-31
dc.date.accessioned.fl_str_mv 2019-02-19T14:05:05Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
dc.identifier.citation.fl_str_mv LIMA, Roberto Márcio Mota de. Avaliação de características para extração automática de aspectos. 2018. 102 f. Dissertação (Programa de Pós-Graduação em Informática Aplicada) - Universidade Federal Rural de Pernambuco, Recife.
dc.identifier.uri.fl_str_mv http://www.tede2.ufrpe.br:8080/tede2/handle/tede2/7855
dc.language.iso.fl_str_mv por
dc.relation.program.fl_str_mv -8268485641417162699
dc.relation.confidence.fl_str_mv 600
600
600
dc.relation.department.fl_str_mv -6774555140396120501
dc.relation.cnpq.fl_str_mv 3671711205811204509
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal Rural de Pernambuco
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Informática Aplicada
dc.publisher.initials.fl_str_mv UFRPE
dc.publisher.country.fl_str_mv Brazil
dc.publisher.department.fl_str_mv Departamento de Estatística e Informática
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da UFRPE
instname:Universidade Federal Rural de Pernambuco (UFRPE)
instacron:UFRPE
bitstream.url.fl_str_mv http://www.tede2.ufrpe.br:8080/tede2/bitstream/tede2/7855/2/Roberto+Marcio+Mota+de+Lima.pdf
http://www.tede2.ufrpe.br:8080/tede2/bitstream/tede2/7855/1/license.txt
bitstream.checksum.fl_str_mv 1e248f012ff7a85608a30132a1bd4b6c
bd3efa91386c1718a7f26a329fdcb468
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da UFRPE - Universidade Federal Rural de Pernambuco (UFRPE)
repository.mail.fl_str_mv bdtd@ufrpe.br