Feature-level sentiment analysis applied to Brazilian Portuguese reviews

Bibliographic details
Main author: Freitas, Larissa Astrogildo de
Publication date: 2015
Document type: Thesis
Language: por
Source title: Biblioteca Digital de Teses e Dissertações da PUC_RS
Full text: http://tede2.pucrs.br/tede2/handle/tede/6031
Abstract: Sentiment analysis is the field of study that analyzes people’s opinions in texts. In the last decade, people have come to share their opinions in social media on the Web (e.g., forum discussions and posts on social network sites). Opinions are important because whenever we need to make a decision, we want to know others’ points of view. The interest of industry and academia in this field is partly due to its potential applications, such as marketing, public relations, and political campaigns. Research in this field often considers English data, while data from other languages are less explored. Data can be analyzed at different levels; in this work we chose a finer-grained analysis, at the aspect level. Because ontologies can represent aspects, which are “part-of” an object or a property of a “part-of” an object, we propose a method for feature-level sentiment analysis using ontologies, applied to Brazilian Portuguese reviews. To obtain a complete analysis, we recognize both explicit and implicit features using ontologies. Relatively little work has been done on implicit feature identification. Finally, we determine whether the sentiment toward each aspect is positive or negative using sentiment lexicons and linguistic rules. Our method comprises four steps: preprocessing, feature identification, polarity identification, and summarization. To evaluate this work, we applied the proposed method to a dataset of reviews from the accommodation sector. According to our experiments, the best results overall were obtained when using TreeTagger, synsets with polarities from Onto.PT, and the linguistic rule (adjective position) for negative polarity identification and the baseline for positive polarity identification.
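The four steps named in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the tiny ontology and lexicon below are hypothetical stand-ins, a regular-expression tokenizer replaces TreeTagger, Onto.PT is not used, and the adjective-position rule is omitted. Polarity is simply the sum of lexicon scores in a sentence.

```python
import re
from collections import defaultdict

# Hypothetical toy ontology: each concept lists terms that name it directly
# (explicit) and terms that only imply it (implicit). Not from the thesis.
ONTOLOGY = {
    "quarto": {"explicit": {"quarto", "cama"}, "implicit": {"apertado", "espaçoso"}},
    "atendimento": {"explicit": {"atendimento", "atendente"}, "implicit": {"grosseiro", "educado"}},
}

# Hypothetical toy sentiment lexicon: +1 positive, -1 negative.
LEXICON = {"limpo": 1, "espaçoso": 1, "educado": 1, "sujo": -1, "apertado": -1, "grosseiro": -1}


def preprocess(review):
    """Step 1: split into sentences and lowercase word tokens."""
    sentences = re.split(r"[.!?]+", review.lower())
    return [re.findall(r"\w+", s) for s in sentences if s.strip()]


def identify_features(tokens):
    """Step 2: return the ontology concepts mentioned explicitly or implicitly."""
    words = set(tokens)
    return {
        concept
        for concept, terms in ONTOLOGY.items()
        if words & terms["explicit"] or words & terms["implicit"]
    }


def identify_polarity(tokens):
    """Step 3: sum the lexicon scores of the tokens in one sentence."""
    return sum(LEXICON.get(token, 0) for token in tokens)


def summarize(review):
    """Step 4: count positive and negative sentences per feature."""
    summary = defaultdict(lambda: {"positive": 0, "negative": 0})
    for tokens in preprocess(review):
        score = identify_polarity(tokens)
        if score == 0:
            continue  # neutral sentences are ignored in this sketch
        label = "positive" if score > 0 else "negative"
        for feature in identify_features(tokens):
            summary[feature][label] += 1
    return dict(summary)


if __name__ == "__main__":
    print(summarize("O quarto era limpo. O atendente foi grosseiro. Muito apertado."))
```

In the sample review, the third sentence never names the room, so "quarto" is recovered only through the implicit term "apertado"; this is the kind of case that implicit feature identification is meant to cover.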
id P_RS_defb3dec1e34e354f3154c3842e4a94f
oai_identifier_str oai:tede2.pucrs.br:tede/6031
network_acronym_str P_RS
network_name_str Biblioteca Digital de Teses e Dissertações da PUC_RS
repository_id_str
spelling Submitted by Setor de Tratamento da Informação - BC/PUCRS (tede2@pucrs.br) on 2015-05-19T12:00:48Z. No. of bitstreams: 1. 468945 - Txto Completo.pdf: 990591 bytes, checksum: 7d04b4b3b2f91050851802c6d65349f1 (MD5). Made available in DSpace on 2015-05-19T12:00:48Z (GMT). Previous issue date: 2015-03-23. Sponsorship: Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul - FAPERGS; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES.
dc.title.por.fl_str_mv Feature-level sentiment analysis applied to Brazilian Portuguese reviews
title Feature-level sentiment analysis applied to Brazilian Portuguese reviews
author Freitas, Larissa Astrogildo de
author_facet Freitas, Larissa Astrogildo de
author_role author
dc.contributor.advisor1.fl_str_mv Vieira, Renata
dc.contributor.advisor1ID.fl_str_mv 451.334.330-34
dc.contributor.authorID.fl_str_mv 007.092.480-59
dc.contributor.authorLattes.fl_str_mv http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4214363H6
dc.contributor.author.fl_str_mv Freitas, Larissa Astrogildo de
contributor_str_mv Vieira, Renata
dc.subject.por.fl_str_mv COMPUTER SCIENCE
ONTOLOGY
COMPUTATIONAL LINGUISTICS
NATURAL LANGUAGE PROCESSING
topic COMPUTER SCIENCE
ONTOLOGY
COMPUTATIONAL LINGUISTICS
NATURAL LANGUAGE PROCESSING
EXACT AND EARTH SCIENCES::COMPUTER SCIENCE
dc.subject.cnpq.fl_str_mv EXACT AND EARTH SCIENCES::COMPUTER SCIENCE
publishDate 2015
dc.date.accessioned.fl_str_mv 2015-05-19T12:00:48Z
dc.date.issued.fl_str_mv 2015-03-23
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://tede2.pucrs.br/tede2/handle/tede/6031
url http://tede2.pucrs.br/tede2/handle/tede/6031
dc.language.iso.fl_str_mv por
language por
dc.relation.program.fl_str_mv 1974996533081274470
dc.relation.confidence.fl_str_mv 600
600
600
600
600
dc.relation.department.fl_str_mv -3008542510401149144
dc.relation.cnpq.fl_str_mv 3671711205811204509
dc.relation.sponsorship.fl_str_mv -3614735573891122254
2075167498588264571
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Pontifícia Universidade Católica do Rio Grande do Sul
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv PUCRS
dc.publisher.country.fl_str_mv Brazil
dc.publisher.department.fl_str_mv Faculdade de Informática
publisher.none.fl_str_mv Pontifícia Universidade Católica do Rio Grande do Sul
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da PUC_RS
instname:Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
instacron:PUC_RS
instname_str Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
instacron_str PUC_RS
institution PUC_RS
reponame_str Biblioteca Digital de Teses e Dissertações da PUC_RS
collection Biblioteca Digital de Teses e Dissertações da PUC_RS
bitstream.url.fl_str_mv http://tede2.pucrs.br/tede2/bitstream/tede/6031/2/468945+-+Texto+Completo.pdf
http://tede2.pucrs.br/tede2/bitstream/tede/6031/4/468945+-+Texto+Completo.pdf.jpg
http://tede2.pucrs.br/tede2/bitstream/tede/6031/3/468945+-+Texto+Completo.pdf.txt
http://tede2.pucrs.br/tede2/bitstream/tede/6031/1/license.txt
bitstream.checksum.fl_str_mv d89826d93dc36151af60cc921ecaecfb
3e64e247a4c0aa7c36341815184b82a2
88d4acde21bfdc5f6efaee56e039eddc
5a9d6006225b368ef605ba16b4f6d1be
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da PUC_RS - Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
repository.mail.fl_str_mv biblioteca.central@pucrs.br
_version_ 1799765312142311424