ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil

Bibliographic details
Main author: Pedro, Gabriela Wick
Publication date: 2018
Document type: Master's dissertation
Language: por
Source title: Repositório Institucional da UFSCAR
Full text: https://repositorio.ufscar.br/handle/ufscar/10710
Abstract: Opinions on the Web have been increasing steadily and have thus aroused interest in fields such as Linguistics and Computing. In this context arises Sentiment Analysis, or Opinion Mining, which aims to analyze computationally the opinions, emotions, feelings and subjectivities present in texts (LIU, 2012). However, certain subjective sentences can carry irony, which transforms the meaning of a sentence. This dissertation aims to investigate expressions of irony in social media, focusing on the description of linguistic devices as clues to irony in opinion texts in Brazilian Portuguese. To understand how this figurative mechanism works, we start from a corpus built from comments on news stories from the Folha de S. Paulo portal. In addition, based on pragmatic and cognitive theories, we developed a corpus annotation scheme for opinions and their intentions: ironic, other types of irony, or non-ironic. As a result, we obtained a list of subcategories that characterize expressions of irony, which can contribute to the development of NLP and Sentiment Analysis and, in addition, help improve tools for automatic opinion identification through the descriptions and linguistic resources developed here.
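The three annotation intentions named in the abstract (ironic, other types of irony, non-ironic) can be pictured as a small label set. The sketch below is only an illustration under stated assumptions: the names IronyLabel, AnnotatedComment and label_distribution are hypothetical, the sample comments are invented placeholders, and nothing here reproduces the dissertation's actual annotation files or subcategories.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class IronyLabel(Enum):
    """The three top-level intentions listed in the abstract."""
    IRONIC = "ironic"
    OTHER_IRONY = "other types of irony"
    NON_IRONIC = "non-ironic"


@dataclass(frozen=True)
class AnnotatedComment:
    """Hypothetical container for one annotated news comment."""
    text: str
    label: IronyLabel


def label_distribution(comments):
    """Count how many comments carry each label."""
    return Counter(comment.label for comment in comments)


# Invented placeholder data, not taken from the corpus.
sample = [
    AnnotatedComment("placeholder comment A", IronyLabel.IRONIC),
    AnnotatedComment("placeholder comment B", IronyLabel.NON_IRONIC),
    AnnotatedComment("placeholder comment C", IronyLabel.IRONIC),
]
counts = label_distribution(sample)
```

A closed enumeration of this kind keeps annotators and downstream tools from introducing labels outside the scheme; any finer subcategories would need to come from the dissertation itself.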
id SCAR_0c00ed46fc7834b0e3f4a1defd0c2d6a
oai_identifier_str oai:repositorio.ufscar.br:ufscar/10710
network_acronym_str SCAR
network_name_str Repositório Institucional da UFSCAR
repository_id_str 4322
Funding agency: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
dc.title.por.fl_str_mv ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil
dc.title.alternative.eng.fl_str_mv ComentCorpus: identification and linguistic cues for detection of irony in Brazilian Portuguese
title ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil
author Pedro, Gabriela Wick
author_facet Pedro, Gabriela Wick
author_role author
dc.contributor.authorlattes.por.fl_str_mv http://lattes.cnpq.br/3367416478527735
dc.contributor.author.fl_str_mv Pedro, Gabriela Wick
dc.contributor.advisor1.fl_str_mv Vale, Oto Araújo
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/2277403284693571
dc.contributor.authorID.fl_str_mv a9f2dff2-a7d7-44be-9d91-114f346f0895
contributor_str_mv Vale, Oto Araújo
dc.subject.por.fl_str_mv Análise de sentimentos
Linguística de corpus
Ironia
Opinião pública
Processamento de linguagem natural (Computação)
dc.subject.eng.fl_str_mv Sentiment analysis
Public opinion
Natural language processing (Computer science)
Irony
Corpora (Linguistics)
dc.subject.cnpq.fl_str_mv LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA
description Opinions on the Web have been increasing steadily and have thus aroused interest in fields such as Linguistics and Computing. In this context arises Sentiment Analysis, or Opinion Mining, which aims to analyze computationally the opinions, emotions, feelings and subjectivities present in texts (LIU, 2012). However, certain subjective sentences can carry irony, which transforms the meaning of a sentence. This dissertation aims to investigate expressions of irony in social media, focusing on the description of linguistic devices as clues to irony in opinion texts in Brazilian Portuguese. To understand how this figurative mechanism works, we start from a corpus built from comments on news stories from the Folha de S. Paulo portal. In addition, based on pragmatic and cognitive theories, we developed a corpus annotation scheme for opinions and their intentions: ironic, other types of irony, or non-ironic. As a result, we obtained a list of subcategories that characterize expressions of irony, which can contribute to the development of NLP and Sentiment Analysis and, in addition, help improve tools for automatic opinion identification through the descriptions and linguistic resources developed here.
publishDate 2018
dc.date.accessioned.fl_str_mv 2018-11-26T16:12:48Z
dc.date.available.fl_str_mv 2018-11-26T16:12:48Z
dc.date.issued.fl_str_mv 2018-03-16
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv PEDRO, Gabriela Wick. ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil. 2018. Dissertação (Mestrado em Linguística) – Universidade Federal de São Carlos, São Carlos, 2018. Disponível em: https://repositorio.ufscar.br/handle/ufscar/10710.
dc.identifier.uri.fl_str_mv https://repositorio.ufscar.br/handle/ufscar/10710
identifier_str_mv PEDRO, Gabriela Wick. ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil. 2018. Dissertação (Mestrado em Linguística) – Universidade Federal de São Carlos, São Carlos, 2018. Disponível em: https://repositorio.ufscar.br/handle/ufscar/10710.
url https://repositorio.ufscar.br/handle/ufscar/10710
dc.language.iso.fl_str_mv por
language por
dc.relation.confidence.fl_str_mv 600
600
dc.relation.authority.fl_str_mv 85b6c37a-aa0c-4ee5-9acc-50f6bc172f4f
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus São Carlos
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Linguística - PPGL
dc.publisher.initials.fl_str_mv UFSCar
publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus São Carlos
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFSCAR
instname:Universidade Federal de São Carlos (UFSCAR)
instacron:UFSCAR
instname_str Universidade Federal de São Carlos (UFSCAR)
instacron_str UFSCAR
institution UFSCAR
reponame_str Repositório Institucional da UFSCAR
collection Repositório Institucional da UFSCAR
bitstream.url.fl_str_mv https://repositorio.ufscar.br/bitstream/ufscar/10710/1/GWP_Dissertac%cc%a7a%cc%83o.pdf
https://repositorio.ufscar.br/bitstream/ufscar/10710/3/license.txt
https://repositorio.ufscar.br/bitstream/ufscar/10710/4/GWP_Dissertac%cc%a7a%cc%83o.pdf.txt
https://repositorio.ufscar.br/bitstream/ufscar/10710/5/GWP_Dissertac%cc%a7a%cc%83o.pdf.jpg
bitstream.checksum.fl_str_mv be18319417702dcfce7375301b97c8a5
ae0398b6f8b235e40ad82cba6c50031d
29b8880855f692df478c91cf3c3a103f
d5ee1acab3c181094411bd1a6eb737cc
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
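The bitstream URLs and MD5 checksums above appear in the same order, so a downloaded file can in principle be checked against its listed digest. The following is a minimal sketch, assuming the file's bytes have already been fetched by some other means; the function name md5_matches is hypothetical and is not part of the repository's software.

```python
import hashlib


def md5_matches(data: bytes, expected_hex: str) -> bool:
    """Return True when the MD5 digest of `data` equals `expected_hex`."""
    actual = hashlib.md5(data).hexdigest()
    return actual == expected_hex.strip().lower()


# Usage idea (no download is performed here): read the saved file and
# compare it with the checksum listed for the same position in the record.
# with open("license.txt", "rb") as handle:
#     ok = md5_matches(handle.read(), "ae0398b6f8b235e40ad82cba6c50031d")
```

MD5 is adequate here only as an integrity check against accidental corruption; it should not be treated as protection against deliberate tampering.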
repository.name.fl_str_mv Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR)
repository.mail.fl_str_mv
_version_ 1802136349723066368