ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil
Main author: | Pedro, Gabriela Wick |
---|---|
Publication date: | 2018 |
Document type: | Dissertation |
Language: | por |
Source title: | Repositório Institucional da UFSCAR |
Full text: | https://repositorio.ufscar.br/handle/ufscar/10710 |
Abstract: | Opinions on the Web have been growing steadily and have thus attracted interest in fields such as Linguistics and Computing. In this context arises Sentiment Analysis, or Opinion Mining, which aims to computationally analyze the opinions, emotions, feelings and subjectivity present in texts (LIU, 2012). However, certain subjective sentences can carry irony, which transforms the meaning of a sentence. This master's dissertation investigates expressions of irony in social media, focusing on the description of linguistic devices as cues to irony in opinion texts in Brazilian Portuguese. To understand how this figurative mechanism works, we start from a corpus built from comments on news stories from the Folha de S. Paulo portal. In addition, drawing on pragmatic and cognitive theories, we developed a corpus annotation scheme for opinions and their intentions: ironic, other types of irony, or non-ironic. As a result, we obtained a list of subcategories that characterize expressions of irony, which can contribute to the development of NLP and Sentiment Analysis and, in addition, help improve tools for automatic opinion identification through the descriptions and linguistic resources developed here. |
id |
SCAR_0c00ed46fc7834b0e3f4a1defd0c2d6a |
---|---|
oai_identifier_str |
oai:repositorio.ufscar.br:ufscar/10710 |
network_acronym_str |
SCAR |
network_name_str |
Repositório Institucional da UFSCAR |
repository_id_str |
4322 |
spelling |
Pedro, Gabriela Wick; Vale, Oto Araújo; http://lattes.cnpq.br/2277403284693571; http://lattes.cnpq.br/3367416478527735; a9f2dff2-a7d7-44be-9d91-114f346f0895; 2018-11-26T16:12:48Z; 2018-11-26T16:12:48Z; 2018-03-16; PEDRO, Gabriela Wick. ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil. 2018. Dissertação (Mestrado em Linguística) – Universidade Federal de São Carlos, São Carlos, 2018. Disponível em: https://repositorio.ufscar.br/handle/ufscar/10710.; https://repositorio.ufscar.br/handle/ufscar/10710; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES); por; Universidade Federal de São Carlos; Câmpus São Carlos; Programa de Pós-Graduação em Linguística - PPGL; UFSCar; Análise de sentimentos; Linguística de corpus; Ironia; Opinião pública; Processamento de linguagem natural (Computação); Sentiment analysis; Public opinion; Natural language processing (Computer science); Irony; Corpora (Linguistics); LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA; ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil; ComentCorpus: identification and linguistic cues for detection of irony in Brazilian Portuguese; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; Online; 600 600; 85b6c37a-aa0c-4ee5-9acc-50f6bc172f4f; info:eu-repo/semantics/openAccess; reponame:Repositório Institucional da UFSCAR instname:Universidade Federal de São Carlos (UFSCAR) instacron:UFSCAR; ORIGINAL: GWP_Dissertação.pdf (Dissertação Final, application/pdf, 1997047); LICENSE: license.txt (text/plain; charset=utf-8, 1957); TEXT: GWP_Dissertação.pdf.txt (Extracted text, text/plain, 182400); THUMBNAIL: GWP_Dissertação.pdf.jpg (IM Thumbnail, image/jpeg, 10583); ufscar/10710; 2023-09-18 18:31:18.067; oai:repositorio.ufscar.br:ufscar/10710; Repositório Institucional; PUB; https://repositorio.ufscar.br/oai/request; opendoar:4322; 2023-09-18T18:31:18; Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR); false |
dc.title.por.fl_str_mv |
ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil |
dc.title.alternative.eng.fl_str_mv |
ComentCorpus: identification and linguistic cues for detection of irony in Brazilian Portuguese |
title |
ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil |
spellingShingle |
ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil Pedro, Gabriela Wick Análise de sentimentos Linguística de corpus Ironia Opinião pública Processamento de linguagem natural (Computação) Sentiment analysis Public opinion Natural language processing (Computer science) Irony Corpora (Linguistics) LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA |
title_short |
ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil |
title_full |
ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil |
title_fullStr |
ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil |
title_full_unstemmed |
ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil |
title_sort |
ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil |
author |
Pedro, Gabriela Wick |
author_facet |
Pedro, Gabriela Wick |
author_role |
author |
dc.contributor.authorlattes.por.fl_str_mv |
http://lattes.cnpq.br/3367416478527735 |
dc.contributor.author.fl_str_mv |
Pedro, Gabriela Wick |
dc.contributor.advisor1.fl_str_mv |
Vale, Oto Araújo |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/2277403284693571 |
dc.contributor.authorID.fl_str_mv |
a9f2dff2-a7d7-44be-9d91-114f346f0895 |
contributor_str_mv |
Vale, Oto Araújo |
dc.subject.por.fl_str_mv |
Análise de sentimentos Linguística de corpus Ironia Opinião pública Processamento de linguagem natural (Computação) |
topic |
Análise de sentimentos Linguística de corpus Ironia Opinião pública Processamento de linguagem natural (Computação) Sentiment analysis Public opinion Natural language processing (Computer science) Irony Corpora (Linguistics) LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA |
dc.subject.eng.fl_str_mv |
Sentiment analysis Public opinion Natural language processing (Computer science) Irony Corpora (Linguistics) |
dc.subject.cnpq.fl_str_mv |
LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA |
description |
Opinions on the Web have been growing steadily and have thus attracted interest in fields such as Linguistics and Computing. In this context arises Sentiment Analysis, or Opinion Mining, which aims to computationally analyze the opinions, emotions, feelings and subjectivity present in texts (LIU, 2012). However, certain subjective sentences can carry irony, which transforms the meaning of a sentence. This master's dissertation investigates expressions of irony in social media, focusing on the description of linguistic devices as cues to irony in opinion texts in Brazilian Portuguese. To understand how this figurative mechanism works, we start from a corpus built from comments on news stories from the Folha de S. Paulo portal. In addition, drawing on pragmatic and cognitive theories, we developed a corpus annotation scheme for opinions and their intentions: ironic, other types of irony, or non-ironic. As a result, we obtained a list of subcategories that characterize expressions of irony, which can contribute to the development of NLP and Sentiment Analysis and, in addition, help improve tools for automatic opinion identification through the descriptions and linguistic resources developed here. |
publishDate |
2018 |
dc.date.accessioned.fl_str_mv |
2018-11-26T16:12:48Z |
dc.date.available.fl_str_mv |
2018-11-26T16:12:48Z |
dc.date.issued.fl_str_mv |
2018-03-16 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
PEDRO, Gabriela Wick. ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil. 2018. Dissertação (Mestrado em Linguística) – Universidade Federal de São Carlos, São Carlos, 2018. Disponível em: https://repositorio.ufscar.br/handle/ufscar/10710. |
dc.identifier.uri.fl_str_mv |
https://repositorio.ufscar.br/handle/ufscar/10710 |
identifier_str_mv |
PEDRO, Gabriela Wick. ComentCorpus: identificação e pistas linguísticas para detecção de ironia no português do Brasil. 2018. Dissertação (Mestrado em Linguística) – Universidade Federal de São Carlos, São Carlos, 2018. Disponível em: https://repositorio.ufscar.br/handle/ufscar/10710. |
url |
https://repositorio.ufscar.br/handle/ufscar/10710 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.relation.confidence.fl_str_mv |
600 600 |
dc.relation.authority.fl_str_mv |
85b6c37a-aa0c-4ee5-9acc-50f6bc172f4f |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de São Carlos Câmpus São Carlos |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Linguística - PPGL |
dc.publisher.initials.fl_str_mv |
UFSCar |
publisher.none.fl_str_mv |
Universidade Federal de São Carlos Câmpus São Carlos |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFSCAR instname:Universidade Federal de São Carlos (UFSCAR) instacron:UFSCAR |
instname_str |
Universidade Federal de São Carlos (UFSCAR) |
instacron_str |
UFSCAR |
institution |
UFSCAR |
reponame_str |
Repositório Institucional da UFSCAR |
collection |
Repositório Institucional da UFSCAR |
bitstream.url.fl_str_mv |
https://repositorio.ufscar.br/bitstream/ufscar/10710/1/GWP_Dissertac%cc%a7a%cc%83o.pdf https://repositorio.ufscar.br/bitstream/ufscar/10710/3/license.txt https://repositorio.ufscar.br/bitstream/ufscar/10710/4/GWP_Dissertac%cc%a7a%cc%83o.pdf.txt https://repositorio.ufscar.br/bitstream/ufscar/10710/5/GWP_Dissertac%cc%a7a%cc%83o.pdf.jpg |
bitstream.checksum.fl_str_mv |
be18319417702dcfce7375301b97c8a5 ae0398b6f8b235e40ad82cba6c50031d 29b8880855f692df478c91cf3c3a103f d5ee1acab3c181094411bd1a6eb737cc |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR) |
repository.mail.fl_str_mv |
|
_version_ |
1802136349723066368 |