Análise de sentimento em artigos de opinião
Main author: | Silva, Maria de Fátima Henriques da |
---|---|
Publication date: | 2018 |
Other authors: | Silvano, Maria da Purificação; Leal, António; Oliveira, Fátima; Brazdil, Pavel; Cordeiro, João; Oliveira, Débora |
Document type: | Article |
Language: | por |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | https://hdl.handle.net/10216/120635 |
Abstract: | The present study, developed at the interface between linguistics and computer science within the framework of sentiment analysis, aims to carry out a computational analysis of opinion articles in the area of economics and finance. The main objectives of the study are: i) to determine the semantic orientation of text segments that express opinion by annotating the polarity (positive or negative) and the strength (scale from -3 to 3) of nouns and adjectives, and ii) to verify whether a specific lexicon for the area of economics and finance has advantages over a general lexicon in the automatic annotation of sentiment. To achieve these objectives, a corpus of 45 texts was selected and analyzed in 2 phases, by annotators with different training. First, a sample of 10 texts was annotated by linguists, co-authors of this paper, with the objective of developing a linguistic annotation model to ascertain the polarity and strength of words in opinion articles and to extract the relevant words for this area of study. Then, a set of 35 texts was annotated by university students, replicating the annotation model developed during the first phase. Based on the linguistic annotation, the computer science team tried to establish to what extent a general sentiment lexicon for Portuguese, SentiLex, was sufficient to extract the sentiment of a sentence in a satisfactory manner, or whether EconoLex, a specific sentiment lexicon, would be more efficient. The specific lexicon includes terms and multiword expressions that are relevant to the area of economics and finance and to the Portuguese language, and it was developed by the authors of this study. The data was analyzed according to a mixed methodology, both qualitative and quantitative. The results of the analysis allow us to consider the following items as contributions of this study: i) the development of a linguistic annotation model for the analysis of the polarity and strength of the lexicon, especially of nouns and adjectives; ii) the key, though not exclusive, role of adjectives in determining the polarity of opinion segments in the corpus articles; iii) the creation of a new specific sentiment lexicon for Portuguese in the area of economics and finance; iv) the improved performance of EconoLex⨁SentiLex over SentiLex alone in the automatic annotation of sentiment. In spite of these positive results, there are some limitations, which we intend to overcome as this interdisciplinary work continues, namely a more detailed linguistic analysis of the word classes that we studied, the consideration of other elements/linguistic structures that are essential to ascertain the sentiment in the NP/sentence, the extension of the corpus, the expansion of the specific lexicon for the area of economics and finance, and the improvement of automatic methods for identifying evaluative words in opinion texts and for assigning them polarity and strength. |
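
The comparison described in the abstract (a general lexicon alone versus a general lexicon extended with a domain-specific one) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the lexicon entries and their strengths are toy values, the merge rule (specific entries override general ones) is only one plausible reading of EconoLex⨁SentiLex, and summing strengths per sentence is a simple stand-in for whatever aggregation the authors used.

```python
import re
from typing import Dict

# Hypothetical toy entries on the -3..3 strength scale mentioned in the abstract.
# Real SentiLex / EconoLex entries and scores are NOT reproduced here.
GENERAL_LEXICON: Dict[str, int] = {"bom": 2, "grave": -2, "crise": -1}
SPECIFIC_LEXICON: Dict[str, int] = {
    "crise": -3,                  # domain-specific strength overrides the general one
    "crescimento económico": 2,   # multiword expression
    "défice": -2,
}


def combine(general: Dict[str, int], specific: Dict[str, int]) -> Dict[str, int]:
    """Merge two lexicons; on conflict the specific entry wins (an assumption)."""
    merged = dict(general)
    merged.update(specific)
    return merged


def score_sentence(sentence: str, lexicon: Dict[str, int], max_ngram: int = 3) -> int:
    """Sum the strengths of lexicon entries found in the sentence.

    Uses greedy longest-match so multiword expressions are preferred
    over the single words they contain.
    """
    tokens = re.findall(r"\w+", sentence.lower())
    total, i = 0, 0
    while i < len(tokens):
        for n in range(min(max_ngram, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                total += lexicon[phrase]
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return total


def polarity(score: int) -> str:
    """Map a summed score to a coarse polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    combined = combine(GENERAL_LEXICON, SPECIFIC_LEXICON)
    sentence = "A crise agravou o défice."
    for name, lex in (("general", GENERAL_LEXICON), ("combined", combined)):
        s = score_sentence(sentence, lex)
        print(name, s, polarity(s))
```

In this toy setup the combined lexicon recognises "défice" and the multiword entry, which the general one misses; whether that translates into better agreement with human annotation is exactly what the study evaluates on its corpus.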
id |
RCAP_88fc44b44e9c917eb382b23a92ffb1ef |
---|---|
oai_identifier_str |
oai:repositorio-aberto.up.pt:10216/120635 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
|
dc.title.none.fl_str_mv |
Análise de sentimento em artigos de opinião |
title |
Análise de sentimento em artigos de opinião |
spellingShingle |
Análise de sentimento em artigos de opinião Silva, Maria de Fátima Henriques da Linguística Linguistics |
title_short |
Análise de sentimento em artigos de opinião |
title_full |
Análise de sentimento em artigos de opinião |
title_fullStr |
Análise de sentimento em artigos de opinião |
title_full_unstemmed |
Análise de sentimento em artigos de opinião |
title_sort |
Análise de sentimento em artigos de opinião |
author |
Silva, Maria de Fátima Henriques da |
author_facet |
Silva, Maria de Fátima Henriques da Silvano, Maria da Purificação Leal, António Oliveira, Fátima Brazdil, Pavel Cordeiro, João Oliveira, Débora |
author_role |
author |
author2 |
Silvano, Maria da Purificação Leal, António Oliveira, Fátima Brazdil, Pavel Cordeiro, João Oliveira, Débora |
author2_role |
author author author author author author |
dc.contributor.author.fl_str_mv |
Silva, Maria de Fátima Henriques da Silvano, Maria da Purificação Leal, António Oliveira, Fátima Brazdil, Pavel Cordeiro, João Oliveira, Débora |
dc.subject.por.fl_str_mv |
Linguística Linguistics |
topic |
Linguística Linguistics |
publishDate |
2018 |
dc.date.none.fl_str_mv |
2018 2018-01-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10216/120635 |
url |
https://hdl.handle.net/10216/120635 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.relation.none.fl_str_mv |
1646-6195 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
|
repository.mail.fl_str_mv |
|
_version_ |
1777304310493741057 |