SA-MAIS: Hybrid automatic sentiment analyser for stock market
Autor(a) principal: | |
---|---|
Data de Publicação: | 2023 |
Outros Autores: | , , , |
Tipo de documento: | Artigo |
Idioma: | eng |
Título da fonte: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Texto Completo: | http://hdl.handle.net/10071/30120 |
Resumo: | Sentiment analysis of stock-related tweets is a challenging task, not only due to the specificity of the domain but also because of the short nature of the texts. This work proposes SA-MAIS, a two-step lightweight methodology, specially adapted to perform sentiment analysis in domain-constrained short-text messages. To tackle the issue of domain specificity, based on word frequency, the most relevant words are automatically extracted from the new domain and then manually tagged to update an existing domain-specific sentiment lexicon. The sentiment classification is then performed by combining the updated domain-specific lexicon with VADER sentiment analysis, a well-known and widely used sentiment analysis tool. The proposed method is compared with other well-known and widely used sentiment analysis tools, including transformer-based models, such as BERTweet, Twitter-roBERTa and FinBERT, on a domain-specific corpus of stock market-related tweets comprising 1 million messages. The experimental results show that the proposed approach largely surpasses the performance of the other sentiment analysis tools, reaching an overall accuracy of 72.0%. The achieved results highlight the advantage of using a hybrid method that combines domain-specific lexicons with existing generalist tools for the inference of textual sentiment in domain-specific short-text messages. |
id |
RCAP_19ebe83dc23ecae6a5f4421acfebd99a |
---|---|
oai_identifier_str |
oai:repositorio.iscte-iul.pt:10071/30120 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
SA-MAIS: Hybrid automatic sentiment analyser for stock marketSentiment analysisSentiment classificationSentiment lexiconStock marketTweetsSentiment analysis of stock-related tweets is a challenging task, not only due to the specificity of the domain but also because of the short nature of the texts. This work proposes SA-MAIS, a two-step lightweight methodology, specially adapted to perform sentiment analysis in domain-constrained short-text messages. To tackle the issue of domain specificity, based on word frequency, the most relevant words are automatically extracted from the new domain and then manually tagged to update an existing domain-specific sentiment lexicon. The sentiment classification is then performed by combining the updated domain-specific lexicon with VADER sentiment analysis, a well-known and widely used sentiment analysis tool. The proposed method is compared with other well-known and widely used sentiment analysis tools, including transformer-based models, such as BERTweet, Twitter-roBERTa and FinBERT, on a domain-specific corpus of stock market-related tweets comprising 1 million messages. The experimental results show that the proposed approach largely surpasses the performance of the other sentiment analysis tools, reaching an overall accuracy of 72.0%. The achieved results highlight the advantage of using a hybrid method that combines domain-specific lexicons with existing generalist tools for the inference of textual sentiment in domain-specific short-text messages.SAGE2023-12-28T11:18:17Z2023-01-01T00:00:00Z20232023-12-28T11:17:39Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articleapplication/pdfhttp://hdl.handle.net/10071/30120eng0165-551510.1177/01655515231171361Taborda, B.de Almeida, A.Dias, J. C.Batista, F.Ribeiro, R.info:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2023-12-31T01:17:45Zoai:repositorio.iscte-iul.pt:10071/30120Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-20T00:56:51.835170Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse |
dc.title.none.fl_str_mv |
SA-MAIS: Hybrid automatic sentiment analyser for stock market |
title |
SA-MAIS: Hybrid automatic sentiment analyser for stock market |
spellingShingle |
SA-MAIS: Hybrid automatic sentiment analyser for stock market Taborda, B. Sentiment analysis Sentiment classification Sentiment lexicon Stock market Tweets |
title_short |
SA-MAIS: Hybrid automatic sentiment analyser for stock market |
title_full |
SA-MAIS: Hybrid automatic sentiment analyser for stock market |
title_fullStr |
SA-MAIS: Hybrid automatic sentiment analyser for stock market |
title_full_unstemmed |
SA-MAIS: Hybrid automatic sentiment analyser for stock market |
title_sort |
SA-MAIS: Hybrid automatic sentiment analyser for stock market |
author |
Taborda, B. |
author_facet |
Taborda, B. de Almeida, A. Dias, J. C. Batista, F. Ribeiro, R. |
author_role |
author |
author2 |
de Almeida, A. Dias, J. C. Batista, F. Ribeiro, R. |
author2_role |
author author author author |
dc.contributor.author.fl_str_mv |
Taborda, B. de Almeida, A. Dias, J. C. Batista, F. Ribeiro, R. |
dc.subject.por.fl_str_mv |
Sentiment analysis Sentiment classification Sentiment lexicon Stock market Tweets |
topic |
Sentiment analysis Sentiment classification Sentiment lexicon Stock market Tweets |
description |
Sentiment analysis of stock-related tweets is a challenging task, not only due to the specificity of the domain but also because of the short nature of the texts. This work proposes SA-MAIS, a two-step lightweight methodology, specially adapted to perform sentiment analysis in domain-constrained short-text messages. To tackle the issue of domain specificity, based on word frequency, the most relevant words are automatically extracted from the new domain and then manually tagged to update an existing domain-specific sentiment lexicon. The sentiment classification is then performed by combining the updated domain-specific lexicon with VADER sentiment analysis, a well-known and widely used sentiment analysis tool. The proposed method is compared with other well-known and widely used sentiment analysis tools, including transformer-based models, such as BERTweet, Twitter-roBERTa and FinBERT, on a domain-specific corpus of stock market-related tweets comprising 1 million messages. The experimental results show that the proposed approach largely surpasses the performance of the other sentiment analysis tools, reaching an overall accuracy of 72.0%. The achieved results highlight the advantage of using a hybrid method that combines domain-specific lexicons with existing generalist tools for the inference of textual sentiment in domain-specific short-text messages. |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-12-28T11:18:17Z 2023-01-01T00:00:00Z 2023 2023-12-28T11:17:39Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10071/30120 |
url |
http://hdl.handle.net/10071/30120 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
0165-5515 10.1177/01655515231171361 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
SAGE |
publisher.none.fl_str_mv |
SAGE |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799136452819288064 |