Predictive analysis of COVID-19 symptoms in social networks through machine learning

Bibliographic details
Main author: Silva, Clístenes Fernandes da
Publication date: 2022
Other authors: Junior, Arnaldo Candido; Lopes, Rui Pedro
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10198/25323
Abstract: Social media is a great source of data for analysis, since it gives people ways to share emotions, feelings, ideas, and even symptoms of diseases. By the end of 2019, a global pandemic alert was raised over a virus that had a high contamination rate and could cause respiratory complications. To help identify those who may have the symptoms of this disease or to detect who is already infected, this paper analyzed the performance of eight machine learning algorithms (KNN, Naive Bayes, Decision Tree, Random Forest, SVM, simple Multilayer Perceptron, Convolutional Neural Networks and BERT) in the search and classification of tweets that mention self-reported COVID-19 symptoms. The dataset was labeled using a set of disease symptom keywords provided by the World Health Organization. The tests showed that the Random Forest algorithm had the best results, closely followed by BERT and the Convolutional Neural Network, although traditional machine learning algorithms can also provide good results. This work could also aid in the selection of algorithms for identifying disease symptoms in social media content.
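The keyword-based labeling step mentioned in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions: the keyword list, the `label_tweet` function, and the sample tweets are hypothetical and are not taken from the paper, whose actual WHO-derived keyword set is not reproduced in this record.

```python
import re

# Hypothetical keyword list for illustration only; the WHO-derived
# keyword set used in the paper is not given in this record.
SYMPTOM_KEYWORDS = ["fever", "cough", "shortness of breath", "loss of taste"]

# One case-insensitive pattern with word boundaries, so a keyword only
# matches as a whole word or phrase (e.g. "coughing" does not match "cough").
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in SYMPTOM_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def label_tweet(text: str) -> int:
    """Return 1 if the tweet mentions any symptom keyword, else 0."""
    return 1 if _PATTERN.search(text) else 0


# Made-up sample tweets, used only to show the labeling call.
tweets = [
    "I have had a fever and a dry cough since Monday",
    "Great weather for a walk today",
]
labels = [label_tweet(t) for t in tweets]
```

Labels produced this way would then serve as targets for the classifiers the abstract compares; how the authors actually built features or split the data is not stated here.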
id RCAP_b33fbf6247d671f53d9a5ede987713c6
oai_identifier_str oai:bibliotecadigital.ipb.pt:10198/25323
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
Funding: This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope: DSAIPA/AI/0088/2020
dc.title.none.fl_str_mv Predictive analysis of COVID-19 symptoms in social networks through machine learning
dc.contributor.none.fl_str_mv Biblioteca Digital do IPB
dc.contributor.author.fl_str_mv Silva, Clístenes Fernandes da
Junior, Arnaldo Candido
Lopes, Rui Pedro
dc.subject.por.fl_str_mv Natural language processing
Machine learning
Text classification
COVID-19
Tweet analysis
publishDate 2022
dc.date.none.fl_str_mv 2022-04-04T10:28:25Z
2022
2022-01-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10198/25323
url http://hdl.handle.net/10198/25323
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Silva, Clístenes Fernandes da; Junior, Arnaldo Candido; Lopes, Rui Pedro (2022). Predictive analysis of COVID-19 symptoms in social networks through machine learning. Electronics. ISSN 2079-9292. 11:4, p. 1-14
10.3390/electronics11040580
2079-9292
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799135443921403904