A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era

Bibliographic details
Main author: Pérez-Pérez, Martín
Publication date: 2021
Other authors: Igrejas, Gilberto, Fdez-Riverola, Florentino, Lourenço, Anália
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/1822/73523
Abstract: Journal pre proof
id RCAP_20e3044ee55e060b4056e9bf0ad5c11f
oai_identifier_str oai:repositorium.sdum.uminho.pt:1822/73523
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era
Social media
Sociome profiling
Text mining
Graph mining
Machine learning
Health for informatics
Ciências Médicas::Biotecnologia Médica
Science & Technology
Journal pre proof
The importance and potential of big data are increasingly relevant, driven by the explosive growth in the volume of information generated on the Internet in recent years. Many experts agree that social media networks are among the fastest-growing areas of the Internet and are expected to grow even more significantly in the coming years. Social media sites are also quickly becoming one of the most popular platforms for discussing health issues and exchanging social support. In this context, this work presents a new methodology to process, classify, visualise and analyse the big data knowledge produced by the sociome on social media platforms. The methodology combines natural language processing techniques, ontology-based named entity recognition methods, machine learning algorithms and graph mining techniques to: (i) reduce irrelevant messages by identifying individuals and patient experiences in the public discussion and focusing the analysis on them; (ii) reduce, through the use of domain ontologies, the lexical noise produced by the different ways in which users express themselves; (iii) infer the demographic data of individuals through the combined analysis of textual, geographical and visual profile information; (iv) perform community detection and evaluate the health topic under study by combining semantic processing of the public discourse with knowledge graph representation techniques; and (v) gain information about shared resources by combining social media statistics with semantic analysis of the web contents.
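As an illustration of step (ii), the following is a minimal sketch of dictionary-based concept normalisation, in which several surface forms are mapped to a single concept label. It is not the authors' implementation: the lexicon entries, the concept labels, and the function names `annotate` and `concept_counts` are hypothetical, and a real system would presumably load its synonyms from a domain ontology rather than a hand-written dictionary.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: surface form -> concept label.
# Assumption for illustration only; not taken from the paper.
LEXICON = {
    "celiac disease": "CELIAC_DISEASE",
    "coeliac disease": "CELIAC_DISEASE",
    "celiac": "CELIAC_DISEASE",
    "coeliac": "CELIAC_DISEASE",
    "gluten free": "GLUTEN_FREE_DIET",
    "gluten-free": "GLUTEN_FREE_DIET",
    "glutenfree": "GLUTEN_FREE_DIET",
    "gf": "GLUTEN_FREE_DIET",
}

# Try the longest surface forms first so that "celiac disease"
# is preferred over the shorter "celiac" at the same position.
_TERMS = sorted(LEXICON, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in _TERMS) + r")\b",
    re.IGNORECASE,
)


def annotate(text):
    """Return the concept labels found in text, in order of appearance."""
    return [LEXICON[match.group(1).lower()] for match in _PATTERN.finditer(text)]


def concept_counts(messages):
    """Count concept mentions across messages, merging lexical variants."""
    counts = Counter()
    for message in messages:
        counts.update(annotate(message))
    return counts


if __name__ == "__main__":
    sample = [
        "Diagnosed with Coeliac disease, going #glutenfree",
        "gf pizza tonight",
    ]
    print(concept_counts(sample))
```

Counting concepts instead of raw tokens means that variants such as "gf", "gluten-free" and "#glutenfree" contribute to the same total, which is the kind of noise reduction the step describes.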
The practical relevance of the proposed methodology has been demonstrated in a study of 1.1 million unique messages from more than 400,000 distinct users related to one of the most popular dietary fads, one that has evolved into a multibillion-dollar industry: gluten-free food. In addition, this work analysed one of the least studied research fields on Twitter concerning public health (i.e., allergies or immunological diseases such as celiac disease), discovering a wide range of health-related conclusions.
SING group thanks CITI (Centro de Investigacion, Transferencia e Innovacion) from the University of Vigo for hosting its IT infrastructure. This work was supported by: the Associate Laboratory for Green Chemistry-LAQV, which is financed by national funds from and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of [UIDB/50006/2020] and [UIDB/04469/2020] units, and BioTecNorte operation [NORTE010145FEDER000004] funded by the European Regional Development Fund under the scope of Norte2020 - Programa Operacional Regional do Norte, the Xunta de Galicia (Centro singular de investigacion de Galicia accreditation 2019-2022) and the European Union (European Regional Development Fund - ERDF) - Ref. [ED431G2019/06], and Conselleria de Educacion, Universidades e Formacion Profesional (Xunta de Galicia) under the scope of the strategic funding of [ED431C2018/55GRC] Competitive Reference Group. The authors also acknowledge the post-doctoral fellowship [ED481B2019032] of Martín Pérez-Pérez, funded by the Xunta de Galicia.
Funding for open access charge: Universidade de Vigo/CISUG
2023-07-21T12:50:32Z
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
opendoar:7160
2024-03-19T19:49:15.384155
dc.title.none.fl_str_mv A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era
title A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era
spellingShingle A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era
Pérez-Pérez, Martín
Social media
Sociome profiling
Text mining
Graph mining
Machine learning
Health for informatics
Ciências Médicas::Biotecnologia Médica
Science & Technology
title_short A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era
title_full A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era
title_fullStr A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era
title_full_unstemmed A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era
title_sort A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era
author Pérez-Pérez, Martín
author_facet Pérez-Pérez, Martín
Igrejas, Gilberto
Fdez-Riverola, Florentino
Lourenço, Anália
author_role author
author2 Igrejas, Gilberto
Fdez-Riverola, Florentino
Lourenço, Anália
author2_role author
author
author
dc.contributor.none.fl_str_mv Universidade do Minho
dc.contributor.author.fl_str_mv Pérez-Pérez, Martín
Igrejas, Gilberto
Fdez-Riverola, Florentino
Lourenço, Anália
dc.subject.por.fl_str_mv Social media
Sociome profiling
Text mining
Graph mining
Machine learning
Health for informatics
Ciências Médicas::Biotecnologia Médica
Science & Technology
topic Social media
Sociome profiling
Text mining
Graph mining
Machine learning
Health for informatics
Ciências Médicas::Biotecnologia Médica
Science & Technology
description Journal pre proof
publishDate 2021
dc.date.none.fl_str_mv 2021-08
2021-08-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1822/73523
url http://hdl.handle.net/1822/73523
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Pérez-Pérez, Martín; Igrejas, Gilberto; Fdez-Riverola, Florentino; Lourenço, Anália, A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era. Artificial Intelligence in Medicine, 118(102131), 2021
0933-3657
10.1016/j.artmed.2021.102131
34412847
https://www.sciencedirect.com/science/article/pii/S093336572100124X
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Elsevier
publisher.none.fl_str_mv Elsevier
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799133072859332608