A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era
Main author: | Pérez-Pérez, Martín |
---|---|
Publication date: | 2021 |
Other authors: | Igrejas, Gilberto; Fdez-Riverola, Florentino; Lourenço, Anália |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/1822/73523 |
Abstract: | Journal pre proof |
id |
RCAP_20e3044ee55e060b4056e9bf0ad5c11f |
---|---|
oai_identifier_str |
oai:repositorium.sdum.uminho.pt:1822/73523 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era Social media Sociome profiling Text mining Graph mining Machine learning Health for informatics Ciências Médicas::Biotecnologia Médica Science & Technology Journal pre proof The importance and potential of big data are becoming increasingly relevant, driven by the explosive growth in the volume of information generated on the Internet in recent years. In this sense, many experts agree that social media networks are among the fastest-growing areas of the Internet in recent years and among the fields expected to grow most significantly in the coming years. Similarly, social media sites are quickly becoming one of the most popular platforms to discuss health issues and exchange social support with others. In this context, this work presents a new methodology to process, classify, visualise and analyse the big data knowledge produced by the sociome on social media platforms. The methodology combines natural language processing techniques, ontology-based named entity recognition methods, machine learning algorithms and graph mining techniques to: (i) reduce irrelevant messages by focusing the analysis only on individuals and patient experiences in the public discussion; (ii) reduce, through the use of domain ontologies, the lexical noise produced by the different ways in which users express themselves; (iii) infer the demographic data of the individuals through the combined analysis of textual, geographical and visual profile information; (iv) perform community detection and evaluate the health topic under study by combining the semantic processing of the public discourse with knowledge graph representation techniques; and (v) gain information about the shared resources by combining social media statistics with the semantic analysis of the web contents. 
The practical relevance of the proposed methodology is demonstrated in a study of 1.1 million unique messages from more than 400,000 distinct users, related to one of the most popular dietary fads, which has evolved into a multibillion-dollar industry, i.e., gluten-free food. In addition, this work analysed one of the least studied public health research fields on Twitter (i.e., allergies or immunological diseases such as celiac disease), reaching a wide range of health-related conclusions. SING group thanks CITI (Centro de Investigacion, Transferencia e Innovacion) from the University of Vigo for hosting its IT infrastructure. This work was supported by: the Associate Laboratory for Green Chemistry - LAQV, which is financed by national funds from and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of [UIDB/50006/2020] and [UIDB/04469/2020] units, and BioTecNorte operation [NORTE010145FEDER000004] funded by the European Regional Development Fund under the scope of Norte2020 - Programa Operacional Regional do Norte, the Xunta de Galicia (Centro singular de investigacion de Galicia accreditation 2019-2022) and the European Union (European Regional Development Fund - ERDF) - Ref. [ED431G2019/06], and Conselleria de Educacion, Universidades e Formacion Profesional (Xunta de Galicia) under the scope of the strategic funding of [ED431C2018/55GRC] Competitive Reference Group. The authors also acknowledge the post-doctoral fellowship [ED481B2019032] of Martín Pérez-Pérez, funded by the Xunta de Galicia. 
Funding for open access charge: Universidade de Vigo/CISUG info:eu-repo/semantics/publishedVersion Elsevier Universidade do Minho Pérez-Pérez, Martín Igrejas, Gilberto Fdez-Riverola, Florentino Lourenço, Anália 2021-08 2021-08-01T00:00:00Z info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/article application/pdf http://hdl.handle.net/1822/73523 eng Pérez-Pérez, Martín; Igrejas, Gilberto; Fdez-Riverola, Florentino; Lourenço, Anália, A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era. Artificial Intelligence in Medicine, 118(102131), 2021 0933-3657 10.1016/j.artmed.2021.102131 34412847 https://www.sciencedirect.com/science/article/pii/S093336572100124X info:eu-repo/semantics/openAccess reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP 2023-07-21T12:50:32Z oai:repositorium.sdum.uminho.pt:1822/73523 Portal Agregador ONG https://www.rcaap.pt/oai/openaire opendoar:7160 2024-03-19T19:49:15.384155 Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação false |
dc.title.none.fl_str_mv |
A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era |
title |
A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era |
spellingShingle |
A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era Pérez-Pérez, Martín Social media Sociome profiling Text mining Graph mining Machine learning Health for informatics Ciências Médicas::Biotecnologia Médica Science & Technology |
title_short |
A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era |
title_full |
A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era |
title_fullStr |
A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era |
title_full_unstemmed |
A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era |
title_sort |
A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era |
author |
Pérez-Pérez, Martín |
author_facet |
Pérez-Pérez, Martín Igrejas, Gilberto Fdez-Riverola, Florentino Lourenço, Anália |
author_role |
author |
author2 |
Igrejas, Gilberto Fdez-Riverola, Florentino Lourenço, Anália |
author2_role |
author author author |
dc.contributor.none.fl_str_mv |
Universidade do Minho |
dc.contributor.author.fl_str_mv |
Pérez-Pérez, Martín Igrejas, Gilberto Fdez-Riverola, Florentino Lourenço, Anália |
dc.subject.por.fl_str_mv |
Social media Sociome profiling Text mining Graph mining Machine learning Health for informatics Ciências Médicas::Biotecnologia Médica Science & Technology |
topic |
Social media Sociome profiling Text mining Graph mining Machine learning Health for informatics Ciências Médicas::Biotecnologia Médica Science & Technology |
description |
Journal pre proof |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-08 2021-08-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/1822/73523 |
url |
http://hdl.handle.net/1822/73523 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Pérez-Pérez, Martín; Igrejas, Gilberto; Fdez-Riverola, Florentino; Lourenço, Anália, A framework to extract biomedical knowledge from gluten-related tweets: the case of dietary concerns in digital era. Artificial Intelligence in Medicine, 118(102131), 2021 0933-3657 10.1016/j.artmed.2021.102131 34412847 https://www.sciencedirect.com/science/article/pii/S093336572100124X |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Elsevier |
publisher.none.fl_str_mv |
Elsevier |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799133072859332608 |