A deep learning relation extraction approach to support a biomedical semi-automatic curation task

Pérez-Pérez, Martín; Ferreira, Tânia; Igrejas, Gilberto; Fdez-Riverola, Florentino

Bibliographic details
Main author:	Pérez-Pérez, Martín
Publication date:	2022
Other authors:	Ferreira, Tânia; Igrejas, Gilberto; Fdez-Riverola, Florentino
Document type:	Article
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10362/151080
Abstract:	SING group thanks CITI (Centro de Investigación, Transferencia e Innovación) from the University of Vigo for hosting its IT infrastructure. the Consellería de Educación, Universidades e Formación Profesional (Xunta de Galicia) under the scope of the strategic funding of ED431C2018/55-GRC Competitive Reference Group, the “Centro singular de investigación de Galicia” (accreditation 2019-2022) funded by the European Regional Development Fund (ERDF)-Ref. ED431G2019/06. The authors also acknowledge the postdoctoral fellowship [ED481B-2019-032] of Martín Pérez-Pérez, funded by Xunta de Galicia. Funding for open access charge: Universidade de Vigo/CISUG. Publisher Copyright: © 2022 The Author(s)

Item metadata

id	RCAP_daa369eec5e9e48dba96e7004116169e
oai_identifier_str	oai:run.unl.pt:10362/151080
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
spelling	A deep learning relation extraction approach to support a biomedical semi-automatic curation task: The case of the gluten bibliome; Deep learning; Gluten; Literature curation; Ontology-based methods; Relation extraction; Text mining; Engineering(all); Computer Science Applications; Artificial Intelligence
SING group thanks CITI (Centro de Investigación, Transferencia e Innovación) from the University of Vigo for hosting its IT infrastructure. the Consellería de Educación, Universidades e Formación Profesional (Xunta de Galicia) under the scope of the strategic funding of ED431C2018/55-GRC Competitive Reference Group, the “Centro singular de investigación de Galicia” (accreditation 2019-2022) funded by the European Regional Development Fund (ERDF)-Ref. ED431G2019/06. The authors also acknowledge the postdoctoral fellowship [ED481B-2019-032] of Martín Pérez-Pérez, funded by Xunta de Galicia. Funding for open access charge: Universidade de Vigo/CISUG. Publisher Copyright: © 2022 The Author(s)
Discovering relevant biomedical interactions in the literature is crucial for enhancing biology research. This curation process plays an essential role in studying the reported processes and interactions that affect biological processes (e.g., genome, metabolome, and transcriptome). In this sense, the objective of this work is twofold: to reduce the manual effort required to curate and review the existing biochemical interactions reported in the gluten-related bibliome, and to propose a novel vector space integrated into a deep learning model that assists manual curators in a real curation task by learning from their previous decisions.
With this objective, the present work proposes a novel vector space that combines (i) high-level lexical and syntactic inference features such as WordNets and health-related domain ontologies, (ii) unsupervised semantic resources such as word embeddings, (iii) semantic and syntactic sentence knowledge, (iv) abbreviation resolution support, (v) several state-of-the-art named-entity recognition methods, and, finally, (vi) different feature construction and optimization techniques to support a semi-automatic curation workflow. Applying the proposed workflow to a classified set of 2,451 relevant gluten-related documents produced a total of 8,349 relevant and 471,813 irrelevant relations distributed across thirteen health-related domain categories. Experimental results showed that the proposed workflow is a valuable approach for a semi-automatic relation extraction task. It obtained satisfactory results in the early stages of a real-world curation task and saved manual annotation effort by learning from the decisions made by manual curators in iterative annotation rounds. The average F-score across the proposed relation categories was 0.731, with the lowest at 0.47 and the highest at 0.929.
The different resources used in this work, as well as the manually curated corpus, are publicly available in our GitHub repository.
LAQV@REQUIMTE; RUN; Pérez-Pérez, Martín; Ferreira, Tânia; Igrejas, Gilberto; Fdez-Riverola, Florentino; 2023-03-22T22:29:05Z; 2022-06; 2022-06-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; 18; application/pdf; http://hdl.handle.net/10362/151080; eng; 0957-4174; PURE: 56627517; https://doi.org/10.1016/j.eswa.2022.116616; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2024-03-11T05:33:31Z; oai:run.unl.pt:10362/151080; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-20T03:54:27.189257; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false
dc.title.none.fl_str_mv	A deep learning relation extraction approach to support a biomedical semi-automatic curation task: The case of the gluten bibliome
title	A deep learning relation extraction approach to support a biomedical semi-automatic curation task
spellingShingle	A deep learning relation extraction approach to support a biomedical semi-automatic curation task; Pérez-Pérez, Martín; Deep learning; Gluten; Literature curation; Ontology-based methods; Relation extraction; Text mining; Engineering(all); Computer Science Applications; Artificial Intelligence
title_short	A deep learning relation extraction approach to support a biomedical semi-automatic curation task
title_full	A deep learning relation extraction approach to support a biomedical semi-automatic curation task
title_fullStr	A deep learning relation extraction approach to support a biomedical semi-automatic curation task
title_full_unstemmed	A deep learning relation extraction approach to support a biomedical semi-automatic curation task
title_sort	A deep learning relation extraction approach to support a biomedical semi-automatic curation task
author	Pérez-Pérez, Martín
author_facet	Pérez-Pérez, Martín; Ferreira, Tânia; Igrejas, Gilberto; Fdez-Riverola, Florentino
author_role	author
author2	Ferreira, Tânia; Igrejas, Gilberto; Fdez-Riverola, Florentino
author2_role	author author author
dc.contributor.none.fl_str_mv	LAQV@REQUIMTE RUN
dc.contributor.author.fl_str_mv	Pérez-Pérez, Martín; Ferreira, Tânia; Igrejas, Gilberto; Fdez-Riverola, Florentino
dc.subject.por.fl_str_mv	Deep learning; Gluten; Literature curation; Ontology-based methods; Relation extraction; Text mining; Engineering(all); Computer Science Applications; Artificial Intelligence
topic	Deep learning; Gluten; Literature curation; Ontology-based methods; Relation extraction; Text mining; Engineering(all); Computer Science Applications; Artificial Intelligence
description	SING group thanks CITI (Centro de Investigación, Transferencia e Innovación) from the University of Vigo for hosting its IT infrastructure. the Consellería de Educación, Universidades e Formación Profesional (Xunta de Galicia) under the scope of the strategic funding of ED431C2018/55-GRC Competitive Reference Group, the “Centro singular de investigación de Galicia” (accreditation 2019-2022) funded by the European Regional Development Fund (ERDF)-Ref. ED431G2019/06. The authors also acknowledge the postdoctoral fellowship [ED481B-2019-032] of Martín Pérez-Pérez, funded by Xunta de Galicia. Funding for open access charge: Universidade de Vigo/CISUG. Publisher Copyright: © 2022 The Author(s)
publishDate	2022
dc.date.none.fl_str_mv	2022-06 2022-06-01T00:00:00Z 2023-03-22T22:29:05Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10362/151080
url	http://hdl.handle.net/10362/151080
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	0957-4174 PURE: 56627517 https://doi.org/10.1016/j.eswa.2022.116616
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	18 application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799138133074247680
