Querying semantic catalogues of biomedical databases
Main author: | Pereira, Arnaldo |
---|---|
Publication date: | 2023 |
Other authors: | Almeida, Joao Rafael; Lopes, Rui Pedro; Oliveira, José Luís |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10198/1159 |
Abstract: | Background: Secondary use of health data is a valuable source of knowledge that boosts observational studies, leading to important discoveries in the medical and biomedical sciences. The fundamental guiding principle for performing a successful observational study is the research question and the approach in advance of executing a study. However, in multi-centre studies, finding suitable datasets to support the study is challenging, time-consuming, and sometimes impossible without a deep understanding of each dataset. Methods: We propose a strategy for retrieving biomedical datasets of interest that were semantically annotated, using an interface built by applying a methodology for transforming natural language questions into formal language queries. The advantages of creating biomedical semantic data are enhanced by using natural language interfaces to issue complex queries without manipulating a logical query language. Results: Our methodology was validated using Alzheimer's disease datasets published in a European platform for sharing and reusing biomedical data. We converted data to semantic information format using biomedical ontologies in everyday use in the biomedical community and published it as a FAIR endpoint. We have considered natural language questions of three types: single-concept questions, questions with exclusion criteria, and multi-concept questions. Finally, we analysed the performance of the question-answering module we used and its limitations. The source code is publicly available at https://bioinformatics-ua.github.io/BioKBQA/. Conclusion: We propose a strategy for using information extracted from biomedical data and transformed into a semantic format using open biomedical ontologies. Our method uses natural language to formulate questions to be answered by this semantic data without the direct use of formal query languages. |
id |
RCAP_1d777db5473a835c66ad4b46bb5d681d |
---|---|
oai_identifier_str |
oai:bibliotecadigital.ipb.pt:10198/1159 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Querying semantic catalogues of biomedical databases |
title |
Querying semantic catalogues of biomedical databases |
author |
Pereira, Arnaldo |
author_facet |
Pereira, Arnaldo; Almeida, Joao Rafael; Lopes, Rui Pedro; Oliveira, José Luís |
author_role |
author |
author2 |
Almeida, Joao Rafael; Lopes, Rui Pedro; Oliveira, José Luís |
author2_role |
author; author; author |
dc.contributor.none.fl_str_mv |
Biblioteca Digital do IPB |
dc.contributor.author.fl_str_mv |
Pereira, Arnaldo; Almeida, Joao Rafael; Lopes, Rui Pedro; Oliveira, José Luís |
dc.subject.por.fl_str_mv |
Biomedical data; Knowledge bases; Semantic data; Linked data; Information extraction; Natural language interfaces; Question answering |
topic |
Biomedical data; Knowledge bases; Semantic data; Linked data; Information extraction; Natural language interfaces; Question answering |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2009-04-23T14:18:23Z; 2023; 2023-01-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10198/1159 |
url |
http://hdl.handle.net/10198/1159 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Pereira, Arnaldo; Almeida, Joao Rafael; Lopes, Rui Pedro; Oliveira, Jose Luis. (2023). Querying semantic catalogues of biomedical databases. Journal of Biomedical Informatics. eISSN 1532-0480. 137, p. 1-12; ISSN 1532-0464; DOI 10.1016/j.jbi.2022.104272; eISSN 1532-0480 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Elsevier |
publisher.none.fl_str_mv |
Elsevier |
dc.source.none.fl_str_mv |
reponame: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname: Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron: RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |