@Note: a workbench for biomedical text mining
Main author: | Lourenço, Anália |
---|---|
Publication date: | 2009 |
Other authors: | Carreira, Rafael; Carneiro, S.; Maia, Paulo; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Ferreira, Eugénio C.; Rocha, I.; Rocha, Miguel |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | https://hdl.handle.net/1822/9416 |
Abstract: | Biomedical Text Mining (BioTM) is providing valuable approaches to the automated curation of scientific literature. However, most efforts have addressed the benchmarking of new algorithms rather than user operational needs. Bridging the gap between BioTM researchers and biologists’ needs is crucial to solving real-world problems and promoting further research. We present @Note, a platform for BioTM that aims at the effective translation of the advances between three distinct classes of users: biologists, text miners and software developers. Its main functional contributions are the ability to process abstracts and full texts; an information retrieval module enabling PubMed search and journal crawling; a pre-processing module with PDF-to-text conversion, tokenisation and stopword removal; a semantic annotation schema; a lexicon-based annotator; a user-friendly annotation view that allows annotations to be corrected; and a Text Mining Module supporting dataset preparation and algorithm evaluation. @Note improves interoperability, modularity and flexibility when integrating in-house and open-source third-party components. Its component-based architecture allows the rapid development of new applications, emphasizing the principles of transparency and simplicity of use. Although development is still ongoing, @Note has already enabled applications that are currently in use. |
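
The abstract names tokenisation, stopword removal and a lexicon-based annotator as parts of the platform. The sketch below illustrates those three steps in Python under stated assumptions: it is not @Note's implementation, and the stopword list, the lexicon entries, the class labels and all function names (`tokenise`, `remove_stopwords`, `annotate`) are hypothetical and chosen only for illustration. Annotation here uses a greedy longest-match lookup over token n-grams, which is one common way to build a lexicon-based annotator.

```python
import re

# Hypothetical resources for illustration only; neither comes from @Note.
STOPWORDS = {"the", "of", "in", "and", "a", "is", "by"}
LEXICON = {
    "escherichia coli": "organism",
    "glucose": "compound",
    "reca": "gene",
}

# Words, optionally joined by internal hyphens.
TOKEN_RE = re.compile(r"\w+(?:-\w+)*")


def tokenise(text):
    """Return (token, start, end) triples carrying character offsets."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN_RE.finditer(text)]


def remove_stopwords(tokens):
    """Drop tokens whose lower-cased form is in the stopword list."""
    return [t for t in tokens if t[0].lower() not in STOPWORDS]


def annotate(text, lexicon=LEXICON, max_len=3):
    """Greedy longest-match lookup of token n-grams against the lexicon.

    Matching runs on the unfiltered tokens so that multi-word terms
    are not broken apart by stopword removal.
    """
    tokens = tokenise(text)
    annotations = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = " ".join(t[0].lower() for t in tokens[i:i + n])
            if key in lexicon:
                start, end = tokens[i][1], tokens[i + n - 1][2]
                annotations.append((text[start:end], lexicon[key], start, end))
                i += n
                break
        else:
            # No lexicon entry starts at this token; move on.
            i += 1
    return annotations


if __name__ == "__main__":
    sentence = "The recA gene of Escherichia coli is induced by glucose."
    print([t[0] for t in remove_stopwords(tokenise(sentence))])
    for surface, label, start, end in annotate(sentence):
        print(f"{start}-{end}\t{label}\t{surface}")
```

Keeping character offsets on every annotation is a deliberate choice in this sketch: it is what would let an annotation view highlight a term in the original text and let a curator correct it.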
id |
RCAP_25d0f96be66dade007ad14c5baa4a436 |
---|---|
oai_identifier_str |
oai:repositorium.sdum.uminho.pt:1822/9416 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
spelling |
@Note: a workbench for biomedical text mining; Biomedical text mining; Named entity recognition; Information retrieval; Information extraction; Literature curation; Semantic annotation; Component-based software development; Science & Technology; Biomedical Text Mining (BioTM) is providing valuable approaches to the automated curation of scientific literature. However, most efforts have addressed the benchmarking of new algorithms rather than user operational needs. Bridging the gap between BioTM researchers and biologists’ needs is crucial to solving real-world problems and promoting further research. We present @Note, a platform for BioTM that aims at the effective translation of the advances between three distinct classes of users: biologists, text miners and software developers. Its main functional contributions are the ability to process abstracts and full texts; an information retrieval module enabling PubMed search and journal crawling; a pre-processing module with PDF-to-text conversion, tokenisation and stopword removal; a semantic annotation schema; a lexicon-based annotator; a user-friendly annotation view that allows annotations to be corrected; and a Text Mining Module supporting dataset preparation and algorithm evaluation. @Note improves interoperability, modularity and flexibility when integrating in-house and open-source third-party components. Its component-based architecture allows the rapid development of new applications, emphasizing the principles of transparency and simplicity of use. Although development is still ongoing, @Note has already enabled applications that are currently in use.; Fundação para a Ciência e a Tecnologia (FCT); Elsevier; Universidade do Minho; Lourenço, Anália; Carreira, Rafael; Carneiro, S.; Maia, Paulo; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Ferreira, Eugénio C.; Rocha, I.; Rocha, Miguel; 2009-08; 2009-08-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; application/pdf; https://hdl.handle.net/1822/9416; eng; "Journal of Biomedical Informatics". ISSN 1532-0464. 42:4 (Aug. 2009) 710-720.; 1532-0464; 10.1016/j.jbi.2009.04.002; 19393341; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2024-05-11T06:50:14Z; oai:repositorium.sdum.uminho.pt:1822/9416; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; mluisa.alvim@gmail.com; opendoar:7160; 2024-05-11T06:50:14; Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
@Note: a workbench for biomedical text mining |
title |
@Note: a workbench for biomedical text mining |
spellingShingle |
@Note: a workbench for biomedical text mining; Lourenço, Anália; Biomedical text mining; Named entity recognition; Information retrieval; Information extraction; Literature curation; Semantic annotation; Component-based software development; Science & Technology |
title_short |
@Note: a workbench for biomedical text mining |
title_full |
@Note: a workbench for biomedical text mining |
title_fullStr |
@Note: a workbench for biomedical text mining |
title_full_unstemmed |
@Note: a workbench for biomedical text mining |
title_sort |
@Note: a workbench for biomedical text mining |
author |
Lourenço, Anália |
author_facet |
Lourenço, Anália; Carreira, Rafael; Carneiro, S.; Maia, Paulo; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Ferreira, Eugénio C.; Rocha, I.; Rocha, Miguel |
author_role |
author |
author2 |
Carreira, Rafael; Carneiro, S.; Maia, Paulo; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Ferreira, Eugénio C.; Rocha, I.; Rocha, Miguel |
author2_role |
author author author author author author author author |
dc.contributor.none.fl_str_mv |
Universidade do Minho |
dc.contributor.author.fl_str_mv |
Lourenço, Anália; Carreira, Rafael; Carneiro, S.; Maia, Paulo; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Ferreira, Eugénio C.; Rocha, I.; Rocha, Miguel |
dc.subject.por.fl_str_mv |
Biomedical text mining; Named entity recognition; Information retrieval; Information extraction; Literature curation; Semantic annotation; Component-based software development; Science & Technology |
topic |
Biomedical text mining; Named entity recognition; Information retrieval; Information extraction; Literature curation; Semantic annotation; Component-based software development; Science & Technology |
description |
Biomedical Text Mining (BioTM) is providing valuable approaches to the automated curation of scientific literature. However, most efforts have addressed the benchmarking of new algorithms rather than user operational needs. Bridging the gap between BioTM researchers and biologists’ needs is crucial to solving real-world problems and promoting further research. We present @Note, a platform for BioTM that aims at the effective translation of the advances between three distinct classes of users: biologists, text miners and software developers. Its main functional contributions are the ability to process abstracts and full texts; an information retrieval module enabling PubMed search and journal crawling; a pre-processing module with PDF-to-text conversion, tokenisation and stopword removal; a semantic annotation schema; a lexicon-based annotator; a user-friendly annotation view that allows annotations to be corrected; and a Text Mining Module supporting dataset preparation and algorithm evaluation. @Note improves interoperability, modularity and flexibility when integrating in-house and open-source third-party components. Its component-based architecture allows the rapid development of new applications, emphasizing the principles of transparency and simplicity of use. Although development is still ongoing, @Note has already enabled applications that are currently in use. |
publishDate |
2009 |
dc.date.none.fl_str_mv |
2009-08; 2009-08-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/1822/9416 |
url |
https://hdl.handle.net/1822/9416 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
"Journal of Biomedical Informatics". ISSN 1532-0464. 42:4 (Aug. 2009) 710-720.; 1532-0464; 10.1016/j.jbi.2009.04.002; 19393341 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Elsevier |
publisher.none.fl_str_mv |
Elsevier |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
mluisa.alvim@gmail.com |
_version_ |
1817545108900282368 |