BioTextRetriever: A Tool to Retrieve Relevant Papers.
Main author: | Célia Talma Gonçalves |
---|---|
Publication date: | 2011 |
Other authors: | Rui Camacho, Eugénio Oliveira |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | https://hdl.handle.net/10216/67120 |
Abstract: | Whenever new DNA or protein sequences are decoded, it is almost compulsory to look at similar sequences and at the papers describing them, both to collect relevant information about the function and activity of the new sequences and to learn what is already known about similar sequences. Current web sites and databases of sequences usually link a set of curated paper references to each sequence. Those links are a good starting point when looking for information relevant to a set of sequences. One way to implement such an approach is to run a BLAST search with the newly decoded sequences and collect similar sequences. Then one looks at the papers linked to the similar sequences. Most often the number of retrieved papers is small, and one has to search large databases for relevant papers. This paper proposes a process for generating a classifier based on the initial set of relevant papers. First, the authors collect similar sequences using an alignment algorithm such as BLAST. Then, the authors use the enlarged set of papers to construct a classifier. Finally, the classifier is used to automatically enlarge the set of relevant papers by searching MEDLINE. |
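
The classifier step described in the abstract can be illustrated with a minimal sketch. This record does not say which learning algorithm or features the paper uses, so the multinomial naive Bayes model below is a stand-in chosen for brevity, not the authors' method. The function names (`tokenize`, `train`, `log_odds`, `expand`) and the toy abstracts are hypothetical. The sketch also assumes a set of irrelevant papers is available as negative examples, and it omits the BLAST search and the MEDLINE retrieval entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(relevant, irrelevant):
    """Fit a multinomial naive Bayes model on two lists of abstracts."""
    docs = {"relevant": relevant, "irrelevant": irrelevant}
    counts = {label: Counter() for label in docs}
    for label, texts in docs.items():
        for text in texts:
            counts[label].update(tokenize(text))
    vocab = set(counts["relevant"]) | set(counts["irrelevant"])
    total_docs = len(relevant) + len(irrelevant)
    priors = {label: math.log(len(texts) / total_docs)
              for label, texts in docs.items()}
    return {"counts": counts, "vocab": vocab, "priors": priors}


def log_odds(model, text):
    """Return log P(relevant | text) minus log P(irrelevant | text)."""
    vocab_size = len(model["vocab"])
    scores = {}
    for label, counter in model["counts"].items():
        total = sum(counter.values())
        score = model["priors"][label]
        for token in tokenize(text):
            if token in model["vocab"]:  # ignore words never seen in training
                # Laplace (add-one) smoothing avoids zero probabilities.
                score += math.log((counter[token] + 1) / (total + vocab_size))
        scores[label] = score
    return scores["relevant"] - scores["irrelevant"]


def expand(model, candidates):
    """Keep the candidate abstracts that the model scores as relevant."""
    return [text for text in candidates if log_odds(model, text) > 0]


# Toy data (hypothetical): seed papers linked to similar sequences,
# plus assumed negative examples.
seed_relevant = [
    "kinase protein sequence binding domain",
    "protein sequence homology and enzyme function",
]
seed_irrelevant = [
    "patient survey of hospital waiting times",
    "clinical trial of a dietary intervention",
]
model = train(seed_relevant, seed_irrelevant)

# Stand-ins for abstracts that would be fetched from MEDLINE.
candidates = [
    "enzyme function of a novel protein domain",
    "hospital survey of patient diet",
]
selected = expand(model, candidates)
print(selected)
```

With this toy data only the first candidate is kept. In practice the seed set would come from the papers linked to the BLAST hits, and the candidates from a MEDLINE query.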
id |
RCAP_548b52cc4c9c05130886cf7c3002cad6 |
---|---|
oai_identifier_str |
oai:repositorio-aberto.up.pt:10216/67120 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
author |
Célia Talma Gonçalves |
author_role |
author |
author2 |
Rui Camacho Eugénio Oliveira |
author2_role |
author author |
dc.subject.por.fl_str_mv |
Ciências da computação e da informação Computer and information sciences |
publishDate |
2011 |
dc.date.none.fl_str_mv |
2011 2011-01-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10216/67120 |
dc.language.iso.fl_str_mv |
eng |
dc.relation.none.fl_str_mv |
1947-9115 10.4018/jkdb.2011070102 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
_version_ |
1799136023865720832 |