Automatic assignment of prokaryotic genes to functional categories using literature profiling.
Main author: | Torrieri, Raul |
---|---|
Publication date: | 2012 |
Other authors: | Oliveira, Francislon Silva de; Oliveira, Guilherme Corrêa de; Coimbra, Roney Santos |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da FIOCRUZ (ARCA) |
Full text: | https://www.arca.fiocruz.br/handle/icict/7868 |
Summary: | Fundação Oswaldo Cruz. Centro de Pesquisas René Rachou. Centro de Excelência em Bioinformática. Belo Horizonte, MG, Brasil |
id |
CRUZ_569e08be4ab38ac704bb82dc50808502 |
---|---|
oai_identifier_str |
oai:www.arca.fiocruz.br:icict/7868 |
network_acronym_str |
CRUZ |
network_name_str |
Repositório Institucional da FIOCRUZ (ARCA) |
repository_id_str |
2135 |
spelling |
Authors: Torrieri, Raul; Oliveira, Francislon Silva de; Oliveira, Guilherme Corrêa de; Coimbra, Roney Santos.
Citation: TORRIERI, Raul et al. Automatic assignment of prokaryotic genes to functional categories using literature profiling. Plos One. 2012, vol. 7, pp. e47436. ISSN 1932-6203. DOI 10.1371/journal.pone.0047436. Publisher: Public Library of Science. Handle: https://www.arca.fiocruz.br/handle/icict/7868
Affiliations: Fundação Oswaldo Cruz. Centro de Pesquisas René Rachou. Centro de Excelência em Bioinformática. Belo Horizonte, MG, Brasil; Fundação Oswaldo Cruz. Centro de Pesquisas René Rachou. Grupo de Biologia Genômica e Computacional. Belo Horizonte, MG, Brasil.
Abstract: In recent years, the number of publicly available genomes has increased exponentially. Once finished, most genome projects lack financial support to review annotations. A few of these gene annotations are based on a combination of bioinformatics evidence; in most cases, however, annotations are based solely on sequence similarity to a previously known gene, which was most probably annotated in the same way. As a result, a large number of predicted genes remain unassigned to any functional category even though there is enough evidence in the literature to predict their function. We developed a classifier trained with term-frequency vectors automatically derived from text corpora of an ensemble of genes representative of each functional category of the J. Craig Venter Institute Comprehensive Microbial Resource (JCVI-CMR) ontology. The classifier achieved up to 84% precision with 68% recall (for confidence ≥ 0.4) and an F-measure of 0.76 (recall and precision equally weighted) on an independent set of 2,220 genes, from 13 bacterial species, previously classified by JCVI-CMR into unambiguous categories of its ontology. Finally, the classifier assigned (confidence ≥ 0.7) to functional categories a total of 5,235 of the ~24 thousand genes previously in the categories “Unknown function” or “Unclassified” for which there is literature in MEDLINE. Two biologists reviewed the literature of 100 of these genes, randomly picked, and assigned them to the same functional categories predicted by the automatic classifier. Our results confirmed the hypothesis that it is possible to confidently assign genes of a real-world repository to functional categories, based exclusively on the automatic profiling of their associated literature.
Keywords: Gene ontologies; Gene ontology annotations; Genomic databases; Metabolic processes; Ontologies; Prokaryotic cells.
Rights: info:eu-repo/semantics/openAccess. OAI request endpoint: https://www.arca.fiocruz.br/oai/request |
dc.title.pt_BR.fl_str_mv |
Automatic assignment of prokaryotic genes to functional categories using literature profiling. |
title |
Automatic assignment of prokaryotic genes to functional categories using literature profiling. |
spellingShingle |
Automatic assignment of prokaryotic genes to functional categories using literature profiling. Torrieri, Raul Gene ontologies Gene ontology annotations Gene ontology annotations Genomic databases Metabolic processes Ontologies Prokaryotic cells |
title_short |
Automatic assignment of prokaryotic genes to functional categories using literature profiling. |
title_full |
Automatic assignment of prokaryotic genes to functional categories using literature profiling. |
title_fullStr |
Automatic assignment of prokaryotic genes to functional categories using literature profiling. |
title_full_unstemmed |
Automatic assignment of prokaryotic genes to functional categories using literature profiling. |
title_sort |
Automatic assignment of prokaryotic genes to functional categories using literature profiling. |
author |
Torrieri, Raul |
author_facet |
Torrieri, Raul Oliveira, Francislon Silva de Oliveira, Guilherme Corrêa de Coimbra, Roney Santos |
author_role |
author |
author2 |
Oliveira, Francislon Silva de Oliveira, Guilherme Corrêa de Coimbra, Roney Santos |
author2_role |
author author author |
dc.contributor.author.fl_str_mv |
Torrieri, Raul Oliveira, Francislon Silva de Oliveira, Guilherme Corrêa de Coimbra, Roney Santos |
dc.subject.en.pt_BR.fl_str_mv |
Gene ontologies Gene ontology annotations Gene ontology annotations Genomic databases Metabolic processes Ontologies Prokaryotic cells |
topic |
Gene ontologies Gene ontology annotations Gene ontology annotations Genomic databases Metabolic processes Ontologies Prokaryotic cells |
description |
Fundação Oswaldo Cruz. Centro de Pesquisas René Rachou. Centro de Excelência em Bioinformática. Belo Horizonte, MG, Brasil |
publishDate |
2012 |
dc.date.issued.fl_str_mv |
2012 |
dc.date.accessioned.fl_str_mv |
2014-07-04T14:29:41Z |
dc.date.available.fl_str_mv |
2014-07-04T14:29:41Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
TORRIERI, Raul et al. Automatic assignment of prokaryotic genes to functional categories using literature profiling. Plos One. 2012, vol.7, pp. e47436 |
dc.identifier.uri.fl_str_mv |
https://www.arca.fiocruz.br/handle/icict/7868 |
dc.identifier.issn.none.fl_str_mv |
1932-6203 |
dc.identifier.doi.none.fl_str_mv |
10.1371/journal.pone.0047436 |
identifier_str_mv |
TORRIERI, Raul et al. Automatic assignment of prokaryotic genes to functional categories using literature profiling. Plos One. 2012, vol.7, pp. e47436 1932-6203 10.1371/journal.pone.0047436 |
url |
https://www.arca.fiocruz.br/handle/icict/7868 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Public Library of Science |
publisher.none.fl_str_mv |
Public Library of Science |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da FIOCRUZ (ARCA) instname:Fundação Oswaldo Cruz (FIOCRUZ) instacron:FIOCRUZ |
instname_str |
Fundação Oswaldo Cruz (FIOCRUZ) |
instacron_str |
FIOCRUZ |
institution |
FIOCRUZ |
reponame_str |
Repositório Institucional da FIOCRUZ (ARCA) |
collection |
Repositório Institucional da FIOCRUZ (ARCA) |
bitstream.url.fl_str_mv |
https://www.arca.fiocruz.br/bitstream/icict/7868/1/Automatic%20assignment%20of%20prokaryotic%20genes%20to%20functional%20categories%20using%20literature%20profiling.pdf https://www.arca.fiocruz.br/bitstream/icict/7868/2/license.txt https://www.arca.fiocruz.br/bitstream/icict/7868/3/Automatic%20assignment%20of%20prokaryotic%20genes%20to%20functional%20categories%20using%20literature%20profiling.pdf.txt |
bitstream.checksum.fl_str_mv |
2cd0fca991716d620965797b9e8d55f4 7d48279ffeed55da8dfe2f8e81f3b81f 212b0306580d4f0044d18f9a3edcc832 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da FIOCRUZ (ARCA) - Fundação Oswaldo Cruz (FIOCRUZ) |
repository.mail.fl_str_mv |
repositorio.arca@fiocruz.br |
_version_ |
1813009300538261504 |
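
The abstract in this record reports precision, recall, and an F-measure with recall and precision equally weighted. As a minimal sketch of how those three figures relate, the snippet below computes the standard F-beta score from the rounded values quoted in the abstract. The function name `f_measure` and the variable names are illustrative assumptions, not taken from the paper. With the rounded inputs (0.84 and 0.68) the result is about 0.75, slightly below the reported 0.76; the gap may simply reflect rounding of the published precision and recall.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta = 1 weights precision and recall equally."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both inputs are zero: define the score as 0 instead of dividing by zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Rounded values quoted in the abstract (confidence >= 0.4); assumed exact here.
reported_precision = 0.84
reported_recall = 0.68

f1 = f_measure(reported_precision, reported_recall)
print(f"F1 from rounded precision and recall: {f1:.3f}")
```

Setting `beta` above 1 would weight recall more heavily and a value below 1 would favour precision; the abstract states that the two were weighted equally, which corresponds to `beta = 1`.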