PepExplorer: A Similarity-driven Tool for Analyzing de Novo Sequencing Results

Bibliographic details
Main author: Leprevost, Felipe da Veiga
Publication date: 2014
Other authors: Valente, Richard Hemmi; Lima, Diogo Borges; Perales, Jonas; Melani, Rafael; Yates III, John R.; Barbosa, Valmir Carneiro; Junqueira, Magno; Carvalho, Paulo Costa
Document type: Article
Language: eng
Source title: Repositório Institucional da FIOCRUZ (ARCA)
Full text: https://www.arca.fiocruz.br/handle/icict/8941
Abstract: 2015-08-31
id CRUZ_b4d8a017e7cbdd3d1fb2dba756c6073c
oai_identifier_str oai:www.arca.fiocruz.br:icict/8941
network_acronym_str CRUZ
network_name_str Repositório Institucional da FIOCRUZ (ARCA)
repository_id_str 2135
spelling Leprevost, Felipe da Veiga
Valente, Richard Hemmi
Lima, Diogo Borges
Perales, Jonas
Melani, Rafael
Yates III, John R.
Barbosa, Valmir Carneiro
Junqueira, Magno
Carvalho, Paulo Costa
2014-11-24T11:56:13Z
2015-09-01T07:30:06Z
2014
LEPREVOST, Felipe da Veiga et al. PepExplorer: a similarity-driven tool for analyzing de novo sequencing results. Molecular & Cellular Proteomics, v. 13, p. 2480-2489, 2014.
1535-9476
https://www.arca.fiocruz.br/handle/icict/8941
10.1074/mcp.M113.037002
eng
Molecular & Cellular Proteomics
PepExplorer: A Similarity-driven Tool for Analyzing de Novo Sequencing Results
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/article
2015-08-31
Fundação Oswaldo Cruz. Instituto Carlos Chagas. Laboratório de Proteômica e Engenharia de Proteínas. Curitiba, PR, Brasil.
Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. Laboratório de Toxicologia. Rio de Janeiro, RJ, Brasil.
Instituto Nacional de Ciência e Tecnologia em Toxinas (INCTTox / CNPq), Brasil.
Universidade Federal do Rio de Janeiro. Unidade de Proteômica. Rede Proteômica Rio de Janeiro. Departamento de Bioquímica. Rio de Janeiro, RJ, Brasil.
Instituto de Pesquisa Scripps. Departamento de Química Fisiologia. La Jolla, Califórnia.
Universidade Federal do Rio de Janeiro. Sistemas Engenharia e Programa de Ciência da Computação. Rio de Janeiro, RJ, Brasil.
Peptide spectrum matching is the current gold standard for protein identification via mass-spectrometry-based proteomics. Peptide spectrum matching compares experimental mass spectra against theoretical spectra generated from a protein sequence database to perform identification, but protein sequences not present in a database cannot be identified unless their sequences are in part conserved. The alternative approach, de novo sequencing, can make it possible to infer a peptide sequence directly from a mass spectrum, but interpreting long lists of peptide sequences resulting from large-scale experiments is not trivial.
With this as motivation, PepExplorer was developed to use rigorous pattern recognition to assemble a list of homologue proteins using de novo sequencing data coupled to sequence alignment to allow biological interpretation of the data. PepExplorer can read the output of various widely adopted de novo sequencing tools and converge to a list of proteins with a global false-discovery rate. To this end, it employs a radial basis function neural network that considers precursor charge states, de novo sequencing scores, peptide lengths, and alignment scores to select similar protein candidates, from a target-decoy database, usually obtained from phylogenetically related species. Alignments are performed using a modified Smith–Waterman algorithm tailored for the task at hand. We verified the effectiveness of our approach using a reference set of identifications generated by ProLuCID when searching for Pyrococcus furiosus mass spectra on the corresponding NCBI RefSeq database. We then modified the sequence database by swapping amino acids until ProLuCID was no longer capable of identifying any proteins. By searching the mass spectra using PepExplorer on the modified database, we were able to recover most of the identifications at a 1% false-discovery rate. Finally, we employed PepExplorer to disclose a comprehensive proteomic assessment of the Bothrops jararaca plasma, a known biological source of natural inhibitors of snake toxins.
PepExplorer is integrated into the PatternLab for Proteomics environment, which makes available various tools for downstream data analysis, including resources for quantitative and differential proteomics.
PepExplorer
Proteomics
Peptide
info:eu-repo/semantics/openAccess
reponame:Repositório Institucional da FIOCRUZ (ARCA)
instname:Fundação Oswaldo Cruz (FIOCRUZ)
instacron:FIOCRUZ
ORIGINAL artigo2.pdf artigo2.pdf application/pdf 2027015 https://www.arca.fiocruz.br/bitstream/icict/8941/2/artigo2.pdf a17387f838ee6ae2a47cf30e17f3b9ab MD5 2
LICENSE license.txt license.txt text/plain; charset=utf-8 1914 https://www.arca.fiocruz.br/bitstream/icict/8941/3/license.txt 7d48279ffeed55da8dfe2f8e81f3b81f MD5 3
TEXT artigo2.pdf.txt artigo2.pdf.txt Extracted text text/plain 55932 https://www.arca.fiocruz.br/bitstream/icict/8941/4/artigo2.pdf.txt 0b353ecd5be67ac5ae87ef11f5e10b24 MD5 4
icict/8941
2021-03-24 16:32:18.214
oai:www.arca.fiocruz.br:icict/8941
Repositório Institucional
PUB
https://www.arca.fiocruz.br/oai/request
repositorio.arca@fiocruz.br
opendoar:2135
2021-03-24T19:32:18
Repositório Institucional da FIOCRUZ (ARCA) - Fundação Oswaldo Cruz (FIOCRUZ)
false
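The abstract in this record states that PepExplorer aligns de novo peptide sequences to database proteins with a modified Smith–Waterman algorithm. The record does not describe the modifications, so the sketch below shows only the textbook local-alignment recurrence with a linear gap penalty. The function name `smith_waterman` and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are illustrative assumptions, not PepExplorer's actual parameters or substitution scheme.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Textbook Smith-Waterman local alignment (assumed scoring values).

    Returns (best_score, aligned_a, aligned_b). This is a generic sketch,
    not the modified variant used by PepExplorer.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best local alignment score ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment
                H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Hypothetical de novo peptide against a hypothetical protein fragment.
    print(smith_waterman("PEPTIDE", "XXPEPTIDEYY"))
```

A local alignment suits this setting because a short de novo peptide only needs to match a region of a longer homologous protein; flanking residues contribute nothing to the score.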
dc.title.pt_BR.fl_str_mv PepExplorer: A Similarity-driven Tool for Analyzing de Novo Sequencing Results
title PepExplorer: A Similarity-driven Tool for Analyzing de Novo Sequencing Results
spellingShingle PepExplorer: A Similarity-driven Tool for Analyzing de Novo Sequencing Results
Leprevost, Felipe da Veiga
PepExplorer
Proteomics
Peptide
title_short PepExplorer: A Similarity-driven Tool for Analyzing de Novo Sequencing Results
title_full PepExplorer: A Similarity-driven Tool for Analyzing de Novo Sequencing Results
title_fullStr PepExplorer: A Similarity-driven Tool for Analyzing de Novo Sequencing Results
title_full_unstemmed PepExplorer: A Similarity-driven Tool for Analyzing de Novo Sequencing Results
title_sort PepExplorer: A Similarity-driven Tool for Analyzing de Novo Sequencing Results
author Leprevost, Felipe da Veiga
author_facet Leprevost, Felipe da Veiga
Valente, Richard Hemmi
Lima, Diogo Borges
Perales, Jonas
Melani, Rafael
Yates III, John R.
Barbosa, Valmir Carneiro
Junqueira, Magno
Carvalho, Paulo Costa
author_role author
author2 Valente, Richard Hemmi
Lima, Diogo Borges
Perales, Jonas
Melani, Rafael
Yates III, John R.
Barbosa, Valmir Carneiro
Junqueira, Magno
Carvalho, Paulo Costa
author2_role author
author
author
author
author
author
author
author
dc.contributor.author.fl_str_mv Leprevost, Felipe da Veiga
Valente, Richard Hemmi
Lima, Diogo Borges
Perales, Jonas
Melani, Rafael
Yates III, John R.
Barbosa, Valmir Carneiro
Junqueira, Magno
Carvalho, Paulo Costa
dc.subject.en.pt_BR.fl_str_mv PepExplorer
Proteomics
Peptide
topic PepExplorer
Proteomics
Peptide
description 2015-08-31
publishDate 2014
dc.date.accessioned.fl_str_mv 2014-11-24T11:56:13Z
dc.date.issued.fl_str_mv 2014
dc.date.available.fl_str_mv 2015-09-01T07:30:06Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.citation.fl_str_mv LEPREVOST, Felipe da Veiga et al. PepExplorer: a similarity-driven tool for analyzing de novo sequencing results. Molecular & Cellular Proteomics, v. 13, p. 2480-2489, 2014. 
dc.identifier.uri.fl_str_mv https://www.arca.fiocruz.br/handle/icict/8941
dc.identifier.issn.none.fl_str_mv 1535-9476
dc.identifier.doi.pt_BR.fl_str_mv 10.1074/mcp.M113.037002
identifier_str_mv LEPREVOST, Felipe da Veiga et al. PepExplorer: a similarity-driven tool for analyzing de novo sequencing results. Molecular & Cellular Proteomics, v. 13, p. 2480-2489, 2014. 
1535-9476
10.1074/mcp.M113.037002
url https://www.arca.fiocruz.br/handle/icict/8941
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Molecular & Cellular Proteomics
publisher.none.fl_str_mv Molecular & Cellular Proteomics
dc.source.none.fl_str_mv reponame:Repositório Institucional da FIOCRUZ (ARCA)
instname:Fundação Oswaldo Cruz (FIOCRUZ)
instacron:FIOCRUZ
instname_str Fundação Oswaldo Cruz (FIOCRUZ)
instacron_str FIOCRUZ
institution FIOCRUZ
reponame_str Repositório Institucional da FIOCRUZ (ARCA)
collection Repositório Institucional da FIOCRUZ (ARCA)
bitstream.url.fl_str_mv https://www.arca.fiocruz.br/bitstream/icict/8941/2/artigo2.pdf
https://www.arca.fiocruz.br/bitstream/icict/8941/3/license.txt
https://www.arca.fiocruz.br/bitstream/icict/8941/4/artigo2.pdf.txt
bitstream.checksum.fl_str_mv a17387f838ee6ae2a47cf30e17f3b9ab
7d48279ffeed55da8dfe2f8e81f3b81f
0b353ecd5be67ac5ae87ef11f5e10b24
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da FIOCRUZ (ARCA) - Fundação Oswaldo Cruz (FIOCRUZ)
repository.mail.fl_str_mv repositorio.arca@fiocruz.br
_version_ 1813009219582951424