The Development of a Universal In Silico Predictor of Protein-Protein Interactions

Bibliographic details
Main author: Valente, Guilherme T. [UNESP]
Publication date: 2013
Other authors: Acencio, Marcio L. [UNESP], Martins, Cesar [UNESP], Lemke, Ney [UNESP]
Document type: Article
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1371/journal.pone.0065587
http://hdl.handle.net/11449/75468
Abstract: Protein-protein interactions (PPIs) are essential for understanding the function of biological systems and have been characterized using a vast array of experimental techniques. These techniques detect only a small proportion of all PPIs and are labor intensive and time consuming. Therefore, the development of computational methods capable of predicting PPIs accelerates the pace of discovery of new interactions. This paper reports a machine learning-based prediction model, the Universal In Silico Predictor of Protein-Protein Interactions (UNISPPI), which is a decision tree model that can reliably predict PPIs for all species (including proteins from parasite-host associations) using only 20 combinations of amino acid frequencies from interacting and non-interacting proteins as learning features. UNISPPI was able to correctly classify 79.4% and 72.6% of experimentally supported interactions and non-interacting protein pairs, respectively, from an independent test set. Moreover, UNISPPI suggests that the frequencies of the amino acids asparagine, cysteine and isoleucine are important features for distinguishing between interacting and non-interacting protein pairs. We envisage that UNISPPI can be a useful tool for prioritizing interactions for experimental validation. © 2013 Valente et al.
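The abstract says the model learns from 20 combinations of amino acid frequencies but does not say how the two proteins of a pair are combined. The following is a minimal sketch of that kind of feature extraction, not the authors' procedure: the function names `aa_frequencies` and `pair_features` are hypothetical, and the per-residue product used to merge the two proteins is an assumption chosen only because it is symmetric in the pair.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_frequencies(sequence: str) -> dict[str, float]:
    """Return the relative frequency of each standard amino acid in a sequence."""
    # Non-standard symbols (X, gaps, whitespace) are ignored.
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def pair_features(seq_a: str, seq_b: str) -> list[float]:
    """Build one 20-value feature vector for a protein pair.

    ASSUMPTION: the two frequency profiles are merged by a per-residue
    product. The abstract does not state the combination rule.
    """
    freq_a = aa_frequencies(seq_a)
    freq_b = aa_frequencies(seq_b)
    return [freq_a[aa] * freq_b[aa] for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    features = pair_features("MNCIKN", "ACNNIIC")
    print(len(features))
```

Vectors built this way, labeled as interacting or non-interacting, could then be passed to any decision tree learner. The order of the two sequences does not affect the result, which matters because a protein pair has no natural ordering.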
id UNSP_6af7f4033382722a8eeb30ce2fbd398f
oai_identifier_str oai:repositorio.unesp.br:11449/75468
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
Affiliations: Department of Morphology, Universidade Estadual Paulista (UNESP), Botucatu, Sao Paulo; Department of Physics and Biophysics, Universidade Estadual Paulista (UNESP), Botucatu, Sao Paulo
dc.subject.por.fl_str_mv amino acid
asparagine
cysteine
isoleucine
amino acid sequence
classification
decision tree
machine learning
prediction
protein protein interaction
statistical analysis
statistical model
universal in silico predictor of protein protein interaction
publishDate 2013
dc.date.none.fl_str_mv 2013-05-31
2014-05-27T11:29:33Z
2014-05-27T11:29:33Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1371/journal.pone.0065587
PLoS ONE, v. 8, n. 5, 2013.
1932-6203
http://hdl.handle.net/11449/75468
10.1371/journal.pone.0065587
WOS:000319799900212
2-s2.0-84878583033
2-s2.0-84878583033.pdf
8858800699425352
7977035910952141
0000-0003-3534-974X
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv PLOS ONE
2.766
1,164
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP