Affinity estimation models of proteins for intelligent drug design based on pseudoconvolutions and nonlinear regressors
Main author: | Campos, Laila Barros |
---|---|
Publication date: | 2022 |
Other authors: | Ferreira, Janderson Romário Borges da Cruz; Santos, Wellington Pinheiro dos |
Document type: | Article |
Language: | eng |
Source title: | Research, Society and Development |
Full text: | https://rsdjournal.org/index.php/rsd/article/view/31222 |
Abstract: | Purpose: The emergence of new viruses and, consequently, new diseases makes the rapid and precise design of new drugs increasingly necessary. With the availability of large databases of proteins and affinity measurements, it is possible to build scoring functions for predicting molecular affinity. These functions are fundamental to intelligent drug design. Objective: In this work, we propose a scoring function to predict the affinity between two proteins. The method is based on extracting features by transfer learning from sequences represented by pseudo-convolutions. Method: The pseudo-convolutions organize the sequences into base neighborhood distributions. Each distribution is then represented as an image. The two proteins are thus transformed into two images, which are concatenated to form a third image. Through deep transfer learning, this resulting image is represented by a vector of attributes, whose dimensionality is reduced by Random Forest. Finally, the reduced attribute vector is fed to a regression learning machine that returns the degree of affinity between the two proteins. Results: We used the Affinity Benchmark Version 2 database; 145 complexes were used for model training and 35 for testing. The results showed performance equal to or better than state-of-the-art methods for evaluating protein affinity, considering the Pearson, Spearman, and Kendall correlation coefficients. The best results were 0.66, 0.70, and 0.52. Conclusion: The proposed method can characterize protein sequences so that the binding affinity between two proteins can be estimated without simulating the three-dimensional structure of the complex. |
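
The first steps of the method in the abstract (sequence, neighborhood distribution, image, concatenation) can be illustrated with a minimal sketch. The abstract does not define how the pseudo-convolutions are computed, so everything below is an assumption made for illustration only: the 20-letter amino acid alphabet, the `offset` parameter, and the helper names `neighborhood_image` and `concatenate_images` are hypothetical and not taken from the article. The later stages (deep transfer learning, Random Forest reduction, regression) are not shown.

```python
# Hypothetical sketch: one possible reading of a "neighborhood distribution".
# Assumption: each protein is a string over the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def neighborhood_image(sequence, offset=1):
    """Return a 20x20 matrix with the relative frequency of each residue pair
    (sequence[i], sequence[i + offset]); rows index the left residue."""
    size = len(AMINO_ACIDS)
    counts = [[0.0] * size for _ in range(size)]
    total = 0
    for left, right in zip(sequence, sequence[offset:]):
        if left in INDEX and right in INDEX:  # skip non-standard symbols
            counts[INDEX[left]][INDEX[right]] += 1.0
            total += 1
    if total:
        counts = [[value / total for value in row] for row in counts]
    return counts


def concatenate_images(image_a, image_b):
    """Place two images of equal height side by side (20 rows x 40 columns)."""
    return [row_a + row_b for row_a, row_b in zip(image_a, image_b)]


# Two made-up example sequences, not taken from any benchmark.
pair_image = concatenate_images(
    neighborhood_image("MKTAYIAKQR"),
    neighborhood_image("GSHMKTAY"),
)
print(len(pair_image), len(pair_image[0]))  # 20 40
```

In this reading, the concatenated matrix plays the role of the third image that would be passed to a pretrained network for feature extraction; the actual image construction used by the authors may differ.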
id |
UNIFEI_479d5f91cca3039153da20413e41ae8d |
---|---|
oai_identifier_str |
oai:ojs.pkp.sfu.ca:article/31222 |
network_acronym_str |
UNIFEI |
network_name_str |
Research, Society and Development |
repository_id_str |
|
dc.title.none.fl_str_mv |
Affinity estimation models of proteins for intelligent drug design based on pseudoconvolutions and nonlinear regressors / Modelos de estimación de afinidad de proteínas para el diseño inteligente de fármacos basados en pseudoconvoluciones y regresores no lineales / Modelos de estimativa de afinidade de proteínas para design inteligente de drogas com base em pseudoconvoluções e regressores não lineares |
title |
Affinity estimation models of proteins for intelligent drug design based on pseudoconvolutions and nonlinear regressors |
author |
Campos, Laila Barros |
author_facet |
Campos, Laila Barros; Ferreira, Janderson Romário Borges da Cruz; Santos, Wellington Pinheiro dos |
author_role |
author |
author2 |
Ferreira, Janderson Romário Borges da Cruz; Santos, Wellington Pinheiro dos |
author2_role |
author author |
dc.contributor.author.fl_str_mv |
Campos, Laila Barros; Ferreira, Janderson Romário Borges da Cruz; Santos, Wellington Pinheiro dos |
dc.subject.por.fl_str_mv |
Affinity Markers; Amino Acids, Peptides and Proteins; Artificial intelligence; Marcadores de afinidad; Aminoácidos, Péptidos y Proteínas; Inteligencia artificial; Marcadores de Afinidade; Aminoácidos, Peptídeos e Proteínas; Inteligência artificial |
topic |
Affinity Markers; Amino Acids, Peptides and Proteins; Artificial intelligence; Marcadores de afinidad; Aminoácidos, Péptidos y Proteínas; Inteligencia artificial; Marcadores de Afinidade; Aminoácidos, Peptídeos e Proteínas; Inteligência artificial |
description |
Purpose: The emergence of new viruses and, consequently, new diseases makes the rapid and precise design of new drugs increasingly necessary. With the availability of large databases of proteins and affinity measurements, it is possible to build scoring functions for predicting molecular affinity. These functions are fundamental to intelligent drug design. Objective: In this work, we propose a scoring function to predict the affinity between two proteins. The method is based on extracting features by transfer learning from sequences represented by pseudo-convolutions. Method: The pseudo-convolutions organize the sequences into base neighborhood distributions. Each distribution is then represented as an image. The two proteins are thus transformed into two images, which are concatenated to form a third image. Through deep transfer learning, this resulting image is represented by a vector of attributes, whose dimensionality is reduced by Random Forest. Finally, the reduced attribute vector is fed to a regression learning machine that returns the degree of affinity between the two proteins. Results: We used the Affinity Benchmark Version 2 database; 145 complexes were used for model training and 35 for testing. The results showed performance equal to or better than state-of-the-art methods for evaluating protein affinity, considering the Pearson, Spearman, and Kendall correlation coefficients. The best results were 0.66, 0.70, and 0.52. Conclusion: The proposed method can characterize protein sequences so that the binding affinity between two proteins can be estimated without simulating the three-dimensional structure of the complex. |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-06-24 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://rsdjournal.org/index.php/rsd/article/view/31222; 10.33448/rsd-v11i8.31222 |
url |
https://rsdjournal.org/index.php/rsd/article/view/31222 |
identifier_str_mv |
10.33448/rsd-v11i8.31222 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
https://rsdjournal.org/index.php/rsd/article/view/31222/26640 |
dc.rights.driver.fl_str_mv |
https://creativecommons.org/licenses/by/4.0; info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by/4.0 |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Research, Society and Development |
publisher.none.fl_str_mv |
Research, Society and Development |
dc.source.none.fl_str_mv |
Research, Society and Development; Vol. 11 No. 8; e40311831222 / Research, Society and Development; Vol. 11 Núm. 8; e40311831222 / Research, Society and Development; v. 11 n. 8; e40311831222 / 2525-3409 / reponame:Research, Society and Development / instname:Universidade Federal de Itajubá (UNIFEI) / instacron:UNIFEI |
instname_str |
Universidade Federal de Itajubá (UNIFEI) |
instacron_str |
UNIFEI |
institution |
UNIFEI |
reponame_str |
Research, Society and Development |
collection |
Research, Society and Development |
repository.name.fl_str_mv |
Research, Society and Development - Universidade Federal de Itajubá (UNIFEI) |
repository.mail.fl_str_mv |
rsd.articles@gmail.com |
_version_ |
1797052767927271424 |