Affinity estimation models of proteins for intelligent drug design based on pseudoconvolutions and nonlinear regressors

Bibliographic details
Main author: Campos, Laila Barros
Publication date: 2022
Other authors: Ferreira, Janderson Romário Borges da Cruz; Santos, Wellington Pinheiro dos
Document type: Article
Language: eng
Source title: Research, Society and Development
Full text: https://rsdjournal.org/index.php/rsd/article/view/31222
Abstract: Purpose: The emergence of new viruses and, consequently, new diseases makes the rapid and precise design of new drugs increasingly necessary. With the availability of large databases of proteins and affinity measures, it is possible to build scoring functions for predicting molecular affinity. These functions are fundamental to intelligent drug design. Objective: In this work, we propose a scoring function to predict the affinity between two proteins. The method extracts features by transfer learning from sequences represented by pseudo-convolutions. Method: The pseudo-convolutions organize the sequences into base neighborhood distributions. Each distribution is then represented by an image. The two proteins are thus transformed into two images, which are concatenated to form a third image. Through deep transfer learning, this resulting image is represented by a vector of attributes, whose dimensionality is reduced by Random Forest. Finally, the reduced attribute vector is fed to a regression learning machine that returns the degree of affinity between the two proteins. Results: We used the Affinity Benchmark Version 2 database, with 145 complexes for model training and 35 for testing. The results showed performance equal to or better than state-of-the-art methods for evaluating protein affinity, considering the Pearson, Spearman and Kendall correlation coefficients. The best results were 0.66, 0.70, and 0.52. Conclusion: The proposed method can characterize protein sequences so that the binding affinity between two proteins can be estimated without simulating the three-dimensional structure of the complex.
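The abstract evaluates the scoring function with the Pearson, Spearman and Kendall correlation coefficients between predicted and measured affinities. The following is a minimal sketch of how such coefficients can be computed in plain Python. It is not the authors' code: the function names and the toy affinity values are hypothetical, and Kendall is shown in its tau-a form (no tie correction), a variant the abstract does not specify.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return 2 * score / (n * (n - 1))


if __name__ == "__main__":
    # Hypothetical values for illustration only, not data from the article.
    measured = [7.1, 9.3, 5.4, 11.0, 8.2]
    predicted = [6.8, 8.9, 6.1, 10.2, 9.0]
    print("Pearson: ", round(pearson(measured, predicted), 3))
    print("Spearman:", round(spearman(measured, predicted), 3))
    print("Kendall: ", round(kendall_tau_a(measured, predicted), 3))
```

Pearson measures linear agreement, while Spearman and Kendall depend only on the ordering of the complexes, which is often what matters when a scoring function is used to rank candidate pairs.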
id UNIFEI_479d5f91cca3039153da20413e41ae8d
oai_identifier_str oai:ojs.pkp.sfu.ca:article/31222
network_acronym_str UNIFEI
network_name_str Research, Society and Development
repository_id_str
spelling Copyright (c) 2022 Laila Barros Campos; Janderson Romário Borges da Cruz Ferreira; Wellington Pinheiro dos Santos
2022-07-01T13:34:06Z
https://rsdjournal.org/index.php/rsd/index
https://rsdjournal.org/index.php/rsd/oai
opendoar:2024-01-17T09:47:39.047507
dc.title.none.fl_str_mv Affinity estimation models of proteins for intelligent drug design based on pseudoconvolutions and nonlinear regressors
Modelos de estimación de afinidad de proteínas para el diseño inteligente de fármacos basados en pseudoconvoluciones y regresores no lineales
Modelos de estimativa de afinidade de proteínas para design inteligente de drogas com base em pseudoconvoluções e regressores não lineares
dc.contributor.author.fl_str_mv Campos, Laila Barros
Ferreira, Janderson Romário Borges da Cruz
Santos, Wellington Pinheiro dos
dc.subject.por.fl_str_mv Affinity Markers
Amino Acids, Peptides and Proteins
Artificial intelligence.
Marcadores de afinidad
Aminoácidos, Péptidos y Proteínas
Inteligencia artificial.
Marcadores de Afinidade
Aminoácidos, Peptídeos e Proteínas
Inteligência artificial.
publishDate 2022
dc.date.none.fl_str_mv 2022-06-24
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://rsdjournal.org/index.php/rsd/article/view/31222
10.33448/rsd-v11i8.31222
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv https://rsdjournal.org/index.php/rsd/article/view/31222/26640
dc.rights.driver.fl_str_mv https://creativecommons.org/licenses/by/4.0
info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Research, Society and Development
dc.source.none.fl_str_mv Research, Society and Development; Vol. 11 No. 8; e40311831222
2525-3409
reponame:Research, Society and Development
instname:Universidade Federal de Itajubá (UNIFEI)
instacron:UNIFEI
repository.name.fl_str_mv Research, Society and Development - Universidade Federal de Itajubá (UNIFEI)
repository.mail.fl_str_mv rsd.articles@gmail.com
_version_ 1797052767927271424