Identifying 124 new anti-HIV drug candidates in a 37 billion-compound database: an integrated approach of machine learning (QSAR), molecular docking, and molecular dynamics simulation

Bibliographic details
Main author: Cobre, Alexandre de Fátima
Publication date: 2024
Other authors: Ara, Anderson; Alves, Alexessander Couto; Neto, Moisés Maia; Fachi, Mariana Millan; Beca, Laize Botas; Tonin, Fernanda; Pontarolo, Roberto
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10400.21/17630
Abstract: Recent data from the World Health Organization reveal that in 2023, 38.8 million people were living with HIV. Within this population, there were 1.5 million new cases and 650 thousand deaths attributed to the disease. This study employs an integrated approach involving QSAR-based machine learning models, molecular docking, and molecular dynamics simulations to identify potential compounds for inhibiting the bioactivity of the CC chemokine receptor type 5 (CCR5) protein, a key entry point for HIV. Using non-redundant experimental data from the ChEMBL database, 40 different machine learning algorithms were trained, and the top four models (XGBoost, Histogram Gradient Boosting, Light Gradient Boosted Machine, and Extra Trees Regression) were used to predict anti-HIV bioactivity for 37 billion compounds in the ZINC-22 database. The screening identified 124 new anti-HIV drug candidates, confirmed through molecular docking and molecular dynamics simulations. The study underscores the therapeutic potential of these compounds, paving the way for further in vitro and in vivo investigations. The convergence of machine learning and experimental findings presents a promising avenue for significant advancements in pharmaceutical research, particularly in the treatment of viral diseases such as HIV. To guarantee the reproducibility of the study, the Python code (Google Colab) and the associated database are available on GitHub: https://github.com/AlexandreCOBRE/code
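The model-selection step described in the abstract (train many regressors, keep the best four) can be illustrated with a minimal sketch. This is not the authors' code: the ranking metric (held-out R²), the function names `r2_score` and `top_k_models`, the model labels, and all numeric values are assumptions made for illustration only.

```python
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination (R^2) for paired observed/predicted values."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def top_k_models(y_true, predictions, k=4):
    """Rank models by held-out R^2 (best first) and return the k best names."""
    scores = {name: r2_score(y_true, y_pred) for name, y_pred in predictions.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical held-out bioactivity values and per-model predictions.
# The labels are placeholders, not the algorithms evaluated in the study.
y_holdout = [5.0, 6.0, 7.0, 8.0]
holdout_predictions = {
    "model_a": [5.1, 6.0, 6.9, 8.0],
    "model_b": [5.5, 6.5, 6.5, 7.5],
    "model_c": [6.5, 6.5, 6.5, 6.5],  # always predicts the mean
    "model_d": [5.0, 6.2, 7.0, 7.8],
    "model_e": [8.0, 7.0, 6.0, 5.0],  # anti-correlated predictions
}

best, scores = top_k_models(y_holdout, holdout_predictions, k=4)
print(best)
```

In a real workflow the predictions would come from fitted regressors evaluated on a held-out split or by cross-validation, and only the retained models would then score the screening library.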
id RCAP_016d23e08a4e6231f043a7891ff12679
oai_identifier_str oai:repositorio.ipl.pt:10400.21/17630
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Identifying 124 new anti-HIV drug candidates in a 37 billion-compound database: an integrated approach of machine learning (QSAR), molecular docking, and molecular dynamics simulation
dc.contributor.none.fl_str_mv RCIPL
dc.contributor.author.fl_str_mv Cobre, Alexandre de Fátima
Ara, Anderson
Alves, Alexessander Couto
Neto, Moisés Maia
Fachi, Mariana Millan
Beca, Laize Botas
Tonin, Fernanda
Pontarolo, Roberto
dc.subject.por.fl_str_mv HIV
CCR5
Drug discovery
Machine learning
Molecular docking
Molecular dynamics
publishDate 2024
dc.date.none.fl_str_mv 2024-07
2024-07-01T00:00:00Z
2026-08-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.21/17630
url http://hdl.handle.net/10400.21/17630
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Cobre AF, Ara A, Alves AC, Neto MM, Fachi MM, Tonin FS, et al. Identifying 124 new anti-HIV drug candidates in a 37 billion-compound database: an integrated approach of machine learning (QSAR), molecular docking, and molecular dynamics simulation. Chemometr Intell Lab Syst. 2024;250:105145.
10.1016/j.chemolab.2024.105145
dc.rights.driver.fl_str_mv info:eu-repo/semantics/embargoedAccess
eu_rights_str_mv embargoedAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Elsevier
publisher.none.fl_str_mv Elsevier
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv mluisa.alvim@gmail.com