Identifying 124 new anti-HIV drug candidates in a 37 billion-compound database: an integrated approach of machine learning (QSAR), molecular docking, and molecular dynamics simulation
Main author: | Cobre, Alexandre de Fátima |
---|---|
Publication date: | 2024 |
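Other authors: | Ara, Anderson; Alves, Alexessander Couto; Neto, Moisés Maia; Fachi, Mariana Millan; Beca, Laize Botas; Tonin, Fernanda; Pontarolo, Roberto |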
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10400.21/17630 |
Abstract: | Recent data from the World Health Organization indicate that in 2023, 38.8 million people were living with HIV. Within this population, there were 1.5 million new cases and 650 thousand deaths attributed to the disease. This study employs an integrated approach involving QSAR-based machine learning models, molecular docking, and molecular dynamics simulations to identify potential compounds for inhibiting the bioactivity of the CC chemokine receptor type 5 (CCR5) protein, a key entry point for HIV. Using non-redundant experimental data from the ChEMBL database, 40 different machine learning algorithms were trained, and the top four models (XGBoost, Histogram Gradient Boosting, Light Gradient Boosted Machine, and Extra Trees Regression) were used to predict anti-HIV bioactivity for 37 billion compounds in the ZINC-22 database. The screening identified 124 new anti-HIV drug candidates, confirmed through molecular docking and dynamics simulations. The study underscores the therapeutic potential of these compounds, paving the way for further in vitro and in vivo investigations. The convergence of machine learning and experimental findings presents a promising avenue for significant advancements in pharmaceutical research, particularly in the treatment of viral diseases such as HIV. To guarantee the reproducibility of our study, we have made the Python code (Google Colab) and the associated database available on GitHub: https://github.com/AlexandreCOBRE/code |
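The screening logic summarised in the abstract (rank many trained models, keep the best four, then score a compound library with them) can be illustrated with a minimal sketch. This is not the authors' code: the function names `top_k_models` and `screen`, the validation scores, the stand-in predictors, the mean-based consensus rule, and the threshold are all hypothetical and chosen only to make the idea concrete.

```python
from statistics import mean
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# A predictor maps a compound identifier (here a SMILES-like string)
# to a predicted activity value. Real QSAR models would work on descriptors.
Predictor = Callable[[str], float]


def top_k_models(scores: Dict[str, float], k: int = 4) -> List[str]:
    """Return the names of the k models with the highest validation score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def screen(
    compounds: Iterable[str],
    models: Dict[str, Predictor],
    threshold: float,
) -> Iterator[Tuple[str, float]]:
    """Yield (compound, mean prediction) for compounds at or above the threshold.

    Averaging the selected models is an assumption of this sketch,
    not a rule stated in the abstract.
    """
    for compound in compounds:
        consensus = mean(model(compound) for model in models.values())
        if consensus >= threshold:
            yield compound, consensus


# Hypothetical validation scores; not taken from the paper.
scores = {
    "model_a": 0.71,
    "model_b": 0.64,
    "model_c": 0.80,
    "model_d": 0.58,
    "model_e": 0.77,
}
selected = top_k_models(scores, k=4)

# Stand-in predictors: a toy function of string length plus a per-model offset.
toy_models = {
    name: (lambda compound, bias=i: len(compound) * 0.5 + bias)
    for i, name in enumerate(selected)
}

hits = list(screen(["CCO", "CCCCCCCC", "C"], toy_models, threshold=5.0))
print(selected)
print(hits)
```

Because `screen` is a generator, compounds can be streamed in batches rather than held in memory, which matters for a library on the scale the abstract describes.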
id |
RCAP_016d23e08a4e6231f043a7891ff12679 |
---|---|
oai_identifier_str |
oai:repositorio.ipl.pt:10400.21/17630 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Identifying 124 new anti-HIV drug candidates in a 37 billion-compound database: an integrated approach of machine learning (QSAR), molecular docking, and molecular dynamics simulation |
author |
Cobre, Alexandre de Fátima |
author_facet |
Cobre, Alexandre de Fátima; Ara, Anderson; Alves, Alexessander Couto; Neto, Moisés Maia; Fachi, Mariana Millan; Beca, Laize Botas; Tonin, Fernanda; Pontarolo, Roberto |
author_role |
author |
author2 |
Ara, Anderson; Alves, Alexessander Couto; Neto, Moisés Maia; Fachi, Mariana Millan; Beca, Laize Botas; Tonin, Fernanda; Pontarolo, Roberto |
author2_role |
author author author author author author author |
dc.contributor.none.fl_str_mv |
RCIPL |
dc.contributor.author.fl_str_mv |
Cobre, Alexandre de Fátima; Ara, Anderson; Alves, Alexessander Couto; Neto, Moisés Maia; Fachi, Mariana Millan; Beca, Laize Botas; Tonin, Fernanda; Pontarolo, Roberto |
dc.subject.por.fl_str_mv |
HIV; CCR5; Drug discovery; Machine learning; Molecular docking; Molecular dynamics |
topic |
HIV; CCR5; Drug discovery; Machine learning; Molecular docking; Molecular dynamics |
publishDate |
2024 |
dc.date.none.fl_str_mv |
2024-07; 2024-07-01T00:00:00Z; 2026-08-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.21/17630 |
url |
http://hdl.handle.net/10400.21/17630 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Cobre AF, Ara A, Alves AC, Neto MM, Fachi MM, Tonin FS, et al. Identifying 124 new anti-HIV drug candidates in a 37 billion-compound database: an integrated approach of machine learning (QSAR), molecular docking, and molecular dynamics simulation. Chemometr Intell Lab Syst. 2024;250:105145. doi:10.1016/j.chemolab.2024.105145 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/embargoedAccess |
eu_rights_str_mv |
embargoedAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Elsevier |
publisher.none.fl_str_mv |
Elsevier |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
mluisa.alvim@gmail.com |
_version_ |
1817547160797839360 |