Improving the drug discovery process by using multiple classifier systems
Main author: | Ruano-Ordás, D. |
---|---|
Publication date: | 2019 |
Other authors: | Yevseyeva, I.; Basto-Fernandes, V.; Méndez, J. R.; Emmerich, M. T. M. |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10071/17465 |
Abstract: | Machine learning methods have become an indispensable tool for exploiting large knowledge and data repositories in science and technology. In the pharmaceutical domain, the amount of acquired knowledge about the design and synthesis of pharmaceutical agents and bioactive molecules (drugs) is enormous. The primary challenge in automatically discovering new drugs from molecular screening information is the high dimensionality of the datasets, in which a wide range of features is recorded for each candidate drug. Improved techniques for manipulating and interpreting the data are therefore required. To mitigate this problem, our tool (called D2-MCS) splits the dataset homogeneously into several groups (subsets of features) and then determines the most suitable classifier for each group. Finally, the tool determines the biological activity of each molecule through a voting scheme. D2-MCS was tested on a standardized, high-quality dataset gathered from ChEMBL and outperformed well-known single classification models. |
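
The three steps named in the abstract (split the features into groups, pick the best classifier per group, combine by voting) can be illustrated with a minimal sketch. This is not the D2-MCS implementation: the round-robin feature split stands in for the feature clustering used by the tool, the two toy classifiers, the function names, and the data are all hypothetical, and selection is done by plain validation accuracy.

```python
from collections import Counter
from statistics import mean


def split_features(n_features, n_groups):
    """Round-robin split of feature indices (an assumed stand-in for clustering)."""
    return [list(range(g, n_features, n_groups)) for g in range(n_groups)]


def project(X, cols):
    """Keep only the columns in `cols` for every row of X."""
    return [[row[c] for c in cols] for row in X]


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


class NearestCentroid:
    """Toy classifier: predict the label whose class mean is closest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda l: sq_dist(x, self.centroids[l]))
                for x in X]


class OneNearestNeighbour:
    """Toy classifier: predict the label of the closest training row."""

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        return [self.y[min(range(len(self.X)), key=lambda i: sq_dist(x, self.X[i]))]
                for x in X]


def accuracy(pred, y):
    return sum(p == t for p, t in zip(pred, y)) / len(y)


def fit_mcs(X_train, y_train, X_val, y_val, n_groups, candidates):
    """For each feature group, keep the candidate with the best validation accuracy."""
    ensemble = []
    for cols in split_features(len(X_train[0]), n_groups):
        fitted = [cls().fit(project(X_train, cols), y_train) for cls in candidates]
        best = max(fitted,
                   key=lambda m: accuracy(m.predict(project(X_val, cols)), y_val))
        ensemble.append((cols, best))
    return ensemble


def predict_mcs(ensemble, X):
    """Majority vote over the per-group predictions (ties go to the first label seen)."""
    votes = [model.predict(project(X, cols)) for cols, model in ensemble]
    return [Counter(v).most_common(1)[0][0] for v in zip(*votes)]


# Made-up data: six descriptors per molecule, two activity classes.
X_train = [[0.9, 0.8, 0.7, 0.9, 0.8, 0.9], [0.8, 0.9, 0.9, 0.7, 0.9, 0.8],
           [0.1, 0.2, 0.1, 0.3, 0.2, 0.1], [0.2, 0.1, 0.3, 0.1, 0.1, 0.2]]
y_train = ["active", "active", "inactive", "inactive"]
X_val = [[0.85, 0.9, 0.8, 0.8, 0.7, 0.9], [0.15, 0.1, 0.2, 0.2, 0.3, 0.1]]
y_val = ["active", "inactive"]

ensemble = fit_mcs(X_train, y_train, X_val, y_val, n_groups=3,
                   candidates=[NearestCentroid, OneNearestNeighbour])
predictions = predict_mcs(ensemble, [[0.7, 0.9, 0.8, 0.9, 0.8, 0.7],
                                     [0.3, 0.1, 0.2, 0.1, 0.2, 0.3]])
print(predictions)
```

Using an odd number of groups avoids tied votes for a two-class problem; a real pipeline would likely weight votes or use cross-validation rather than a single validation split.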
id |
RCAP_4e6a7621363ee7463292a4eeef331db2 |
---|---|
oai_identifier_str |
oai:repositorio.iscte-iul.pt:10071/17465 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Improving the drug discovery process by using multiple classifier systems |
title |
Improving the drug discovery process by using multiple classifier systems |
author |
Ruano-Ordás, D. |
author_facet |
Ruano-Ordás, D.; Yevseyeva, I.; Basto-Fernandes, V.; Méndez, J. R.; Emmerich, M. T. M. |
author_role |
author |
author2 |
Yevseyeva, I.; Basto-Fernandes, V.; Méndez, J. R.; Emmerich, M. T. M. |
author2_role |
author author author author |
dc.contributor.author.fl_str_mv |
Ruano-Ordás, D.; Yevseyeva, I.; Basto-Fernandes, V.; Méndez, J. R.; Emmerich, M. T. M. |
dc.subject.por.fl_str_mv |
Drug discovery; Machine learning algorithms; Feature clustering; Multiple classifier systems |
topic |
Drug discovery; Machine learning algorithms; Feature clustering; Multiple classifier systems |
publishDate |
2019 |
dc.date.none.fl_str_mv |
2019-02-28T16:35:38Z 2019-01-01T00:00:00Z 2019 2019-02-28T16:34:38+0000 2020-02-28T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10071/17465 |
url |
http://hdl.handle.net/10071/17465 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
ISSN 0957-4174; DOI 10.1016/j.eswa.2018.12.032 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Pergamon/Elsevier |
publisher.none.fl_str_mv |
Pergamon/Elsevier |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799134761283747840 |