Predicting model training time to optimize distributed machine learning applications
Main author: | Guimarães, Miguel |
---|---|
Publication date: | 2023 |
Other authors: | Carneiro, Davide; Palumbo, Guilherme; Oliveira, Filipe; Oliveira, Óscar; Alves, Victor; Novais, Paulo |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | https://hdl.handle.net/1822/85498 |
Abstract: | Despite major advances in recent years, the field of Machine Learning continues to face research and technical challenges. Mostly, these stem from big data and streaming data, which require models to be frequently updated or re-trained, at the expense of significant computational resources. One solution is the use of distributed learning algorithms, which can learn in a distributed manner, from distributed datasets. In this paper, we describe CEDEs—a distributed learning system in which models are heterogeneous distributed Ensembles, i.e., complex models constituted by different base models, trained with different and distributed subsets of data. Specifically, we address the issue of predicting the training time of a given model, given its characteristics and the characteristics of the data. Given that the creation of an Ensemble may imply the training of hundreds of base models, information about the predicted duration of each of these individual tasks is paramount for an efficient management of the cluster’s computational resources and for minimizing makespan, i.e., the time it takes to train the whole Ensemble. Results show that the proposed approach is able to predict the training time of Decision Trees with an average error of 0.103 s, and the training time of Neural Networks with an average error of 21.263 s. We also show how results depend significantly on the hyperparameters of the model and on the characteristics of the input data. |
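The approach the abstract describes — predicting a task's training time from the characteristics of the model and of the data — can be illustrated with a minimal meta-learning sketch. Everything below is an assumption for illustration only: the chosen meta-features (`n_rows`, `n_features`, `max_depth`), the random-forest meta-model, and the synthetic timing data are not the paper's actual setup.

```python
# Illustrative sketch of training-time prediction via meta-learning.
# Meta-features, regressor choice, and timing data are all hypothetical,
# not the CEDEs system's actual implementation.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic meta-dataset: each row describes a past training task as
# (n_rows, n_features, max_depth); the target is its observed training
# time in seconds (fabricated here with a simple cost model plus noise).
X_meta = rng.uniform([1e3, 5, 2], [1e6, 100, 20], size=(200, 3))
y_time = 1e-6 * X_meta[:, 0] * X_meta[:, 1] * np.log(X_meta[:, 2]) \
         + rng.normal(0.0, 0.5, 200)

# Meta-model: maps task characteristics to a predicted duration, so a
# scheduler could estimate each base model's cost before dispatching it.
meta_model = RandomForestRegressor(n_estimators=100, random_state=0)
meta_model.fit(X_meta, y_time)

# Predict the duration of a new, unseen training task.
new_task = np.array([[5e5, 40, 10]])  # 500k rows, 40 features, depth 10
predicted_seconds = meta_model.predict(new_task)[0]
print(f"predicted training time: {predicted_seconds:.2f} s")
```

In a cluster setting, such per-task estimates would feed the scheduler that assigns base-model training jobs to nodes so as to minimize the ensemble's overall makespan.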
id |
RCAP_7d84f981c877333a3f81a7b1cd485e51 |
---|---|
oai_identifier_str |
oai:repositorium.sdum.uminho.pt:1822/85498 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
spelling |
Title: Predicting model training time to optimize distributed machine learning applications |
Keywords: Meta-learning; Machine learning; Distributed learning; Training time; Optimization; Science & Technology |
Abstract: as given in the "description" field below |
Funding: This work has been supported by national funds through FCT – Fundação para a Ciência e Tecnologia through projects UIDB/04728/2020, EXPL/CCI-COM/0706/2021, and CPCA-IAC/AV/475278/2022 |
Publisher: Multidisciplinary Digital Publishing Institute |
Affiliation: Universidade do Minho |
Authors: Guimarães, Miguel; Carneiro, Davide; Palumbo, Guilherme; Oliveira, Filipe; Oliveira, Óscar; Alves, Victor; Novais, Paulo |
Dates: 2023-02-08; 2023-02-08T00:00:00Z |
Type: info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article |
Format: application/pdf |
URL: https://hdl.handle.net/1822/85498 |
Language: eng |
Citation: Guimarães, M.; Carneiro, D.; Palumbo, G.; Oliveira, F.; Oliveira, Ó.; Alves, V.; Novais, P. Predicting Model Training Time to Optimize Distributed Machine Learning Applications. Electronics 2023, 12, 871. https://doi.org/10.3390/electronics12040871 |
ISSN: 2079-9292; DOI: 10.3390/electronics12040871; https://www.mdpi.com/2079-9292/12/4/871 |
Access: info:eu-repo/semantics/openAccess |
Source: reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
Harvested: 2023-12-23T01:36:26Z; oai:repositorium.sdum.uminho.pt:1822/85498; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T19:47:10.571155; Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Predicting model training time to optimize distributed machine learning applications |
title |
Predicting model training time to optimize distributed machine learning applications |
spellingShingle |
Predicting model training time to optimize distributed machine learning applications; Guimarães, Miguel; Meta-learning; Machine learning; Distributed learning; Training time; Optimization; Science & Technology |
title_short |
Predicting model training time to optimize distributed machine learning applications |
title_full |
Predicting model training time to optimize distributed machine learning applications |
title_fullStr |
Predicting model training time to optimize distributed machine learning applications |
title_full_unstemmed |
Predicting model training time to optimize distributed machine learning applications |
title_sort |
Predicting model training time to optimize distributed machine learning applications |
author |
Guimarães, Miguel |
author_facet |
Guimarães, Miguel; Carneiro, Davide; Palumbo, Guilherme; Oliveira, Filipe; Oliveira, Óscar; Alves, Victor; Novais, Paulo |
author_role |
author |
author2 |
Carneiro, Davide; Palumbo, Guilherme; Oliveira, Filipe; Oliveira, Óscar; Alves, Victor; Novais, Paulo |
author2_role |
author; author; author; author; author; author |
dc.contributor.none.fl_str_mv |
Universidade do Minho |
dc.contributor.author.fl_str_mv |
Guimarães, Miguel; Carneiro, Davide; Palumbo, Guilherme; Oliveira, Filipe; Oliveira, Óscar; Alves, Victor; Novais, Paulo |
dc.subject.por.fl_str_mv |
Meta-learning; Machine learning; Distributed learning; Training time; Optimization; Science & Technology |
topic |
Meta-learning; Machine learning; Distributed learning; Training time; Optimization; Science & Technology |
description |
Despite major advances in recent years, the field of Machine Learning continues to face research and technical challenges. Mostly, these stem from big data and streaming data, which require models to be frequently updated or re-trained, at the expense of significant computational resources. One solution is the use of distributed learning algorithms, which can learn in a distributed manner, from distributed datasets. In this paper, we describe CEDEs—a distributed learning system in which models are heterogeneous distributed Ensembles, i.e., complex models constituted by different base models, trained with different and distributed subsets of data. Specifically, we address the issue of predicting the training time of a given model, given its characteristics and the characteristics of the data. Given that the creation of an Ensemble may imply the training of hundreds of base models, information about the predicted duration of each of these individual tasks is paramount for an efficient management of the cluster’s computational resources and for minimizing makespan, i.e., the time it takes to train the whole Ensemble. Results show that the proposed approach is able to predict the training time of Decision Trees with an average error of 0.103 s, and the training time of Neural Networks with an average error of 21.263 s. We also show how results depend significantly on the hyperparameters of the model and on the characteristics of the input data. |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-02-08; 2023-02-08T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/1822/85498 |
url |
https://hdl.handle.net/1822/85498 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Guimarães, M.; Carneiro, D.; Palumbo, G.; Oliveira, F.; Oliveira, Ó.; Alves, V.; Novais, P. Predicting Model Training Time to Optimize Distributed Machine Learning Applications. Electronics 2023, 12, 871. https://doi.org/10.3390/electronics12040871 |
2079-9292 |
10.3390/electronics12040871 |
https://www.mdpi.com/2079-9292/12/4/871 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Multidisciplinary Digital Publishing Institute |
publisher.none.fl_str_mv |
Multidisciplinary Digital Publishing Institute |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799133044560363520 |