An efficient parallel implementation for training supervised optimum-path forest classifiers

Bibliographic details
Main author: Culquicondor, Aldo
Publication date: 2020
Other authors: Baldassin, Alexandro [UNESP], Castelo-Fernández, Cesar, de Carvalho, João P.L., Papa, João Paulo [UNESP]
Document type: Article
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1016/j.neucom.2018.10.115
http://hdl.handle.net/11449/201723
Abstract: In this work, we propose and analyze parallel training algorithms for the Optimum-Path Forest (OPF) classifier. We start with a naïve parallelization approach that follows traditional sequential training of the supervised OPF, in which a priority queue stores the best samples at each learning iteration. The proposed approach replaces the priority queue with an array and a linear search, with the aim of using a parallel-friendly data structure. We show that this approach leads to less competition among threads and thus yields better temporal and spatial locality. Additionally, we show how the use of vectorization in distance calculations affects the overall speedup and indicate the situations in which one can benefit from it. The experiments are carried out on five public datasets with different numbers of samples and features, on architectures with distinct levels of parallelism. On average, the proposed approach provides speedups of up to 11.8× and 26× on a 24-core Intel and a 64-core AMD processor, respectively.
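The data-structure change described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a sequential Python outline that assumes the prototype samples are already known, uses Euclidean distance, and uses the maximum arc weight along a path as the path cost. The names `euclidean`, `opf_train_linear_search`, `samples`, `labels`, and `prototypes` are hypothetical. The point is only that a flat cost array scanned linearly takes the place of a priority queue; in a parallel version, the scan and the update loop are the parts one would expect to split across threads.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def opf_train_linear_search(samples, labels, prototypes):
    """Propagate optimum paths from the prototypes using a flat cost array.

    Assumption: `prototypes` holds indices of samples chosen beforehand;
    how they are selected is outside the scope of this sketch.
    Returns (cost, pred, out_labels), one entry per sample.
    """
    n = len(samples)
    cost = [math.inf] * n      # best path cost found so far
    pred = [-1] * n            # predecessor in the forest (-1 for roots)
    out_labels = list(labels)  # labels propagated from the roots
    done = [False] * n         # True once a sample has been processed

    for p in prototypes:
        cost[p] = 0.0

    for _ in range(n):
        # Linear search over the array replaces the priority-queue pop.
        s = -1
        for i in range(n):
            if not done[i] and (s == -1 or cost[i] < cost[s]):
                s = i
        done[s] = True

        # Offer every unprocessed sample a path that passes through s.
        for t in range(n):
            if done[t]:
                continue
            new_cost = max(cost[s], euclidean(samples[s], samples[t]))
            if new_cost < cost[t]:
                cost[t] = new_cost
                pred[t] = s
                out_labels[t] = out_labels[s]

    return cost, pred, out_labels


if __name__ == "__main__":
    samples = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
    labels = [0, 0, 1, 1]
    print(opf_train_linear_search(samples, labels, prototypes=[1, 2]))
```

Each outer iteration costs O(n) for the scan plus O(n) distance evaluations, so the array adds no asymptotic overhead on a complete graph, while both inner loops touch contiguous memory and have independent iterations.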
id UNSP_00fc88c1f4084f71affb401bfcd00d16
oai_identifier_str oai:repositorio.unesp.br:11449/201723
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
funding Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP): #2013/07375-0, #2014/12236-1, #2014/16250-9, #2016/15337-9, #2016/19403-6, #2017/03940-5
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq): #306166/2014-3, #307066/2017-7
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES): 2966/2014
dc.contributor.none.fl_str_mv Universidad Católica San Pablo
Universidade Estadual Paulista (Unesp)
Universidade Estadual de Campinas (UNICAMP)
dc.contributor.author.fl_str_mv Culquicondor, Aldo
Baldassin, Alexandro [UNESP]
Castelo-Fernández, Cesar
de Carvalho, João P.L.
Papa, João Paulo [UNESP]
dc.subject.por.fl_str_mv Graph algorithms
Optimum-path forest
Parallel algorithms
publishDate 2020
dc.date.none.fl_str_mv 2020-12-12T02:40:07Z
2020-12-12T02:40:07Z
2020-06-14
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1016/j.neucom.2018.10.115
Neurocomputing, v. 393, p. 259-268.
1872-8286
0925-2312
http://hdl.handle.net/11449/201723
10.1016/j.neucom.2018.10.115
2-s2.0-85084114844
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Neurocomputing
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 259-268
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP