Parallelization of the DIANA algorithm in openMP

Bibliographic details
Main author: Ribeiro, Hethini [UNESP]
Publication date: 2019
Other authors: Spolon, Roberta [UNESP], Manacero, Aleardo [UNESP], Lobato, Renata S. [UNESP]
Document type: Conference paper
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1007/978-981-13-5907-1_18
http://hdl.handle.net/11449/190153
Abstract: Global data production has been increasing by approximately 40% per year since the beginning of the last decade. These large datasets, also called Big Data, pose great challenges in many areas, particularly in the Machine Learning (ML) field. Although ML algorithms can extract useful information from these large data repositories, some are computationally expensive, such as AGNES and DIANA, which have O(n) and O(2^n) complexity, respectively. The main challenge is therefore to process large amounts of data in a realistic time frame. In this context, this paper proposes a parallelization of the DIANA algorithm using OpenMP. Initial tests on a database of 5000 elements showed a speedup of 5.2521. According to Gustafson’s law, larger databases are expected to yield larger speedups.
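The abstract does not show how the work is divided among threads. As an illustration only, and not the authors' implementation, the sketch below parallelizes one step of DIANA with OpenMP: finding the object with the largest average dissimilarity to the rest of the cluster, which seeds the splinter group. The function names, the row-major data layout, and the choice of Manhattan distance are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Manhattan distance between two points of dimension dim.
 * The dissimilarity measure is an assumption; the record does not name one. */
static double dissimilarity(const double *a, const double *b, size_t dim) {
    double sum = 0.0;
    for (size_t k = 0; k < dim; ++k) {
        double d = a[k] - b[k];
        sum += (d < 0.0) ? -d : d;
    }
    return sum;
}

/* Hypothetical helper: returns the index of the point whose average
 * dissimilarity to all other points is largest. points is an n x dim
 * row-major array. Ties resolve to the lowest index. */
int find_splinter_seed(const double *points, int n, size_t dim) {
    assert(n > 1);
    int best = 0;
    double best_avg = -1.0;

    #pragma omp parallel
    {
        /* Each thread tracks its own best candidate to avoid contention. */
        int local_best = 0;
        double local_avg = -1.0;

        #pragma omp for schedule(static)
        for (int i = 0; i < n; ++i) {
            double total = 0.0;
            for (int j = 0; j < n; ++j) {
                if (j != i) {
                    total += dissimilarity(points + (size_t)i * dim,
                                           points + (size_t)j * dim, dim);
                }
            }
            double avg = total / (double)(n - 1);
            if (avg > local_avg) {
                local_avg = avg;
                local_best = i;
            }
        }

        /* Merge the per-thread results one thread at a time. */
        #pragma omp critical
        {
            if (local_avg > best_avg ||
                (local_avg == best_avg && local_best < best)) {
                best_avg = local_avg;
                best = local_best;
            }
        }
    }
    return best;
}
```

With GCC this can be built with `-fopenmp`; without that flag the pragmas are ignored and the function runs serially with the same result. The outer loop is the natural place to parallelize because each row of dissimilarities is independent of the others.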
id UNSP_1c602d2fa4a6ff7a613dc1ac84f81afc
oai_identifier_str oai:repositorio.unesp.br:11449/190153
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
spelling Computer Department, Universidade Estadual Paulista “Júlio de Mesquita Filho” (UNESP); Department of Computer Science and Statistics, Universidade Estadual Paulista “Júlio de Mesquita Filho” (UNESP); 2024-04-23T16:11:19Z; http://repositorio.unesp.br/oai/request; opendoar:2946
dc.title.none.fl_str_mv Parallelization of the DIANA algorithm in openMP
title Parallelization of the DIANA algorithm in openMP
spellingShingle Parallelization of the DIANA algorithm in openMP
Ribeiro, Hethini [UNESP]
DIANA
Machine learning
OpenMP
Parallelization
title_short Parallelization of the DIANA algorithm in openMP
title_full Parallelization of the DIANA algorithm in openMP
title_fullStr Parallelization of the DIANA algorithm in openMP
title_full_unstemmed Parallelization of the DIANA algorithm in openMP
title_sort Parallelization of the DIANA algorithm in openMP
author Ribeiro, Hethini [UNESP]
author_facet Ribeiro, Hethini [UNESP]
Spolon, Roberta [UNESP]
Manacero, Aleardo [UNESP]
Lobato, Renata S. [UNESP]
author_role author
author2 Spolon, Roberta [UNESP]
Manacero, Aleardo [UNESP]
Lobato, Renata S. [UNESP]
author2_role author
author
author
dc.contributor.none.fl_str_mv Universidade Estadual Paulista (Unesp)
dc.contributor.author.fl_str_mv Ribeiro, Hethini [UNESP]
Spolon, Roberta [UNESP]
Manacero, Aleardo [UNESP]
Lobato, Renata S. [UNESP]
dc.subject.por.fl_str_mv DIANA
Machine learning
OpenMP
Parallelization
topic DIANA
Machine learning
OpenMP
Parallelization
description Global data production has been increasing by approximately 40% per year since the beginning of the last decade. These large datasets, also called Big Data, pose great challenges in many areas, particularly in the Machine Learning (ML) field. Although ML algorithms can extract useful information from these large data repositories, some are computationally expensive, such as AGNES and DIANA, which have O(n) and O(2^n) complexity, respectively. The main challenge is therefore to process large amounts of data in a realistic time frame. In this context, this paper proposes a parallelization of the DIANA algorithm using OpenMP. Initial tests on a database of 5000 elements showed a speedup of 5.2521. According to Gustafson’s law, larger databases are expected to yield larger speedups.
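The appeal to Gustafson’s law can be made concrete with its usual formulation. The symbols below are not taken from the record: N is the number of processors and s is the fraction of the parallel run time spent in serial code. The numeric values are hypothetical, since the record gives neither the thread count nor the serial fraction of the reported experiment.

```latex
% Gustafson's law: scaled speedup on N processors with serial fraction s
S(N) = s + (1 - s)\,N = N - s\,(N - 1)

% Hypothetical illustration with N = 8 and s = 0.1:
S(8) = 0.1 + 0.9 \times 8 = 7.3
```

If the serial part stays roughly fixed while the parallel part grows with the dataset, s shrinks and S(N) moves closer to N, which is the reasoning behind expecting a larger speedup on a larger database.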
publishDate 2019
dc.date.none.fl_str_mv 2019-10-06T17:04:01Z
2019-10-06T17:04:01Z
2019-01-01
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/conferenceObject
format conferenceObject
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1007/978-981-13-5907-1_18
Communications in Computer and Information Science, v. 931, p. 171-176.
1865-0929
http://hdl.handle.net/11449/190153
10.1007/978-981-13-5907-1_18
2-s2.0-85062294546
5568681374094860
0000-0001-8248-0826
url http://dx.doi.org/10.1007/978-981-13-5907-1_18
http://hdl.handle.net/11449/190153
identifier_str_mv Communications in Computer and Information Science, v. 931, p. 171-176.
1865-0929
10.1007/978-981-13-5907-1_18
2-s2.0-85062294546
5568681374094860
0000-0001-8248-0826
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Communications in Computer and Information Science
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 171-176
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
instname_str Universidade Estadual Paulista (UNESP)
instacron_str UNESP
institution UNESP
reponame_str Repositório Institucional da UNESP
collection Repositório Institucional da UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
_version_ 1799964822006136832