Parallelization of the DIANA algorithm in openMP
Main author: | Ribeiro, Hethini [UNESP] |
---|---|
Publication date: | 2019 |
Other authors: | Spolon, Roberta [UNESP]; Manacero, Aleardo [UNESP]; Lobato, Renata S. [UNESP] |
Document type: | Conference paper |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1007/978-981-13-5907-1_18 http://hdl.handle.net/11449/190153 |
Abstract: | Global data production has been increasing by approximately 40% per year since the beginning of the last decade. These large datasets, also called Big Data, pose great challenges in many areas, particularly in the Machine Learning (ML) field. Although ML algorithms can extract useful information from these large data repositories, some are computationally expensive, such as AGNES and DIANA, which have O(n) and O(2^n) complexity, respectively. The main challenge is therefore to process large amounts of data in a realistic time frame. In this context, this paper proposes a parallelization of the DIANA algorithm using OpenMP. Initial tests on a database of 5000 elements showed a speedup of 5.2521. According to Gustafson’s law, the speedup is expected to be larger for a larger database. |
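
The abstract does not say which part of DIANA was parallelized, so the following is only a minimal sketch of one plausible target: the search for the object with the largest average dissimilarity to the rest of a cluster, which DIANA uses to start a splinter group. The function name `diana_splinter_seed`, the row-major matrix layout, and the assumption of non-negative dissimilarities are illustrative choices, not taken from the paper.

```c
#include <stddef.h>

/* Hypothetical helper (not from the paper): returns the index of the object
 * with the largest average dissimilarity to the other n - 1 objects.
 * d is an n x n row-major dissimilarity matrix, assumed non-negative.
 * Returns -1 when n < 2, since no split is possible. */
int diana_splinter_seed(const double *d, int n)
{
    int best = -1;
    double best_avg = -1.0;

    if (n < 2)
        return -1;

    #pragma omp parallel
    {
        int local_best = -1;
        double local_avg = -1.0;

        /* Rows are independent, so each thread scans its own share. */
        #pragma omp for schedule(static) nowait
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++)
                if (j != i)
                    sum += d[(size_t)i * n + j];
            double avg = sum / (n - 1);
            if (avg > local_avg) {
                local_avg = avg;
                local_best = i;
            }
        }

        /* Merge per-thread maxima; ties go to the lower index so the
         * result does not depend on thread scheduling. */
        #pragma omp critical
        {
            if (local_best >= 0 &&
                (local_avg > best_avg ||
                 (local_avg == best_avg && local_best < best))) {
                best_avg = local_avg;
                best = local_best;
            }
        }
    }
    return best;
}
```

Compiled with an OpenMP-enabled compiler (for example `gcc -fopenmp`), the outer loop is shared among threads; without OpenMP the pragmas are ignored and the function runs serially with the same result.
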
id |
UNSP_1c602d2fa4a6ff7a613dc1ac84f81afc |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/190153 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
Parallelization of the DIANA algorithm in openMP; DIANA; Machine learning; OpenMP; Parallelization; Computer Department, Universidade Estadual Paulista “Júlio de Mesquita Filho” (UNESP); Department of Computer Science and Statistics, Universidade Estadual Paulista “Júlio de Mesquita Filho” (UNESP); Universidade Estadual Paulista (Unesp); Ribeiro, Hethini [UNESP]; Spolon, Roberta [UNESP]; Manacero, Aleardo [UNESP]; Lobato, Renata S. [UNESP]; 2019-10-06T17:04:01Z; 2019-01-01; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/conferenceObject; 171-176; http://dx.doi.org/10.1007/978-981-13-5907-1_18; Communications in Computer and Information Science, v. 931, p. 171-176; 1865-0929; http://hdl.handle.net/11449/190153; 10.1007/978-981-13-5907-1_18; 2-s2.0-85062294546; 5568681374094860; 0000-0001-8248-0826; Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; Communications in Computer and Information Science; info:eu-repo/semantics/openAccess; 2024-04-23T16:11:19Z; oai:repositorio.unesp.br:11449/190153; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-08-05T16:56:05.605090; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
dc.title.none.fl_str_mv |
Parallelization of the DIANA algorithm in openMP |
title |
Parallelization of the DIANA algorithm in openMP |
spellingShingle |
Parallelization of the DIANA algorithm in openMP; Ribeiro, Hethini [UNESP]; DIANA; Machine learning; OpenMP; Parallelization |
author |
Ribeiro, Hethini [UNESP] |
author_facet |
Ribeiro, Hethini [UNESP]; Spolon, Roberta [UNESP]; Manacero, Aleardo [UNESP]; Lobato, Renata S. [UNESP] |
author_role |
author |
author2 |
Spolon, Roberta [UNESP]; Manacero, Aleardo [UNESP]; Lobato, Renata S. [UNESP] |
author2_role |
author author author |
dc.contributor.none.fl_str_mv |
Universidade Estadual Paulista (Unesp) |
dc.contributor.author.fl_str_mv |
Ribeiro, Hethini [UNESP]; Spolon, Roberta [UNESP]; Manacero, Aleardo [UNESP]; Lobato, Renata S. [UNESP] |
dc.subject.por.fl_str_mv |
DIANA; Machine learning; OpenMP; Parallelization |
topic |
DIANA; Machine learning; OpenMP; Parallelization |
description |
Global data production has been increasing by approximately 40% per year since the beginning of the last decade. These large datasets, also called Big Data, pose great challenges in many areas, particularly in the Machine Learning (ML) field. Although ML algorithms can extract useful information from these large data repositories, some are computationally expensive, such as AGNES and DIANA, which have O(n) and O(2^n) complexity, respectively. The main challenge is therefore to process large amounts of data in a realistic time frame. In this context, this paper proposes a parallelization of the DIANA algorithm using OpenMP. Initial tests on a database of 5000 elements showed a speedup of 5.2521. According to Gustafson’s law, the speedup is expected to be larger for a larger database. |
publishDate |
2019 |
dc.date.none.fl_str_mv |
2019-10-06T17:04:01Z 2019-10-06T17:04:01Z 2019-01-01 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/conferenceObject |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1007/978-981-13-5907-1_18; Communications in Computer and Information Science, v. 931, p. 171-176; 1865-0929; http://hdl.handle.net/11449/190153; 10.1007/978-981-13-5907-1_18; 2-s2.0-85062294546; 5568681374094860; 0000-0001-8248-0826 |
url |
http://dx.doi.org/10.1007/978-981-13-5907-1_18 http://hdl.handle.net/11449/190153 |
identifier_str_mv |
Communications in Computer and Information Science, v. 931, p. 171-176; 1865-0929; 10.1007/978-981-13-5907-1_18; 2-s2.0-85062294546; 5568681374094860; 0000-0001-8248-0826 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Communications in Computer and Information Science |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
171-176 |
dc.source.none.fl_str_mv |
Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv |
|
_version_ |
1808128724010270720 |
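
The abstract's expectation that a larger database yields a larger speedup can be read off Gustafson's law. As a sketch, with symbols that are assumptions here rather than notation from the paper: let N be the number of threads and s the fraction of the run time, measured on the parallel system, that is spent in serial code.

```latex
% Gustafson's law (scaled speedup); N and s are illustrative symbols.
S(N) = s + (1 - s)\,N = N - s\,(N - 1), \qquad 0 \le s \le 1
```

If the parallel part of the work grows with the dataset while the serial part stays roughly fixed, s shrinks and S(N) approaches N. The record does not state the thread count, so s cannot be derived from the reported speedup of 5.2521.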