Programação paralela híbrida para CPU e GPU: uma avaliação do OPENACC frente a OPENMP e CUDA

Bibliographic details
Main author: Sulzbach, Maurício
Publication date: 2014
Document type: Master's thesis
Language: Portuguese (por)
Source title: Biblioteca Digital de Teses e Dissertações do UFSM
Full text: http://repositorio.ufsm.br/handle/1/5441
Abstract: As CPU and GPU architectures have advanced, the number of parallel programming APIs for both devices has grown in recent years. OpenMP is used for parallel processing on the CPU, while CUDA and OpenACC are used for parallel processing on the GPU. For GPU programming, CUDA offers a function-based model that makes source code long and error-prone and lowers development productivity. OpenACC emerged to address these problems and to serve as an alternative to CUDA. Like OpenMP, it provides directives that ease the development of parallel applications, but for execution on the GPU. To increase performance further and exploit the parallel capacity of both CPU and GPU, hybrid algorithms can be developed that split the processing between the two devices. In that sense, the main objective of this work is to verify whether the conveniences that OpenACC introduces also carry over to hybrid programming with OpenMP, compared with the OpenMP + CUDA model. A second objective is to report limitations of the two hybrid programming models that may affect performance or application development. To meet these goals, the work presents three hybrid parallel algorithms based on algorithms from the Rodinia benchmark, namely RNG, Hotspot and SRAD, implemented with the hybrid models OpenMP + CUDA and OpenMP + OpenACC. In these algorithms, OpenMP handles parallel execution on the CPU, while CUDA and OpenACC handle parallel processing on the GPU. After the hybrid algorithms were run, their performance, efficiency and the division of the execution between the devices were analyzed. The runs showed that both proposed programming models outperformed a parallel application written with a single API and running on only one of the devices. In addition, for the hybrid RNG and Hotspot algorithms CUDA performed better than OpenACC, while for SRAD OpenACC ran faster than CUDA.
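The hybrid OpenMP + OpenACC model described in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the SAXPY-style loop stands in for the Rodinia kernels, and the names `hybrid_saxpy`, `cpu_share` and the `cpu_fraction` parameter are hypothetical. The static split by a fixed fraction is also an assumption, since the abstract does not say how the work was divided. The GPU slice is launched asynchronously with OpenACC so that the OpenMP loop on the CPU can run at the same time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many leading elements the CPU processes.
   cpu_fraction is an assumed tuning parameter in [0, 1]. */
static size_t cpu_share(size_t n, double cpu_fraction)
{
    assert(cpu_fraction >= 0.0 && cpu_fraction <= 1.0);
    return (size_t)((double)n * cpu_fraction);
}

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
   Elements [0, split) go to the CPU via OpenMP,
   elements [split, n) go to the GPU via OpenACC. */
void hybrid_saxpy(size_t n, float a, const float *x, float *y,
                  double cpu_fraction)
{
    size_t split = cpu_share(n, cpu_fraction);
    size_t gpu_n = n - split;
    const float *gx = x + split;   /* GPU slice of the input  */
    float *gy = y + split;         /* GPU slice of the output */

    if (gpu_n > 0) {
        /* GPU part: queued asynchronously so the host does not block. */
        #pragma acc parallel loop async(1) copyin(gx[0:gpu_n]) copy(gy[0:gpu_n])
        for (size_t i = 0; i < gpu_n; ++i)
            gy[i] = a * gx[i] + gy[i];
    }

    /* CPU part: runs on the host threads while the GPU is busy. */
    #pragma omp parallel for
    for (size_t i = 0; i < split; ++i)
        y[i] = a * x[i] + y[i];

    /* Block until the GPU slice has been computed and copied back. */
    #pragma acc wait(1)
}
```

A call such as `hybrid_saxpy(n, 2.0f, x, y, 0.5)` would give half of the elements to each device. A compiler without OpenMP or OpenACC support ignores the pragmas and runs both loops serially on the CPU, which is convenient for checking correctness before tuning the split. In the OpenMP + CUDA model, the OpenACC loop would instead be replaced by explicit device allocation, memory copies and a kernel launch.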
id UFSM_0eb584ac52b2ded29c94c0b61c7cbab1
oai_identifier_str oai:repositorio.ufsm.br:1/5441
network_acronym_str UFSM
network_name_str Biblioteca Digital de Teses e Dissertações do UFSM
repository_id_str
dc.title.por.fl_str_mv Programação paralela híbrida para CPU e GPU: uma avaliação do OPENACC frente a OPENMP e CUDA
dc.title.alternative.eng.fl_str_mv Hybrid parallel programming for CPU and GPU: an evaluation of OPENACC as RELATED to OPENMP and CUDA
title Programação paralela híbrida para CPU e GPU: uma avaliação do OPENACC frente a OPENMP e CUDA
author Sulzbach, Maurício
author_facet Sulzbach, Maurício
author_role author
dc.contributor.advisor1.fl_str_mv Stein, Benhur de Oliveira
dc.contributor.advisor1Lattes.fl_str_mv http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4728084T8
dc.contributor.referee1.fl_str_mv Charão, Andréa Schwertner
dc.contributor.referee1Lattes.fl_str_mv http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4721144D9
dc.contributor.referee2.fl_str_mv Cera, Marcia Cristina
dc.contributor.referee2Lattes.fl_str_mv http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4762397U0
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/9113910489637221
dc.contributor.author.fl_str_mv Sulzbach, Maurício
contributor_str_mv Stein, Benhur de Oliveira
Charão, Andréa Schwertner
Cera, Marcia Cristina
dc.subject.por.fl_str_mv CPU
GPU
OpenMP
CUDA
OpenACC
Programação paralela híbrida
Desempenho
topic CPU
GPU
OpenMP
CUDA
OpenACC
Programação paralela híbrida
Desempenho
OpenMP
CUDA
Hybrid parallel programming
Performance
CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.eng.fl_str_mv OpenMP
CUDA
Hybrid parallel programming
Performance
dc.subject.cnpq.fl_str_mv CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
publishDate 2014
dc.date.issued.fl_str_mv 2014-08-22
dc.date.accessioned.fl_str_mv 2015-03-24
dc.date.available.fl_str_mv 2015-03-24
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv SULZBACH, Maurício. HYBRID PARALLEL PROGRAMMING FOR CPU AND GPU: AN EVALUATION OF OPENACC AS RELATED TO OPENMP AND CUDA. 2014. 100 f. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Santa Maria, Santa Maria, 2014.
dc.identifier.uri.fl_str_mv http://repositorio.ufsm.br/handle/1/5441
identifier_str_mv SULZBACH, Maurício. HYBRID PARALLEL PROGRAMMING FOR CPU AND GPU: AN EVALUATION OF OPENACC AS RELATED TO OPENMP AND CUDA. 2014. 100 f. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Santa Maria, Santa Maria, 2014.
url http://repositorio.ufsm.br/handle/1/5441
dc.language.iso.fl_str_mv por
language por
dc.relation.cnpq.fl_str_mv 100300000007
dc.relation.confidence.fl_str_mv 400
300
300
300
300
dc.relation.authority.fl_str_mv c96e146b-e7f3-455a-8889-3d68529e5b26
4ff789df-7a8e-4754-b744-975996b12a82
d66ed8c6-e395-4691-8ab7-919ae4d12b50
e284c963-25c2-4a1c-8bc0-ca8f52bb2cef
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Santa Maria
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Informática
dc.publisher.initials.fl_str_mv UFSM
dc.publisher.country.fl_str_mv BR
dc.publisher.department.fl_str_mv Ciência da Computação
publisher.none.fl_str_mv Universidade Federal de Santa Maria
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações do UFSM
instname:Universidade Federal de Santa Maria (UFSM)
instacron:UFSM
instname_str Universidade Federal de Santa Maria (UFSM)
instacron_str UFSM
institution UFSM
reponame_str Biblioteca Digital de Teses e Dissertações do UFSM
collection Biblioteca Digital de Teses e Dissertações do UFSM
bitstream.url.fl_str_mv http://repositorio.ufsm.br/bitstream/1/5441/1/SULZBACH%2c%20MAURICIO.pdf
http://repositorio.ufsm.br/bitstream/1/5441/2/SULZBACH%2c%20MAURICIO.pdf.txt
http://repositorio.ufsm.br/bitstream/1/5441/3/SULZBACH%2c%20MAURICIO.pdf.jpg
bitstream.checksum.fl_str_mv 0bf5794d66ff18af0b3436d8a752bc6f
60b127dc92e154afdb095b511a33652e
826c9acdd925e07c93c44ccf2935c820
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações do UFSM - Universidade Federal de Santa Maria (UFSM)
repository.mail.fl_str_mv atendimento.sib@ufsm.br||tedebc@gmail.com
_version_ 1801485201773166592