Performance evaluation of data migration methods between the host and the device in CUDA-based programming
Main author: | Santos, Rafael Silva |
---|---|
Publication date: | 2016 |
Other authors: | Eler, Danilo Medeiros; Garcia, Rogério Eduardo |
Document type: | Book chapter |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1007/978-3-319-32467-8_60 http://hdl.handle.net/11449/177938 |
Abstract: | The CUDA programming model is heterogeneous, composed of two components: the host (CPU) and the device (GPU). The two components have separate memory spaces and processing units. A major challenge in improving the performance of GPU-based applications is data migration between these memory spaces. Currently, the CUDA platform supports the following data migration methods: UMA, zero-copy, pageable, and pinned memory. In this paper, we compare the performance of the zero-copy method with that of the other methods by considering overall application runtime. Additionally, we investigate aspects of the data migration process to identify the causes of the performance variations. The results show that, in some cases, zero-copy memory provides on average 19% higher performance than pinned memory transfer. In the studied scenario, this method was the second most efficient. Finally, we present the limitations of zero-copy memory as a resource for improving the performance of CUDA applications. |
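
For readers unfamiliar with the four methods named in the abstract, the following is a minimal sketch of how each is typically expressed with the CUDA runtime API. It is not taken from the chapter: the `scale` kernel, the `CHECK` macro, and the buffer names are hypothetical, and it assumes that "UMA" refers to CUDA managed (unified) memory allocated with `cudaMallocManaged`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "%s\n", cudaGetErrorString(err_));       \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Hypothetical kernel used only for illustration: doubles each element.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

static void fill(float *p, int n) {
    for (int i = 0; i < n; ++i) p[i] = 1.0f;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Request support for mapped (zero-copy) host allocations.
    CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));

    float *dev = nullptr;
    CHECK(cudaMalloc((void **)&dev, bytes));

    // 1) Pageable memory: ordinary malloc plus explicit copies.
    float *pageable = (float *)malloc(bytes);
    if (pageable == nullptr) return EXIT_FAILURE;
    fill(pageable, n);
    CHECK(cudaMemcpy(dev, pageable, bytes, cudaMemcpyHostToDevice));
    scale<<<blocks, threads>>>(dev, n);
    CHECK(cudaMemcpy(pageable, dev, bytes, cudaMemcpyDeviceToHost));

    // 2) Pinned memory: page-locked host buffer, still copied explicitly.
    float *pinned = nullptr;
    CHECK(cudaMallocHost((void **)&pinned, bytes));
    fill(pinned, n);
    CHECK(cudaMemcpy(dev, pinned, bytes, cudaMemcpyHostToDevice));
    scale<<<blocks, threads>>>(dev, n);
    CHECK(cudaMemcpy(pinned, dev, bytes, cudaMemcpyDeviceToHost));

    // 3) Zero-copy: mapped pinned buffer that the kernel accesses directly.
    float *mapped = nullptr, *mappedDev = nullptr;
    CHECK(cudaHostAlloc((void **)&mapped, bytes, cudaHostAllocMapped));
    CHECK(cudaHostGetDevicePointer((void **)&mappedDev, mapped, 0));
    fill(mapped, n);
    scale<<<blocks, threads>>>(mappedDev, n);
    CHECK(cudaDeviceSynchronize());

    // 4) Managed memory (assumed meaning of "UMA"): one pointer for both sides.
    float *managed = nullptr;
    CHECK(cudaMallocManaged((void **)&managed, bytes));
    fill(managed, n);
    scale<<<blocks, threads>>>(managed, n);
    CHECK(cudaDeviceSynchronize());

    printf("pageable=%.1f pinned=%.1f zero-copy=%.1f managed=%.1f\n",
           pageable[0], pinned[0], mapped[0], managed[0]);

    free(pageable);
    CHECK(cudaFreeHost(pinned));
    CHECK(cudaFreeHost(mapped));
    CHECK(cudaFree(managed));
    CHECK(cudaFree(dev));
    return 0;
}
```

The sketch only contrasts the allocation and transfer calls. It does no timing, so it says nothing about the relative performance reported in the abstract.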
id |
UNSP_75e09b0d57698f22fb7498ff6915a32e |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/177938 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
dc.title.none.fl_str_mv |
Performance evaluation of data migration methods between the host and the device in CUDA-based programming |
author |
Santos, Rafael Silva [UNESP] |
author_facet |
Santos, Rafael Silva [UNESP] Eler, Danilo Medeiros [UNESP] Garcia, Rogério Eduardo [UNESP] |
author_role |
author |
author2 |
Eler, Danilo Medeiros [UNESP] Garcia, Rogério Eduardo [UNESP] |
author2_role |
author author |
dc.contributor.none.fl_str_mv |
Universidade Estadual Paulista (Unesp) |
dc.contributor.author.fl_str_mv |
Santos, Rafael Silva [UNESP] Eler, Danilo Medeiros [UNESP] Garcia, Rogério Eduardo [UNESP] |
publishDate |
2016 |
dc.date.none.fl_str_mv |
2016-04-01 2018-12-11T17:27:47Z 2018-12-11T17:27:47Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/bookPart |
format |
bookPart |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1007/978-3-319-32467-8_60 Advances in Intelligent Systems and Computing, v. 448, p. 689-700. 2194-5357 http://hdl.handle.net/11449/177938 10.1007/978-3-319-32467-8_60 2-s2.0-84962709054 2-s2.0-84962709054.pdf 8031012573259361 0000-0003-1248-528X |
url |
http://dx.doi.org/10.1007/978-3-319-32467-8_60 http://hdl.handle.net/11449/177938 |
identifier_str_mv |
Advances in Intelligent Systems and Computing, v. 448, p. 689-700. 2194-5357 10.1007/978-3-319-32467-8_60 2-s2.0-84962709054 2-s2.0-84962709054.pdf 8031012573259361 0000-0003-1248-528X |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Advances in Intelligent Systems and Computing |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
689-700 application/pdf |
dc.source.none.fl_str_mv |
Scopus reponame:Repositório Institucional da UNESP instname:Universidade Estadual Paulista (UNESP) instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |