Performance evaluation of data migration methods between the host and the device in CUDA-based programming

Bibliographic details
Main author: Santos, Rafael Silva [UNESP]
Publication date: 2016
Other authors: Eler, Danilo Medeiros [UNESP], Garcia, Rogério Eduardo [UNESP]
Document type: Book chapter
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1007/978-3-319-32467-8_60
http://hdl.handle.net/11449/177938
Abstract: The CUDA programming model is heterogeneous, composed of two components: the host (CPU) and the device (GPU). Each component has its own memory space and processing units. A major challenge in improving the performance of GPU-based applications is data migration between these memory spaces. Currently, the CUDA platform supports the following data migration methods: UMA, zero-copy, pageable memory, and pinned memory. In this paper, we compare the performance of the zero-copy method with that of the other methods by considering overall application runtime. Additionally, we investigate aspects of the data migration process to identify causes of the performance variations. The results show that, in some cases, zero-copy memory can provide on average 19% higher performance than pinned memory transfer. In the studied situation, this method was the second most efficient. Finally, we present limitations of zero-copy memory as a resource for improving the performance of CUDA applications.
id UNSP_75e09b0d57698f22fb7498ff6915a32e
oai_identifier_str oai:repositorio.unesp.br:11449/177938
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
author Santos, Rafael Silva [UNESP]
author_facet Santos, Rafael Silva [UNESP]
Eler, Danilo Medeiros [UNESP]
Garcia, Rogério Eduardo [UNESP]
author_role author
author2 Eler, Danilo Medeiros [UNESP]
Garcia, Rogério Eduardo [UNESP]
author2_role author
author
dc.contributor.none.fl_str_mv Universidade Estadual Paulista (Unesp)
dc.contributor.author.fl_str_mv Santos, Rafael Silva [UNESP]
Eler, Danilo Medeiros [UNESP]
Garcia, Rogério Eduardo [UNESP]
publishDate 2016
dc.date.none.fl_str_mv 2016-04-01
2018-12-11T17:27:47Z
2018-12-11T17:27:47Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/bookPart
format bookPart
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1007/978-3-319-32467-8_60
Advances in Intelligent Systems and Computing, v. 448, p. 689-700.
2194-5357
http://hdl.handle.net/11449/177938
10.1007/978-3-319-32467-8_60
2-s2.0-84962709054
2-s2.0-84962709054.pdf
8031012573259361
0000-0003-1248-528X
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Advances in Intelligent Systems and Computing
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 689-700
application/pdf
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
instname_str Universidade Estadual Paulista (UNESP)
instacron_str UNESP
institution UNESP
reponame_str Repositório Institucional da UNESP
collection Repositório Institucional da UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
repository.mail.fl_str_mv
_version_ 1808128981659025408