Um processo de geração automática de código paralelo para arquiteturas híbridas com afinidade de memória

Bibliographic details
Main author:	Raeder, Mateus
Publication date:	2014
Document type:	Thesis
Language:	Portuguese (por)
Source title:	Biblioteca Digital de Teses e Dissertações da PUC_RS
Full text:	http://tede2.pucrs.br/tede2/handle/tede/7390
Abstract:	In recent years, technological advances have produced machines with several levels of parallelism, with a major impact on high-performance computing. These advances have allowed developers to further improve the performance of large-scale applications. In this context, clusters of multiprocessor machines with Non-Uniform Memory Access (NUMA) have emerged as a trend in parallel processing. In a NUMA architecture, the access time to a piece of data depends on where it is placed in memory, so managing data location is essential on this type of machine. Software developed for a cluster of NUMA machines must therefore exploit both the internode part (multicomputer, with distributed memory) and the intranode part (multiprocessor, with shared memory) of the architecture. This kind of hybrid programming makes better use of the resources that NUMA architectures provide. However, rewriting a sequential application so that it correctly exploits the parallelism of the environment is not a trivial task, although it can be made easier by an automated process. To that end, this work presents an automatic and transparent parallel code generation process for hybrid architectures, in which users do not need to know the low-level routines of parallel programming libraries. We developed a graphical tool in which users create their parallel models dynamically and intuitively, so that parallel programs can be written without familiarity with the libraries commonly used by high-performance computing professionals (such as MPI). In the tool, the user draws a directed graph to indicate the number of processes (the nodes of the graph) and the communication between them (the edges). The user then inserts the sequential code of each process defined in the graphical interface, and the tool automatically generates the corresponding parallel code. In addition, heavyweight-process and memory mappings, as well as a hybrid mapping, were defined and tested on a cluster of NUMA machines. The tool was developed in Java and generates parallel C++ code with MPI; it also applies memory affinity policies for NUMA machines through the Memory Affinity Interface (MAI) library. Several applications were developed with and without the model. The results indicate that the proposed mapping is valid: it yields performance gains over the sequential versions and behaves similarly to traditional parallel implementations.
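
The graph-to-code idea in the abstract can be illustrated with a small sketch. This is not the thesis's tool (which is written in Java and also applies MAI memory policies); it is a hypothetical Python function, `generate_mpi_skeleton`, that assumes each edge carries a single `int` and that the graph is acyclic, and it emits a C++/MPI skeleton as text using only the standard `MPI_Send` and `MPI_Recv` calls.

```python
from collections import defaultdict


def generate_mpi_skeleton(num_processes, edges):
    """Return C++/MPI source text for a directed process graph.

    num_processes: number of graph nodes (one MPI rank per node).
    edges: list of (source_rank, destination_rank) pairs.
    Hypothetical helper for illustration; not the thesis's generator.
    """
    sends = defaultdict(list)   # rank -> ranks it sends to
    recvs = defaultdict(list)   # rank -> ranks it receives from
    for src, dst in edges:
        if not (0 <= src < num_processes and 0 <= dst < num_processes):
            raise ValueError(f"edge ({src}, {dst}) refers to an unknown process")
        sends[src].append(dst)
        recvs[dst].append(src)

    lines = [
        "#include <mpi.h>",
        "",
        "int main(int argc, char** argv) {",
        "    MPI_Init(&argc, &argv);",
        "    int rank;",
        "    MPI_Comm_rank(MPI_COMM_WORLD, &rank);",
        "    int data = 0;  // assumption: every edge carries one int",
    ]
    for rank in range(num_processes):
        lines.append(f"    if (rank == {rank}) {{")
        # Receive from every predecessor first (safe only for acyclic graphs).
        for src in recvs[rank]:
            lines.append(
                f"        MPI_Recv(&data, 1, MPI_INT, {src}, 0, "
                "MPI_COMM_WORLD, MPI_STATUS_IGNORE);"
            )
        lines.append(f"        // sequential code for process {rank} goes here")
        # Then forward the result to every successor.
        for dst in sends[rank]:
            lines.append(
                f"        MPI_Send(&data, 1, MPI_INT, {dst}, 0, MPI_COMM_WORLD);"
            )
        lines.append("    }")
    lines += ["    MPI_Finalize();", "    return 0;", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # A three-stage pipeline: process 0 -> process 1 -> process 2.
    print(generate_mpi_skeleton(3, [(0, 1), (1, 2)]))
```

Because every rank receives before it sends, a graph with a cycle would deadlock under this simple scheme; a fuller generator would need non-blocking calls or an explicit ordering for such cases.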

Item metadata

id	P_RS_8214555168e0bfb82bb49b7927609a56
oai_identifier_str	oai:tede2.pucrs.br:tede/7390
network_acronym_str	P_RS
network_name_str	Biblioteca Digital de Teses e Dissertações da PUC_RS
repository_id_str
dc.title.por.fl_str_mv	Um processo de geração automática de código paralelo para arquiteturas híbridas com afinidade de memória
dc.title.alternative.eng.fl_str_mv	An automatic parallel code generation process for hybrid architectures using memory affinity
author	Raeder, Mateus
author_facet	Raeder, Mateus
author_role	author
dc.contributor.advisor1.fl_str_mv	Fernandes, Luiz Gustavo Leão
dc.contributor.advisor1ID.fl_str_mv	571.500.100-59
dc.contributor.advisor1Lattes.fl_str_mv	http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4784653A5
dc.contributor.authorID.fl_str_mv	009.581.410-88
dc.contributor.authorLattes.fl_str_mv	http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4184274T6
dc.contributor.author.fl_str_mv	Raeder, Mateus
contributor_str_mv	Fernandes, Luiz Gustavo Leão
dc.subject.por.fl_str_mv	Programação Híbrida; Afinidade de Memória; Cluster de NUMA
topic	Programação Híbrida; Afinidade de Memória; Cluster de NUMA; CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.cnpq.fl_str_mv	CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
publishDate	2014
dc.date.issued.fl_str_mv	2014-08-27
dc.date.accessioned.fl_str_mv	2017-06-29T13:36:54Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/doctoralThesis
format	doctoralThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://tede2.pucrs.br/tede2/handle/tede/7390
url	http://tede2.pucrs.br/tede2/handle/tede/7390
dc.language.iso.fl_str_mv	por
language	por
dc.relation.program.fl_str_mv	1974996533081274470
dc.relation.confidence.fl_str_mv	600 600 600
dc.relation.department.fl_str_mv	-3008542510401149144
dc.relation.cnpq.fl_str_mv	3671711205811204509
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Pontifícia Universidade Católica do Rio Grande do Sul
dc.publisher.program.fl_str_mv	Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv	PUCRS
dc.publisher.country.fl_str_mv	Brasil
dc.publisher.department.fl_str_mv	Faculdade de Informática
publisher.none.fl_str_mv	Pontifícia Universidade Católica do Rio Grande do Sul
dc.source.none.fl_str_mv	reponame:Biblioteca Digital de Teses e Dissertações da PUC_RS instname:Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS) instacron:PUC_RS
instname_str	Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
instacron_str	PUC_RS
institution	PUC_RS
reponame_str	Biblioteca Digital de Teses e Dissertações da PUC_RS
collection	Biblioteca Digital de Teses e Dissertações da PUC_RS
bitstream.url.fl_str_mv	http://tede2.pucrs.br/tede2/bitstream/tede/7390/5/TES_MATEUS_RAEDER_COMPLETO.pdf.jpg http://tede2.pucrs.br/tede2/bitstream/tede/7390/4/TES_MATEUS_RAEDER_COMPLETO.pdf.txt http://tede2.pucrs.br/tede2/bitstream/tede/7390/3/license.txt http://tede2.pucrs.br/tede2/bitstream/tede/7390/2/TES_MATEUS_RAEDER_COMPLETO.pdf
bitstream.checksum.fl_str_mv	b72147970d8e5238a74a6bd7f6efe397 6fbfe99fef1a76d5085fc58f96fab591 5a9d6006225b368ef605ba16b4f6d1be af90fc3a763acd6de5c2203df411193f
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Biblioteca Digital de Teses e Dissertações da PUC_RS - Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
repository.mail.fl_str_mv	biblioteca.central@pucrs.br
_version_	1799765326239367168
