Evaluating Shared Memory Parallel Computing Mechanisms of the Julia Language
Main author: | Barros, Diana Almeida |
---|---|
Publication date: | 2023 |
Other authors: | |
Document type: | Dissertation (master's thesis) |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da UERJ |
Full text: | http://www.bdtd.uerj.br/handle/1/20486 |
Abstract: | Fields such as data science, machine learning, and scientific computing are promising areas that currently receive substantial investment. They are usually very complex and demand optimal use of computing resources for good performance. In this context the Julia language was born: a dynamic language that offers a high-performance environment with a user-friendly syntax. Although its design focuses on performance, its shared-memory parallel computing features are still partly under development, and studies in this area remain scarce. In this work, we present a detailed performance study of Julia's shared-memory parallel computing mechanisms, covering both multithreading and SIMD. In the multithreading analysis, we compare the data-parallelism and task-parallelism strategies available through the built-in macros @threads and @spawn, focusing on how they distribute loop iterations. We also analyze the loop-scheduling mechanisms available in the Julia version used in this work, namely the static scheduling provided by @threads and the schedulers provided by the package FLoops.jl, and observe how their performance behaves as the environment scales. In the SIMD analysis, we compare the compiler's auto-vectorization against the built-in construct @simd and two vectorization packages. Our experiments use synthetic kernels, benchmark applications, and a real-world optimization framework. Our results show that the macro @spawn performed better on unbalanced loads and that the different loop schedulers offered by FLoops.jl help improve the performance of applications with load imbalance. However, we found that applications commonly encountered in real-world scenarios are susceptible to overhead and performance loss, since the nature of the problem influences how the code is implemented; this was more noticeable when @spawn was used or when the environment scaled in the number of threads. For the SIMD mechanisms, the package LoopVectorization.jl provided the best performance results with low programming effort. |
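For readers unfamiliar with the mechanisms compared in the abstract, the contrast between @threads (static data parallelism), @spawn (dynamic task parallelism), and @simd can be sketched with the standard Julia `Base.Threads` API. This is an illustrative summary only, not code from the dissertation; the summation kernel and function names are made up for the example.

```julia
using Base.Threads  # provides @threads, @spawn, nthreads, threadid

# Data parallelism: @threads statically splits the iteration range
# across the available threads (the @threads default scheduling
# studied in the dissertation's Julia version).
function sum_threads(xs)
    partial = zeros(eltype(xs), nthreads())
    @threads for i in eachindex(xs)
        partial[threadid()] += xs[i]
    end
    return sum(partial)
end

# Task parallelism: @spawn creates one task per chunk; the runtime
# assigns tasks to threads dynamically, which helps when the work
# per iteration is unbalanced.
function sum_spawn(xs; ntasks = nthreads())
    chunks = Iterators.partition(eachindex(xs), cld(length(xs), ntasks))
    tasks = [Threads.@spawn sum(@view xs[c]) for c in chunks]
    return sum(fetch.(tasks))
end

# SIMD: @simd tells the compiler the loop is safe to vectorize,
# complementing its auto-vectorization.
function sum_simd(xs)
    s = zero(eltype(xs))
    @simd for i in eachindex(xs)
        s += xs[i]
    end
    return s
end
```

FLoops.jl and LoopVectorization.jl, also evaluated in the work, follow similar loop-level patterns but swap in their own schedulers and vectorizers.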
dc.title.por.fl_str_mv |
Evaluating Shared Memory Parallel Computing Mechanisms of the Julia Language |
dc.title.alternative.por.fl_str_mv |
Avaliando os mecanismos de computação paralela com memória compartilhada da linguagem Julia. |
author |
Barros, Diana Almeida |
dc.contributor.advisor1.fl_str_mv |
Bentes, Cristiana Barbosa |
dc.contributor.referee1.fl_str_mv |
Sena, Alexandre |
dc.contributor.referee2.fl_str_mv |
Hoffimann, Júlio |
dc.contributor.referee3.fl_str_mv |
Pessoa, Tiago Carneiro |
dc.subject.eng.fl_str_mv |
Shared Memory Multithreading Loop Scheduling |
dc.subject.cat.fl_str_mv |
Julia Language |
dc.subject.por.fl_str_mv |
Julia (Linguagem de programação de computador) Linguagem Julia Memória Compartilhada Multithreading SIMD Escalonamento de Loops |
dc.subject.cnpq.fl_str_mv |
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
publishDate |
2023 |
dc.date.accessioned.fl_str_mv |
2023-10-20T14:57:22Z |
dc.date.issued.fl_str_mv |
2023-04-26 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
dc.identifier.citation.fl_str_mv |
BARROS, Diana Almeida. Evaluating Shared Memory Parallel Computing Mechanisms of the Julia Language. 2023. 93 f. Dissertação (Mestrado em Ciências Computacionais) - Instituto de Matemática e Estatística, Universidade do Estado do Rio de Janeiro, Rio de Janeiro, 2023. |
dc.identifier.uri.fl_str_mv |
http://www.bdtd.uerj.br/handle/1/20486 |
dc.language.iso.fl_str_mv |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Universidade do Estado do Rio de Janeiro |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Ciências Computacionais |
dc.publisher.initials.fl_str_mv |
UERJ |
dc.publisher.country.fl_str_mv |
Brasil |
dc.publisher.department.fl_str_mv |
Centro de Tecnologia e Ciências::Instituto de Matemática e Estatística |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da UERJ instname:Universidade do Estado do Rio de Janeiro (UERJ) instacron:UERJ |
bitstream.url.fl_str_mv |
http://www.bdtd.uerj.br/bitstream/1/20486/2/Disserta%C3%A7%C3%A3o+-+Diana+Almeida+Barros+-+2023+-+Completa.pdf http://www.bdtd.uerj.br/bitstream/1/20486/1/license.txt |