Evaluating Shared Memory Parallel Computing Mechanisms of the Julia Language

Bibliographic details
Main author: Barros, Diana Almeida
Publication date: 2023
Other authors: diana.ab@live.com
Document type: Master's thesis
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da UERJ
Full text: http://www.bdtd.uerj.br/handle/1/20486
Abstract: Fields such as data science, machine learning, and scientific computing are promising areas that currently receive substantial investment. Their problems are usually complex and demand efficient use of computing resources to perform well. The Julia language was created in this context: a dynamic language that offers a high-performance environment with a user-friendly syntax. Although its design focuses on performance, some of its shared memory parallel computing features are still under development, and studies in this area remain scarce. In this work, we present a detailed performance study of the shared memory parallel computing mechanisms of Julia, analyzing its Multithreading and SIMD mechanisms. In the analysis of Multithreading, we compare the data and task parallelism strategies available through the language's built-in macros @threads and @spawn, focusing on how they distribute loop iterations. We also analyze the loop scheduling mechanisms available in the Julia version used in this work, namely the static scheduling provided by @threads and the schedulers provided by the package FLoops.jl, and observe how their performance behaves as the environment scales. In the analysis of the SIMD mechanisms, we compare compiler auto-vectorization with the built-in construct @simd and two vectorization packages. Our experiments were run with synthetic kernels, benchmark applications, and a real-world optimization framework. Our results show that the macro @spawn performed better on unbalanced loads and that the different loop schedulers offered by FLoops.jl help improve the performance of applications with load imbalance. However, we found that applications commonly found in real-world scenarios are susceptible to overhead and performance loss, because the nature of the problem influences how the code is implemented; this effect was more noticeable when @spawn was used or when the environment scaled in number of threads. For the SIMD mechanisms, we showed that the package LoopVectorization.jl provided the best performance results with low programming effort.
id UERJ_532ee082020956759e4e6ecac7d73f34
oai_identifier_str oai:www.bdtd.uerj.br:1/20486
network_acronym_str UERJ
network_name_str Biblioteca Digital de Teses e Dissertações da UERJ
repository_id_str 2903
spelling Submitted by Bárbara CTC/A (babalusotnas@gmail.com) on 2023-10-20T14:57:22Z. No. of bitstreams: 1. Dissertação - Diana Almeida Barros - 2023 - Completa.pdf: 1816811 bytes, checksum: fd019b5947968a6e1691beb337165f66 (MD5).
Made available in DSpace on 2023-10-20T14:57:22Z (GMT). No. of bitstreams: 1. Dissertação - Diana Almeida Barros - 2023 - Completa.pdf: 1816811 bytes, checksum: fd019b5947968a6e1691beb337165f66 (MD5). Previous issue date: 2023-04-26.
ORIGINAL: Dissertação - Diana Almeida Barros - 2023 - Completa.pdf (application/pdf, 1816811 bytes)
LICENSE: license.txt (text/plain; charset=utf-8, 2123 bytes)
2024-02-27 14:34:49.309
http://www.bdtd.uerj.br/
https://www.bdtd.uerj.br:8443/oai/request
opendoar:2903
2024-02-27T17:34:49
dc.title.por.fl_str_mv Evaluating Shared Memory Parallel Computing Mechanisms of the Julia Language
dc.title.alternative.por.fl_str_mv Avaliando os mecanismos de computação paralela com memória compartilhada da linguagem Julia.
title Evaluating Shared Memory Parallel Computing Mechanisms of the Julia Language
author Barros, Diana Almeida
author_facet Barros, Diana Almeida
diana.ab@live.com
author_role author
author2 diana.ab@live.com
author2_role author
dc.contributor.advisor1.fl_str_mv Bentes, Cristiana Barbosa
dc.contributor.referee1.fl_str_mv Sena, Alexandre
dc.contributor.referee2.fl_str_mv Hoffimann, Júlio
dc.contributor.referee3.fl_str_mv Pessoa, Tiago Carneiro
dc.contributor.author.fl_str_mv Barros, Diana Almeida
diana.ab@live.com
contributor_str_mv Bentes, Cristiana Barbosa
Sena, Alexandre
Hoffimann, Júlio
Pessoa, Tiago Carneiro
dc.subject.eng.fl_str_mv Shared Memory
Multithreading
Loop Scheduling
topic Shared Memory
Multithreading
Loop Scheduling
Julia Language
Julia (Linguagem de programação de computador)
Linguagem Julia
Memória Compartilhada
Multithreading
SIMD
Escalonamento de Loops
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.cat.fl_str_mv Julia Language
dc.subject.por.fl_str_mv Julia (Linguagem de programação de computador)
Linguagem Julia
Memória Compartilhada
Multithreading
SIMD
Escalonamento de Loops
dc.subject.cnpq.fl_str_mv CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
publishDate 2023
dc.date.accessioned.fl_str_mv 2023-10-20T14:57:22Z
dc.date.issued.fl_str_mv 2023-04-26
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv BARROS, Diana Almeida. Evaluating Shared Memory Parallel Computing Mechanisms of the Julia Language. 2023. 93 f. Dissertação (Mestrado em Ciências Computacionais) - Instituto de Matemática e Estatística, Universidade do Estado do Rio de Janeiro, Rio de Janeiro, 2023.
dc.identifier.uri.fl_str_mv http://www.bdtd.uerj.br/handle/1/20486
identifier_str_mv BARROS, Diana Almeida. Evaluating Shared Memory Parallel Computing Mechanisms of the Julia Language. 2023. 93 f. Dissertação (Mestrado em Ciências Computacionais) - Instituto de Matemática e Estatística, Universidade do Estado do Rio de Janeiro, Rio de Janeiro, 2023.
url http://www.bdtd.uerj.br/handle/1/20486
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade do Estado do Rio de Janeiro
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciências Computacionais
dc.publisher.initials.fl_str_mv UERJ
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv Centro de Tecnologia e Ciências::Instituto de Matemática e Estatística
publisher.none.fl_str_mv Universidade do Estado do Rio de Janeiro
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da UERJ
instname:Universidade do Estado do Rio de Janeiro (UERJ)
instacron:UERJ
instname_str Universidade do Estado do Rio de Janeiro (UERJ)
instacron_str UERJ
institution UERJ
reponame_str Biblioteca Digital de Teses e Dissertações da UERJ
collection Biblioteca Digital de Teses e Dissertações da UERJ
bitstream.url.fl_str_mv http://www.bdtd.uerj.br/bitstream/1/20486/2/Disserta%C3%A7%C3%A3o+-+Diana+Almeida+Barros+-+2023+-+Completa.pdf
http://www.bdtd.uerj.br/bitstream/1/20486/1/license.txt
bitstream.checksum.fl_str_mv fd019b5947968a6e1691beb337165f66
e5502652da718045d7fcd832b79fca29
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da UERJ - Universidade do Estado do Rio de Janeiro (UERJ)
repository.mail.fl_str_mv bdtd.suporte@uerj.br
_version_ 1811728741509890048