A refactoring approach to improve energy consumption of parallel software systems

Detalhes bibliográficos
Autor(a) principal: PINTO, Gustavo Henrique Lima
Data de Publicação: 2015
Tipo de documento: Tese
Idioma: eng
Título da fonte: Repositório Institucional da UFPE
Texto Completo: https://repositorio.ufpe.br/handle/123456789/16346
Resumo: Empowering application programmers to make energy-aware decisions is a critical dimension in improving energy efficiency of computer systems. Despite the growing interest in designing software development processes, frameworks, and programming models to facilitate application-level energy management, little is known on how to design application-level energy-efficient solutions for concurrent software running on parallel architectures. This is unfortunate for at least two reasons: (1) thanks to the proliferation of multicore CPUs, concurrent programming is a standard practice in modern software engineering; (2) a CPU with more cores (say 32) often consumes more power than one with fewer cores (say 1 or 2). However, application developers still do not understand how their code modifications impact energy consumption in a parallel system. Analyzing STACKOVERFLOW showed evidence that this is a real problem; Even though the interest in energy consumption issues is increasing over the years, developers still hold misconceptions and assumptions that are not always true. This lack of knowledge is primarily due to a lack of appropriate tools to measure/identify/refactor energy consumption hotspots. This thesis begins to bridge the chasm of the first problem — the lack of knowledge — by presenting an extensive experimental space exploration over two concurrent programming building blocks: (1) thread-safe collections and (2) thread management constructs. Through a list of findings that are not always obvious, we illuminate the relationship between the choices and settings of design decisions and energy consumption of parallel systems. This thesis then starts to bridge the gap of the second problem — the lack of tools. Lessons learned in our previous studies showed that ForkJoin tasks often operate on an indexable data structure, with subtasks operating only on part of this data structure. One naïve solution is to copy part of the data structure and use it in the next computation. In a recursive framework such as ForkJoin, given an array-based representation, each recursive call will create n new arrays, where n is the width of forking. To address this, we derive a refactoring that, instead of copy part of the data structure, it shares it, allowing subtasks to operate on contiguous partitions of the data structure. We manually applied this refactoring into 15 open source projects. Our refactoring succeed in saving energy for each one of them (12% average saving). We sent the refactored versions to the project owner and, during a timeframe of 40 days, 7 out of 9 projects that replied to our patches have already accepted and merged them. Discussions during the merge process revealed that developers were not aware of this optimization. We then implemented this refactoring as an Eclipse plug-in so that other developers can (1) detect uses of copy where it would be beneficial to use sharing and (2) refactor the code in an automated way.
id UFPE_24f7183677104d3166282b1ea19aa624
oai_identifier_str oai:repositorio.ufpe.br:123456789/16346
network_acronym_str UFPE
network_name_str Repositório Institucional da UFPE
repository_id_str 2221
spelling PINTO, Gustavo Henrique Limahttp://lattes.cnpq.br/1631238943341152http://lattes.cnpq.br/7310046838140771LIMA FILHO, Fernando José Castor de2016-04-06T13:27:55Z2016-04-06T13:27:55Z2015-02-24https://repositorio.ufpe.br/handle/123456789/16346Empowering application programmers to make energy-aware decisions is a critical dimension in improving energy efficiency of computer systems. Despite the growing interest in designing software development processes, frameworks, and programming models to facilitate application-level energy management, little is known on how to design application-level energy-efficient solutions for concurrent software running on parallel architectures. This is unfortunate for at least two reasons: (1) thanks to the proliferation of multicore CPUs, concurrent programming is a standard practice in modern software engineering; (2) a CPU with more cores (say 32) often consumes more power than one with fewer cores (say 1 or 2). However, application developers still do not understand how their code modifications impact energy consumption in a parallel system. Analyzing STACKOVERFLOW showed evidence that this is a real problem; Even though the interest in energy consumption issues is increasing over the years, developers still hold misconceptions and assumptions that are not always true. This lack of knowledge is primarily due to a lack of appropriate tools to measure/identify/refactor energy consumption hotspots. This thesis begins to bridge the chasm of the first problem — the lack of knowledge — by presenting an extensive experimental space exploration over two concurrent programming building blocks: (1) thread-safe collections and (2) thread management constructs. Through a list of findings that are not always obvious, we illuminate the relationship between the choices and settings of design decisions and energy consumption of parallel systems. This thesis then starts to bridge the gap of the second problem — the lack of tools. Lessons learned in our previous studies showed that ForkJoin tasks often operate on an indexable data structure, with subtasks operating only on part of this data structure. One naïve solution is to copy part of the data structure and use it in the next computation. In a recursive framework such as ForkJoin, given an array-based representation, each recursive call will create n new arrays, where n is the width of forking. To address this, we derive a refactoring that, instead of copy part of the data structure, it shares it, allowing subtasks to operate on contiguous partitions of the data structure. We manually applied this refactoring into 15 open source projects. Our refactoring succeed in saving energy for each one of them (12% average saving). We sent the refactored versions to the project owner and, during a timeframe of 40 days, 7 out of 9 projects that replied to our patches have already accepted and merged them. Discussions during the merge process revealed that developers were not aware of this optimization. We then implemented this refactoring as an Eclipse plug-in so that other developers can (1) detect uses of copy where it would be beneficial to use sharing and (2) refactor the code in an automated way.CAPEsFornecer meios para que desenvolvedores de software tomem decisões energeticamente eficientes é uma dimensão crítica para se melhorar o consumo de energia de sistemas computacionais. Apesar do crescente interesse em processos de desenvolvimento de software, arcabouços, e modelos de programação de forma a facilitar o gerenciamento de energia no nível da aplicação, pouco se sabe sobre como arquitetar sistemas concorrentes energéticamente eficientes que rodem em arquiteturas paralelas. Isso é inoportuno por pelo menos duas razões: (1) graças a proliferação de CPUs multicore, programação concorrente se tornou uma prática padrão na engenharia de software moderna; (2) uma CPU com várias unidades de processamento (por exemplo, 32) geralmente dissipa mais potência do que uma com um número menor (por exemplo, 1 ou 2). No entanto, desenvolvedores ainda não entendem como suas modificações de código impactam no consumo de energia de uma aplicação paralela. Uma análise do StackOverflow mostrou evidências que esse é um problema real; mesmo embora exista um crescente interesse em questões relacionadas ao consumo de energia, desenvolvedores ainda cometem equívocos e mantêm suposições que não são sempre verdadeiras. Essa falta de conhecimento é primariamente devido a falta de ferramentas apropriadas para medir/identificar/refatorar hotspots de consumo de energia. Essa tese então começa a pavimentar o abismo do primeiro problema — a falta de conhecimento — através de uma extensa exploração experimental de dois dos pilares fundamentais da programação concorrente: (1) coleções thread-safe e (2) construções para o gerenciamento de threads. Através de uma lista de achados que não são sempre óbvios, esta tese ilumina o relacionamento entre escolhas de design de código paralelo com seu consumo de energia. Esta tese começa então a pavimentar a lacuna do segundo problema — a falta de ferramentas. Lições aprendidas em um dos estudos anteriores mostraram que várias tarefas do arcabouço ForkJoin operam em estrutura de dados indexáveis, com sub-tarefas operando somente em parte dessa estrutura de dados. Uma solução ingênua é de copiar parta da estrutura de dados e utiliza-la na computação sub-sequente. Em um arcabouço recursivo como o ForkJoin, dado uma representação baseada em arrays, cada chamada recursiva criará n novos arrays, onde n é a profundidade do fork. Como solução, esta tese apresenta uma refatoração que, ao invés de copiar parte da estrutura de dados, ela compartilha-a, possibilitando que sub-tarefas operem em partições contíguas da estrutura de dados. Essa refatoração foi avaliada em 15 projetos de código aberto, a qual foi capaz de economizar energia em todos os casos (média de 12% de economia). A versão refatorada foi enviada aos mantenedores do projeto original e, durante um período de 40 dias, 7 dos 9 mantenedores que responderam aos patches enviados já haviam aceitado-os e integrado-os. Discussões durante o processo de integração revelaram que desenvolvedores não estão cientes desta otimização. Esta tese então implementou essa refatoração como um plug-in da IDE Eclipse de forma que outros desenvolvedores possam (1) detectar usos de cópia em cenários o quais seriam beneficiais o uso do modelo de compartilhamento and (2) refatorar o código de forma automática.engUniversidade Federal de PernambucoPrograma de Pos Graduacao em Ciencia da ComputacaoUFPEBrasilAttribution-NonCommercial-NoDerivs 3.0 Brazilhttp://creativecommons.org/licenses/by-nc-nd/3.0/br/info:eu-repo/semantics/openAccessEficiência Energéticaprogramação concorrenterefatoraçãoEnergy efficiencyconcurrent programmingrefactoringA refactoring approach to improve energy consumption of parallel software systemsinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/doctoralThesisdoutoradoreponame:Repositório Institucional da UFPEinstname:Universidade Federal de Pernambuco (UFPE)instacron:UFPETHUMBNAILversao_biblioteca.pdf.jpgversao_biblioteca.pdf.jpgGenerated Thumbnailimage/jpeg1260https://repositorio.ufpe.br/bitstream/123456789/16346/5/versao_biblioteca.pdf.jpgae39af1ca1e47775dc762d3cebf996daMD55ORIGINALversao_biblioteca.pdfversao_biblioteca.pdfapplication/pdf3051240https://repositorio.ufpe.br/bitstream/123456789/16346/1/versao_biblioteca.pdfac1a91e08d64c78a372cb0e151bcb7c7MD51CC-LICENSElicense_rdflicense_rdfapplication/rdf+xml; charset=utf-81232https://repositorio.ufpe.br/bitstream/123456789/16346/2/license_rdf66e71c371cc565284e70f40736c94386MD52LICENSElicense.txtlicense.txttext/plain; charset=utf-82311https://repositorio.ufpe.br/bitstream/123456789/16346/3/license.txt4b8a02c7f2818eaf00dcf2260dd5eb08MD53TEXTversao_biblioteca.pdf.txtversao_biblioteca.pdf.txtExtracted texttext/plain369584https://repositorio.ufpe.br/bitstream/123456789/16346/4/versao_biblioteca.pdf.txtc6ddd0808ab68829aff28201ddfc0d88MD54123456789/163462019-10-25 17:42:14.502oai:repositorio.ufpe.br:123456789/16346TGljZW7Dp2EgZGUgRGlzdHJpYnVpw6fDo28gTsOjbyBFeGNsdXNpdmEKClRvZG8gZGVwb3NpdGFudGUgZGUgbWF0ZXJpYWwgbm8gUmVwb3NpdMOzcmlvIEluc3RpdHVjaW9uYWwgKFJJKSBkZXZlIGNvbmNlZGVyLCDDoCBVbml2ZXJzaWRhZGUgRmVkZXJhbCBkZSBQZXJuYW1idWNvIChVRlBFKSwgdW1hIExpY2Vuw6dhIGRlIERpc3RyaWJ1acOnw6NvIE7Do28gRXhjbHVzaXZhIHBhcmEgbWFudGVyIGUgdG9ybmFyIGFjZXNzw612ZWlzIG9zIHNldXMgZG9jdW1lbnRvcywgZW0gZm9ybWF0byBkaWdpdGFsLCBuZXN0ZSByZXBvc2l0w7NyaW8uCgpDb20gYSBjb25jZXNzw6NvIGRlc3RhIGxpY2Vuw6dhIG7Do28gZXhjbHVzaXZhLCBvIGRlcG9zaXRhbnRlIG1hbnTDqW0gdG9kb3Mgb3MgZGlyZWl0b3MgZGUgYXV0b3IuCl9fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXwoKTGljZW7Dp2EgZGUgRGlzdHJpYnVpw6fDo28gTsOjbyBFeGNsdXNpdmEKCkFvIGNvbmNvcmRhciBjb20gZXN0YSBsaWNlbsOnYSBlIGFjZWl0w6EtbGEsIHZvY8OqIChhdXRvciBvdSBkZXRlbnRvciBkb3MgZGlyZWl0b3MgYXV0b3JhaXMpOgoKYSkgRGVjbGFyYSBxdWUgY29uaGVjZSBhIHBvbMOtdGljYSBkZSBjb3B5cmlnaHQgZGEgZWRpdG9yYSBkbyBzZXUgZG9jdW1lbnRvOwpiKSBEZWNsYXJhIHF1ZSBjb25oZWNlIGUgYWNlaXRhIGFzIERpcmV0cml6ZXMgcGFyYSBvIFJlcG9zaXTDs3JpbyBJbnN0aXR1Y2lvbmFsIGRhIFVGUEU7CmMpIENvbmNlZGUgw6AgVUZQRSBvIGRpcmVpdG8gbsOjbyBleGNsdXNpdm8gZGUgYXJxdWl2YXIsIHJlcHJvZHV6aXIsIGNvbnZlcnRlciAoY29tbyBkZWZpbmlkbyBhIHNlZ3VpciksIGNvbXVuaWNhciBlL291IGRpc3RyaWJ1aXIsIG5vIFJJLCBvIGRvY3VtZW50byBlbnRyZWd1ZSAoaW5jbHVpbmRvIG8gcmVzdW1vL2Fic3RyYWN0KSBlbSBmb3JtYXRvIGRpZ2l0YWwgb3UgcG9yIG91dHJvIG1laW87CmQpIERlY2xhcmEgcXVlIGF1dG9yaXphIGEgVUZQRSBhIGFycXVpdmFyIG1haXMgZGUgdW1hIGPDs3BpYSBkZXN0ZSBkb2N1bWVudG8gZSBjb252ZXJ0w6otbG8sIHNlbSBhbHRlcmFyIG8gc2V1IGNvbnRlw7pkbywgcGFyYSBxdWFscXVlciBmb3JtYXRvIGRlIGZpY2hlaXJvLCBtZWlvIG91IHN1cG9ydGUsIHBhcmEgZWZlaXRvcyBkZSBzZWd1cmFuw6dhLCBwcmVzZXJ2YcOnw6NvIChiYWNrdXApIGUgYWNlc3NvOwplKSBEZWNsYXJhIHF1ZSBvIGRvY3VtZW50byBzdWJtZXRpZG8gw6kgbyBzZXUgdHJhYmFsaG8gb3JpZ2luYWwgZSBxdWUgZGV0w6ltIG8gZGlyZWl0byBkZSBjb25jZWRlciBhIHRlcmNlaXJvcyBvcyBkaXJlaXRvcyBjb250aWRvcyBuZXN0YSBsaWNlbsOnYS4gRGVjbGFyYSB0YW1iw6ltIHF1ZSBhIGVudHJlZ2EgZG8gZG9jdW1lbnRvIG7Do28gaW5mcmluZ2Ugb3MgZGlyZWl0b3MgZGUgb3V0cmEgcGVzc29hIG91IGVudGlkYWRlOwpmKSBEZWNsYXJhIHF1ZSwgbm8gY2FzbyBkbyBkb2N1bWVudG8gc3VibWV0aWRvIGNvbnRlciBtYXRlcmlhbCBkbyBxdWFsIG7Do28gZGV0w6ltIG9zIGRpcmVpdG9zIGRlCmF1dG9yLCBvYnRldmUgYSBhdXRvcml6YcOnw6NvIGlycmVzdHJpdGEgZG8gcmVzcGVjdGl2byBkZXRlbnRvciBkZXNzZXMgZGlyZWl0b3MgcGFyYSBjZWRlciDDoApVRlBFIG9zIGRpcmVpdG9zIHJlcXVlcmlkb3MgcG9yIGVzdGEgTGljZW7Dp2EgZSBhdXRvcml6YXIgYSB1bml2ZXJzaWRhZGUgYSB1dGlsaXrDoS1sb3MgbGVnYWxtZW50ZS4gRGVjbGFyYSB0YW1iw6ltIHF1ZSBlc3NlIG1hdGVyaWFsIGN1am9zIGRpcmVpdG9zIHPDo28gZGUgdGVyY2Vpcm9zIGVzdMOhIGNsYXJhbWVudGUgaWRlbnRpZmljYWRvIGUgcmVjb25oZWNpZG8gbm8gdGV4dG8gb3UgY29udGXDumRvIGRvIGRvY3VtZW50byBlbnRyZWd1ZTsKZykgU2UgbyBkb2N1bWVudG8gZW50cmVndWUgw6kgYmFzZWFkbyBlbSB0cmFiYWxobyBmaW5hbmNpYWRvIG91IGFwb2lhZG8gcG9yIG91dHJhIGluc3RpdHVpw6fDo28gcXVlIG7Do28gYSBVRlBFLMKgZGVjbGFyYSBxdWUgY3VtcHJpdSBxdWFpc3F1ZXIgb2JyaWdhw6fDtWVzIGV4aWdpZGFzIHBlbG8gcmVzcGVjdGl2byBjb250cmF0byBvdSBhY29yZG8uCgpBIFVGUEUgaWRlbnRpZmljYXLDoSBjbGFyYW1lbnRlIG8ocykgbm9tZShzKSBkbyhzKSBhdXRvciAoZXMpIGRvcyBkaXJlaXRvcyBkbyBkb2N1bWVudG8gZW50cmVndWUgZSBuw6NvIGZhcsOhIHF1YWxxdWVyIGFsdGVyYcOnw6NvLCBwYXJhIGFsw6ltIGRvIHByZXZpc3RvIG5hIGFsw61uZWEgYykuCg==Repositório InstitucionalPUBhttps://repositorio.ufpe.br/oai/requestattena@ufpe.bropendoar:22212019-10-25T20:42:14Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)false
dc.title.pt_BR.fl_str_mv A refactoring approach to improve energy consumption of parallel software systems
title A refactoring approach to improve energy consumption of parallel software systems
spellingShingle A refactoring approach to improve energy consumption of parallel software systems
PINTO, Gustavo Henrique Lima
Eficiência Energética
programação concorrente
refatoração
Energy efficiency
concurrent programming
refactoring
title_short A refactoring approach to improve energy consumption of parallel software systems
title_full A refactoring approach to improve energy consumption of parallel software systems
title_fullStr A refactoring approach to improve energy consumption of parallel software systems
title_full_unstemmed A refactoring approach to improve energy consumption of parallel software systems
title_sort A refactoring approach to improve energy consumption of parallel software systems
author PINTO, Gustavo Henrique Lima
author_facet PINTO, Gustavo Henrique Lima
author_role author
dc.contributor.authorLattes.pt_BR.fl_str_mv http://lattes.cnpq.br/1631238943341152
dc.contributor.advisorLattes.pt_BR.fl_str_mv http://lattes.cnpq.br/7310046838140771
dc.contributor.author.fl_str_mv PINTO, Gustavo Henrique Lima
dc.contributor.advisor1.fl_str_mv LIMA FILHO, Fernando José Castor de
contributor_str_mv LIMA FILHO, Fernando José Castor de
dc.subject.por.fl_str_mv Eficiência Energética
programação concorrente
refatoração
Energy efficiency
concurrent programming
refactoring
topic Eficiência Energética
programação concorrente
refatoração
Energy efficiency
concurrent programming
refactoring
description Empowering application programmers to make energy-aware decisions is a critical dimension in improving energy efficiency of computer systems. Despite the growing interest in designing software development processes, frameworks, and programming models to facilitate application-level energy management, little is known on how to design application-level energy-efficient solutions for concurrent software running on parallel architectures. This is unfortunate for at least two reasons: (1) thanks to the proliferation of multicore CPUs, concurrent programming is a standard practice in modern software engineering; (2) a CPU with more cores (say 32) often consumes more power than one with fewer cores (say 1 or 2). However, application developers still do not understand how their code modifications impact energy consumption in a parallel system. Analyzing STACKOVERFLOW showed evidence that this is a real problem; Even though the interest in energy consumption issues is increasing over the years, developers still hold misconceptions and assumptions that are not always true. This lack of knowledge is primarily due to a lack of appropriate tools to measure/identify/refactor energy consumption hotspots. This thesis begins to bridge the chasm of the first problem — the lack of knowledge — by presenting an extensive experimental space exploration over two concurrent programming building blocks: (1) thread-safe collections and (2) thread management constructs. Through a list of findings that are not always obvious, we illuminate the relationship between the choices and settings of design decisions and energy consumption of parallel systems. This thesis then starts to bridge the gap of the second problem — the lack of tools. Lessons learned in our previous studies showed that ForkJoin tasks often operate on an indexable data structure, with subtasks operating only on part of this data structure. One naïve solution is to copy part of the data structure and use it in the next computation. In a recursive framework such as ForkJoin, given an array-based representation, each recursive call will create n new arrays, where n is the width of forking. To address this, we derive a refactoring that, instead of copy part of the data structure, it shares it, allowing subtasks to operate on contiguous partitions of the data structure. We manually applied this refactoring into 15 open source projects. Our refactoring succeed in saving energy for each one of them (12% average saving). We sent the refactored versions to the project owner and, during a timeframe of 40 days, 7 out of 9 projects that replied to our patches have already accepted and merged them. Discussions during the merge process revealed that developers were not aware of this optimization. We then implemented this refactoring as an Eclipse plug-in so that other developers can (1) detect uses of copy where it would be beneficial to use sharing and (2) refactor the code in an automated way.
publishDate 2015
dc.date.issued.fl_str_mv 2015-02-24
dc.date.accessioned.fl_str_mv 2016-04-06T13:27:55Z
dc.date.available.fl_str_mv 2016-04-06T13:27:55Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://repositorio.ufpe.br/handle/123456789/16346
url https://repositorio.ufpe.br/handle/123456789/16346
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de Pernambuco
dc.publisher.program.fl_str_mv Programa de Pos Graduacao em Ciencia da Computacao
dc.publisher.initials.fl_str_mv UFPE
dc.publisher.country.fl_str_mv Brasil
publisher.none.fl_str_mv Universidade Federal de Pernambuco
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFPE
instname:Universidade Federal de Pernambuco (UFPE)
instacron:UFPE
instname_str Universidade Federal de Pernambuco (UFPE)
instacron_str UFPE
institution UFPE
reponame_str Repositório Institucional da UFPE
collection Repositório Institucional da UFPE
bitstream.url.fl_str_mv https://repositorio.ufpe.br/bitstream/123456789/16346/5/versao_biblioteca.pdf.jpg
https://repositorio.ufpe.br/bitstream/123456789/16346/1/versao_biblioteca.pdf
https://repositorio.ufpe.br/bitstream/123456789/16346/2/license_rdf
https://repositorio.ufpe.br/bitstream/123456789/16346/3/license.txt
https://repositorio.ufpe.br/bitstream/123456789/16346/4/versao_biblioteca.pdf.txt
bitstream.checksum.fl_str_mv ae39af1ca1e47775dc762d3cebf996da
ac1a91e08d64c78a372cb0e151bcb7c7
66e71c371cc565284e70f40736c94386
4b8a02c7f2818eaf00dcf2260dd5eb08
c6ddd0808ab68829aff28201ddfc0d88
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv attena@ufpe.br
_version_ 1797780404481032192