Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs

Detalhes bibliográficos
Autor(a) principal: Mandelli, Marcelo Grandi
Data de Publicação: 2015
Tipo de documento: Tese
Idioma: eng
Título da fonte: Biblioteca Digital de Teses e Dissertações da PUC_RS
Texto Completo: http://tede2.pucrs.br/tede2/handle/tede/6317
Resumo: MPSoCs with hundreds of cores are already available in the market. According to the ITRS roadmap, such systems will integrate thousands of cores by the end of the decade. The definition of where each task will execute in the system is a major issue in the MPSoC design. In the literature, this issue is defined as task mapping. The growth in the number of cores increases the complexity of the task mapping. The main concerns in task mapping in large systems include: (i) scalability; (ii) dynamic workload; and (iii) reliability. It is necessary to distribute the mapping decision across the system to ensure scalability. The workload of emerging large MPSoCs may be dynamic, i.e., new applications may start at any moment, leading to different mapping scenarios. Therefore, it is necessary to execute the mapping process at runtime to support a dynamic workload. Reliability is tightly connected to the system workload distribution. Load imbalance may generate hotspots zones and consequently thermal implications, which may result in unreliable system operation. In large scale MPSoCs, reliability issues get worse since the growing number of cores on the same die increases power densities and, consequently, the system temperature. The literature presents different task mapping techniques to improve system reliability. However, such approaches use a centralized mapping approach, which are not scalable. To address these three challenges, the main goal of this Thesis is to propose and evaluate distributed mapping heuristics, executed at runtime, ensuring scalability and a fair workload distribution. Distributing the workload and the traffic inside the NoC increases the system reliability in long-term, due to the minimization of hotspot regions. To enable the design space exploration of large MPSoCs the first contribution of the Thesis lies in a multi-level modeling framework, which supports different models and debugging capabilities that enrich and facilitate the design of MPSoCs. The simulation of lower level models (e.g. RTL) generates performance parameters used to calibrate abstract models (e.g. untimed models). The abstract models pave the way to explore mapping heuristics in large systems. Most mapping techniques focus on optimizing communication volume in the NoC, which may compromise reliability due to overload processors. On the other hand, a heuristic optimizing only the workload distribution may overload NoC links, compromising its reliability. The second significant contribution of the Thesis is the proposition of dynamic and distributed mapping heuristics, making a tradeoff between communication volume (NoC links) and workload distribution (CPU usage). Results related to execution time, communication volume, energy consumption, power traces and temperature distribution in large MPSoCs (144 processors) confirm the tradeoff hypothesis. Trading off workload and communication volume improves system reliably through the reduction of hotspots regions, without compromising system performance.
id P_RS_1fb1fc53b1276c685800cf5154705f90
oai_identifier_str oai:tede2.pucrs.br:tede/6317
network_acronym_str P_RS
network_name_str Biblioteca Digital de Teses e Dissertações da PUC_RS
repository_id_str
spelling Moraes, Fernando Gehm477.763.820-00http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4782943Z2Ost, Luciano Copellohttp://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4779836J4007.216.910-99http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4421755E8Mandelli, Marcelo Grandi2015-09-18T20:30:53Z2015-07-13http://tede2.pucrs.br/tede2/handle/tede/6317MPSoCs with hundreds of cores are already available in the market. According to the ITRS roadmap, such systems will integrate thousands of cores by the end of the decade. The definition of where each task will execute in the system is a major issue in the MPSoC design. In the literature, this issue is defined as task mapping. The growth in the number of cores increases the complexity of the task mapping. The main concerns in task mapping in large systems include: (i) scalability; (ii) dynamic workload; and (iii) reliability. It is necessary to distribute the mapping decision across the system to ensure scalability. The workload of emerging large MPSoCs may be dynamic, i.e., new applications may start at any moment, leading to different mapping scenarios. Therefore, it is necessary to execute the mapping process at runtime to support a dynamic workload. Reliability is tightly connected to the system workload distribution. Load imbalance may generate hotspots zones and consequently thermal implications, which may result in unreliable system operation. In large scale MPSoCs, reliability issues get worse since the growing number of cores on the same die increases power densities and, consequently, the system temperature. The literature presents different task mapping techniques to improve system reliability. However, such approaches use a centralized mapping approach, which are not scalable. To address these three challenges, the main goal of this Thesis is to propose and evaluate distributed mapping heuristics, executed at runtime, ensuring scalability and a fair workload distribution. Distributing the workload and the traffic inside the NoC increases the system reliability in long-term, due to the minimization of hotspot regions. To enable the design space exploration of large MPSoCs the first contribution of the Thesis lies in a multi-level modeling framework, which supports different models and debugging capabilities that enrich and facilitate the design of MPSoCs. The simulation of lower level models (e.g. RTL) generates performance parameters used to calibrate abstract models (e.g. untimed models). The abstract models pave the way to explore mapping heuristics in large systems. Most mapping techniques focus on optimizing communication volume in the NoC, which may compromise reliability due to overload processors. On the other hand, a heuristic optimizing only the workload distribution may overload NoC links, compromising its reliability. The second significant contribution of the Thesis is the proposition of dynamic and distributed mapping heuristics, making a tradeoff between communication volume (NoC links) and workload distribution (CPU usage). Results related to execution time, communication volume, energy consumption, power traces and temperature distribution in large MPSoCs (144 processors) confirm the tradeoff hypothesis. Trading off workload and communication volume improves system reliably through the reduction of hotspots regions, without compromising system performance.MPSoCs com centenas de processadores já estão disponíveis no mercado. De acordo com o ITRS, tais sistemas integrarão milhares de processadores até o final da década. A definição de onde cada tarefa será executada no sistema é um desafio importante na concepção de MPSoCs. Na literatura, tal desafio é definido como mapeamento de tarefas. O aumento do número de processadores aumenta a complexidade do mapeamento de tarefas. As principais preocupações em mapeamento de tarefas em grandes sistemas incluem: (i) escalabilidade; (ii) carga dinâmica de trabalho; e (iii) confiabilidade. É necessário distribuir a decisão do mapeamento pelo sistema para garantir escalabilidade. A carga de trabalho em MPSoCs pode ser dinâmica, ou seja, novas aplicações podem iniciar a execução a qualquer momento, levando a diferentes cenários de mapeamento. Portanto, é necessário executar o processo de mapeamento em tempo de execução para suportar uma carga de trabalho dinâmica. Confiabilidade é diretamente relacionada à distribuição da carga de trabalho no sistema. Desequilíbrio de carga pode gerar zonas de hotspots e implicações termais, que podem resultar em uma operação do sistema não confiável. Em MPSoCs de grande dimensão problemas de confiabilidade se agravam, uma vez que o crescente número de processadores no mesmo chip aumenta o consumo de energia e, consequentemente, a temperatura do sistema. A literatura apresenta diferentes técnicas de mapeamento de tarefas para melhorar a confiabilidade do sistema. No entanto, tais técnicas utilizam uma abordagem de mapeamento centralizado, a qual não é escalável. Em função destes três desafios, o principal objetivo desta Tese é propor e avaliar heurísticas de mapeamento distribuído, executadas em tempo de execução, garantindo escalabilidade e uma distribuição de carga de trabalho uniforme. Distribuir a carga de trabalho e o tráfego da NoC aumenta a confiabilidade do sistema no longo prazo, devido à minimização das regiões de hotspot. Para permitir a exploração do espaço de projeto em MPSoCs, a primeira contribuição desta Tese consiste em um ambiente de modelagem multi-nível, que suporta diferentes modelos e capacidades de depuração que enriquecem e facilitam o projeto de MPSoCs. A simulação de modelos de mais baixo nível (por exemplo, RTL) gera parâmetros de desempenho utilizados para calibrar modelos mais abstratos. Os modelos abstratos facilitam a exploração de heurísticas de mapeamento em grandes sistemas. A maioria das técnicas de mapeamento se concentram na otimização do volume comunicação na NoC, o que pode comprometer a confiabilidade, devido à sobrecarga de processadores. Por outro lado, uma heurística que visa a otimização apenas da distribuição de carga de trabalho pode sobrecarregar canais da NoC, comprometendo a sua confiabilidade. A segunda contribuição significativa desta Tese é a proposição de heurísticas de mapeamento dinâmico e distribuídos, fazendo um compromisso entre o volume de comunicação (canais da NoC) e distribuição de carga de trabalho (uso da CPU). Os resultados relacionados a tempo de execução, volume de comunicação, consumo de energia, distribuição de potência e temperatura em grandes MPSoCs (256 processadores) confirmam a hipótese deste compromisso. Fazer um compromisso entre carga de trabalho e volume de comunicação melhora a confiabilidade do sistema através da redução de regiões hotspots, sem comprometer o desempenho do sistema.Submitted by Setor de Tratamento da Informação - BC/PUCRS (tede2@pucrs.br) on 2015-09-18T20:30:53Z No. of bitstreams: 1 475052 - Texto Completo.pdf: 8325686 bytes, checksum: 5d74943dc9ee311c90eb182fb022e539 (MD5)Made available in DSpace on 2015-09-18T20:30:53Z (GMT). No. of bitstreams: 1 475052 - Texto Completo.pdf: 8325686 bytes, checksum: 5d74943dc9ee311c90eb182fb022e539 (MD5) Previous issue date: 2015-07-13application/pdfhttp://tede2.pucrs.br:80/tede2/retrieve/163350/475052%20-%20Texto%20Completo.pdf.jpgengPontifícia Universidade Católica do Rio Grande do SulPrograma de Pós-Graduação em Ciência da ComputaçãoPUCRSBrasilFaculdade de InformáticaINFORMÁTICAMULTIPROCESSADORESARQUITETURA DE COMPUTADORCIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAOExploration of runtime distributed mapping techniques for emerging large scale MPSoCsinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/doctoralThesis1974996533081274470600600600-30085425104011491443671711205811204509info:eu-repo/semantics/openAccessreponame:Biblioteca Digital de Teses e Dissertações da PUC_RSinstname:Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)instacron:PUC_RSTHUMBNAIL475052 - Texto Completo.pdf.jpg475052 - Texto Completo.pdf.jpgimage/jpeg4202http://tede2.pucrs.br/tede2/bitstream/tede/6317/4/475052+-+Texto+Completo.pdf.jpgc6a77f4c86b2dd0ae8db55fef619d331MD54TEXT475052 - Texto Completo.pdf.txt475052 - Texto Completo.pdf.txttext/plain293250http://tede2.pucrs.br/tede2/bitstream/tede/6317/3/475052+-+Texto+Completo.pdf.txt6a84ef809ebda22fb9d3ae4bf438f3d6MD53ORIGINAL475052 - Texto Completo.pdf475052 - Texto Completo.pdfapplication/pdf8325686http://tede2.pucrs.br/tede2/bitstream/tede/6317/2/475052+-+Texto+Completo.pdf5d74943dc9ee311c90eb182fb022e539MD52LICENSElicense.txtlicense.txttext/plain; charset=utf-8610http://tede2.pucrs.br/tede2/bitstream/tede/6317/1/license.txt5a9d6006225b368ef605ba16b4f6d1beMD51tede/63172015-09-29 08:34:33.846oai:tede2.pucrs.br:tede/6317QXV0b3JpemHDp8OjbyBwYXJhIFB1YmxpY2HDp8OjbyBFbGV0csO0bmljYTogQ29tIGJhc2Ugbm8gZGlzcG9zdG8gbmEgTGVpIEZlZGVyYWwgbsK6OS42MTAsIGRlIDE5IGRlIGZldmVyZWlybyBkZSAxOTk4LCBvIGF1dG9yIEFVVE9SSVpBIGEgcHVibGljYcOnw6NvIGVsZXRyw7RuaWNhIGRhIHByZXNlbnRlIG9icmEgbm8gYWNlcnZvIGRhIEJpYmxpb3RlY2EgRGlnaXRhbCBkYSBQb250aWbDrWNpYSBVbml2ZXJzaWRhZGUgQ2F0w7NsaWNhIGRvIFJpbyBHcmFuZGUgZG8gU3VsLCBzZWRpYWRhIGEgQXYuIElwaXJhbmdhIDY2ODEsIFBvcnRvIEFsZWdyZSwgUmlvIEdyYW5kZSBkbyBTdWwsIGNvbSByZWdpc3RybyBkZSBDTlBKIDg4NjMwNDEzMDAwMi04MSBiZW0gY29tbyBlbSBvdXRyYXMgYmlibGlvdGVjYXMgZGlnaXRhaXMsIG5hY2lvbmFpcyBlIGludGVybmFjaW9uYWlzLCBjb25zw7NyY2lvcyBlIHJlZGVzIMOgcyBxdWFpcyBhIGJpYmxpb3RlY2EgZGEgUFVDUlMgcG9zc2EgYSB2aXIgcGFydGljaXBhciwgc2VtIMO0bnVzIGFsdXNpdm8gYW9zIGRpcmVpdG9zIGF1dG9yYWlzLCBhIHTDrXR1bG8gZGUgZGl2dWxnYcOnw6NvIGRhIHByb2R1w6fDo28gY2llbnTDrWZpY2EuCg==Biblioteca Digital de Teses e Dissertaçõeshttp://tede2.pucrs.br/tede2/PRIhttps://tede2.pucrs.br/oai/requestbiblioteca.central@pucrs.br||opendoar:2015-09-29T11:34:33Biblioteca Digital de Teses e Dissertações da PUC_RS - Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)false
dc.title.por.fl_str_mv Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
title Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
spellingShingle Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
Mandelli, Marcelo Grandi
INFORMÁTICA
MULTIPROCESSADORES
ARQUITETURA DE COMPUTADOR
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
title_short Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
title_full Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
title_fullStr Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
title_full_unstemmed Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
title_sort Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
author Mandelli, Marcelo Grandi
author_facet Mandelli, Marcelo Grandi
author_role author
dc.contributor.advisor1.fl_str_mv Moraes, Fernando Gehm
dc.contributor.advisor1ID.fl_str_mv 477.763.820-00
dc.contributor.advisor1Lattes.fl_str_mv http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4782943Z2
dc.contributor.advisor-co1.fl_str_mv Ost, Luciano Copello
dc.contributor.advisor-co1ID.fl_str_mv http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4779836J4
dc.contributor.authorID.fl_str_mv 007.216.910-99
dc.contributor.authorLattes.fl_str_mv http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4421755E8
dc.contributor.author.fl_str_mv Mandelli, Marcelo Grandi
contributor_str_mv Moraes, Fernando Gehm
Ost, Luciano Copello
dc.subject.por.fl_str_mv INFORMÁTICA
MULTIPROCESSADORES
ARQUITETURA DE COMPUTADOR
topic INFORMÁTICA
MULTIPROCESSADORES
ARQUITETURA DE COMPUTADOR
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.cnpq.fl_str_mv CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
description MPSoCs with hundreds of cores are already available in the market. According to the ITRS roadmap, such systems will integrate thousands of cores by the end of the decade. The definition of where each task will execute in the system is a major issue in the MPSoC design. In the literature, this issue is defined as task mapping. The growth in the number of cores increases the complexity of the task mapping. The main concerns in task mapping in large systems include: (i) scalability; (ii) dynamic workload; and (iii) reliability. It is necessary to distribute the mapping decision across the system to ensure scalability. The workload of emerging large MPSoCs may be dynamic, i.e., new applications may start at any moment, leading to different mapping scenarios. Therefore, it is necessary to execute the mapping process at runtime to support a dynamic workload. Reliability is tightly connected to the system workload distribution. Load imbalance may generate hotspots zones and consequently thermal implications, which may result in unreliable system operation. In large scale MPSoCs, reliability issues get worse since the growing number of cores on the same die increases power densities and, consequently, the system temperature. The literature presents different task mapping techniques to improve system reliability. However, such approaches use a centralized mapping approach, which are not scalable. To address these three challenges, the main goal of this Thesis is to propose and evaluate distributed mapping heuristics, executed at runtime, ensuring scalability and a fair workload distribution. Distributing the workload and the traffic inside the NoC increases the system reliability in long-term, due to the minimization of hotspot regions. To enable the design space exploration of large MPSoCs the first contribution of the Thesis lies in a multi-level modeling framework, which supports different models and debugging capabilities that enrich and facilitate the design of MPSoCs. The simulation of lower level models (e.g. RTL) generates performance parameters used to calibrate abstract models (e.g. untimed models). The abstract models pave the way to explore mapping heuristics in large systems. Most mapping techniques focus on optimizing communication volume in the NoC, which may compromise reliability due to overload processors. On the other hand, a heuristic optimizing only the workload distribution may overload NoC links, compromising its reliability. The second significant contribution of the Thesis is the proposition of dynamic and distributed mapping heuristics, making a tradeoff between communication volume (NoC links) and workload distribution (CPU usage). Results related to execution time, communication volume, energy consumption, power traces and temperature distribution in large MPSoCs (144 processors) confirm the tradeoff hypothesis. Trading off workload and communication volume improves system reliably through the reduction of hotspots regions, without compromising system performance.
publishDate 2015
dc.date.accessioned.fl_str_mv 2015-09-18T20:30:53Z
dc.date.issued.fl_str_mv 2015-07-13
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://tede2.pucrs.br/tede2/handle/tede/6317
url http://tede2.pucrs.br/tede2/handle/tede/6317
dc.language.iso.fl_str_mv eng
language eng
dc.relation.program.fl_str_mv 1974996533081274470
dc.relation.confidence.fl_str_mv 600
600
600
dc.relation.department.fl_str_mv -3008542510401149144
dc.relation.cnpq.fl_str_mv 3671711205811204509
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Pontifícia Universidade Católica do Rio Grande do Sul
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv PUCRS
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv Faculdade de Informática
publisher.none.fl_str_mv Pontifícia Universidade Católica do Rio Grande do Sul
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da PUC_RS
instname:Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
instacron:PUC_RS
instname_str Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
instacron_str PUC_RS
institution PUC_RS
reponame_str Biblioteca Digital de Teses e Dissertações da PUC_RS
collection Biblioteca Digital de Teses e Dissertações da PUC_RS
bitstream.url.fl_str_mv http://tede2.pucrs.br/tede2/bitstream/tede/6317/4/475052+-+Texto+Completo.pdf.jpg
http://tede2.pucrs.br/tede2/bitstream/tede/6317/3/475052+-+Texto+Completo.pdf.txt
http://tede2.pucrs.br/tede2/bitstream/tede/6317/2/475052+-+Texto+Completo.pdf
http://tede2.pucrs.br/tede2/bitstream/tede/6317/1/license.txt
bitstream.checksum.fl_str_mv c6a77f4c86b2dd0ae8db55fef619d331
6a84ef809ebda22fb9d3ae4bf438f3d6
5d74943dc9ee311c90eb182fb022e539
5a9d6006225b368ef605ba16b4f6d1be
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da PUC_RS - Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
repository.mail.fl_str_mv biblioteca.central@pucrs.br||
_version_ 1799765315192619008