Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
Main author: | Mandelli, Marcelo Grandi |
---|---|
Publication date: | 2015 |
Document type: | Thesis |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da PUC_RS |
Full text: | http://tede2.pucrs.br/tede2/handle/tede/6317 |
Abstract: | MPSoCs with hundreds of cores are already available in the market. According to the ITRS roadmap, such systems will integrate thousands of cores by the end of the decade. Deciding where each task will execute in the system is a major issue in MPSoC design. In the literature, this issue is known as task mapping. The growth in the number of cores increases the complexity of task mapping. The main concerns in task mapping for large systems include: (i) scalability; (ii) dynamic workload; and (iii) reliability. It is necessary to distribute the mapping decision across the system to ensure scalability. The workload of emerging large MPSoCs may be dynamic, i.e., new applications may start at any moment, leading to different mapping scenarios. Therefore, it is necessary to execute the mapping process at runtime to support a dynamic workload. Reliability is tightly connected to the system workload distribution. Load imbalance may generate hotspot zones and, consequently, thermal implications, which may result in unreliable system operation. In large-scale MPSoCs, reliability issues get worse since the growing number of cores on the same die increases power density and, consequently, the system temperature. The literature presents different task mapping techniques to improve system reliability. However, such techniques use a centralized mapping approach, which is not scalable. To address these three challenges, the main goal of this Thesis is to propose and evaluate distributed mapping heuristics, executed at runtime, that ensure scalability and a fair workload distribution. Distributing the workload and the traffic inside the NoC increases system reliability in the long term by minimizing hotspot regions. To enable the design space exploration of large MPSoCs, the first contribution of the Thesis is a multi-level modeling framework, which supports different models and debugging capabilities that enrich and facilitate the design of MPSoCs. The simulation of lower-level models (e.g., RTL) generates performance parameters used to calibrate abstract models (e.g., untimed models). The abstract models pave the way to explore mapping heuristics in large systems. Most mapping techniques focus on optimizing communication volume in the NoC, which may compromise reliability due to overloaded processors. On the other hand, a heuristic optimizing only the workload distribution may overload NoC links, compromising their reliability. The second significant contribution of the Thesis is the proposal of dynamic and distributed mapping heuristics that make a tradeoff between communication volume (NoC links) and workload distribution (CPU usage). Results related to execution time, communication volume, energy consumption, power traces and temperature distribution in large MPSoCs (144 processors) confirm the tradeoff hypothesis. Trading off workload and communication volume improves system reliability through the reduction of hotspot regions, without compromising system performance. |
id |
P_RS_1fb1fc53b1276c685800cf5154705f90 |
---|---|
oai_identifier_str |
oai:tede2.pucrs.br:tede/6317 |
network_acronym_str |
P_RS |
network_name_str |
Biblioteca Digital de Teses e Dissertações da PUC_RS |
repository_id_str |
|
dc.title.por.fl_str_mv |
Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs |
title |
Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs |
spellingShingle |
Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs Mandelli, Marcelo Grandi INFORMÁTICA MULTIPROCESSADORES ARQUITETURA DE COMPUTADOR CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
title_short |
Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs |
title_full |
Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs |
title_fullStr |
Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs |
title_full_unstemmed |
Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs |
title_sort |
Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs |
author |
Mandelli, Marcelo Grandi |
author_facet |
Mandelli, Marcelo Grandi |
author_role |
author |
dc.contributor.advisor1.fl_str_mv |
Moraes, Fernando Gehm |
dc.contributor.advisor1ID.fl_str_mv |
477.763.820-00 |
dc.contributor.advisor1Lattes.fl_str_mv |
http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4782943Z2 |
dc.contributor.advisor-co1.fl_str_mv |
Ost, Luciano Copello |
dc.contributor.advisor-co1ID.fl_str_mv |
http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4779836J4 |
dc.contributor.authorID.fl_str_mv |
007.216.910-99 |
dc.contributor.authorLattes.fl_str_mv |
http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4421755E8 |
dc.contributor.author.fl_str_mv |
Mandelli, Marcelo Grandi |
contributor_str_mv |
Moraes, Fernando Gehm Ost, Luciano Copello |
dc.subject.por.fl_str_mv |
INFORMÁTICA (Informatics); MULTIPROCESSADORES (Multiprocessors); ARQUITETURA DE COMPUTADOR (Computer Architecture) |
topic |
INFORMÁTICA MULTIPROCESSADORES ARQUITETURA DE COMPUTADOR CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
dc.subject.cnpq.fl_str_mv |
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
publishDate |
2015 |
dc.date.accessioned.fl_str_mv |
2015-09-18T20:30:53Z |
dc.date.issued.fl_str_mv |
2015-07-13 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://tede2.pucrs.br/tede2/handle/tede/6317 |
url |
http://tede2.pucrs.br/tede2/handle/tede/6317 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.program.fl_str_mv |
1974996533081274470 |
dc.relation.confidence.fl_str_mv |
600 600 600 |
dc.relation.department.fl_str_mv |
-3008542510401149144 |
dc.relation.cnpq.fl_str_mv |
3671711205811204509 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Pontifícia Universidade Católica do Rio Grande do Sul |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Ciência da Computação |
dc.publisher.initials.fl_str_mv |
PUCRS |
dc.publisher.country.fl_str_mv |
Brazil |
dc.publisher.department.fl_str_mv |
Faculdade de Informática |
publisher.none.fl_str_mv |
Pontifícia Universidade Católica do Rio Grande do Sul |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da PUC_RS instname:Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS) instacron:PUC_RS |
instname_str |
Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS) |
instacron_str |
PUC_RS |
institution |
PUC_RS |
reponame_str |
Biblioteca Digital de Teses e Dissertações da PUC_RS |
collection |
Biblioteca Digital de Teses e Dissertações da PUC_RS |
bitstream.url.fl_str_mv |
http://tede2.pucrs.br/tede2/bitstream/tede/6317/4/475052+-+Texto+Completo.pdf.jpg http://tede2.pucrs.br/tede2/bitstream/tede/6317/3/475052+-+Texto+Completo.pdf.txt http://tede2.pucrs.br/tede2/bitstream/tede/6317/2/475052+-+Texto+Completo.pdf http://tede2.pucrs.br/tede2/bitstream/tede/6317/1/license.txt |
bitstream.checksum.fl_str_mv |
c6a77f4c86b2dd0ae8db55fef619d331 6a84ef809ebda22fb9d3ae4bf438f3d6 5d74943dc9ee311c90eb182fb022e539 5a9d6006225b368ef605ba16b4f6d1be |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da PUC_RS - Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS) |
repository.mail.fl_str_mv |
biblioteca.central@pucrs.br |
_version_ |
1799765315192619008 |