Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs

Mandelli, Marcelo Grandi

Bibliographic details
Main author:	Mandelli, Marcelo Grandi
Publication date:	2015
Document type:	Thesis
Language:	eng
Source title:	Biblioteca Digital de Teses e Dissertações da PUC_RS
Full text:	http://tede2.pucrs.br/tede2/handle/tede/6317
Abstract:	MPSoCs with hundreds of cores are already available on the market. According to the ITRS roadmap, such systems will integrate thousands of cores by the end of the decade. Deciding where each task executes in the system is a major issue in MPSoC design; the literature calls this problem task mapping. As the number of cores grows, so does the complexity of task mapping. The main concerns for task mapping in large systems are (i) scalability, (ii) dynamic workload, and (iii) reliability. To ensure scalability, the mapping decision must be distributed across the system. The workload of emerging large MPSoCs may be dynamic, i.e., new applications may start at any moment, leading to different mapping scenarios; supporting a dynamic workload therefore requires executing the mapping process at runtime. Reliability is tightly connected to how the workload is distributed in the system. Load imbalance may create hotspot zones and, consequently, thermal problems that can make system operation unreliable. In large-scale MPSoCs, reliability issues get worse because the growing number of cores on the same die increases power density and, consequently, system temperature. The literature presents different task mapping techniques to improve system reliability. However, these techniques rely on centralized mapping, which is not scalable. To address these three challenges, the main goal of this Thesis is to propose and evaluate distributed mapping heuristics, executed at runtime, that ensure scalability and a fair workload distribution. Distributing the workload and the traffic inside the NoC increases system reliability in the long term by minimizing hotspot regions.
To enable design space exploration of large MPSoCs, the first contribution of the Thesis is a multi-level modeling framework that supports different models and debugging capabilities, which enrich and facilitate MPSoC design. Simulating lower-level models (e.g., RTL) generates performance parameters used to calibrate abstract models (e.g., untimed models). The abstract models make it feasible to explore mapping heuristics in large systems. Most mapping techniques focus on optimizing communication volume in the NoC, which may compromise reliability by overloading processors. Conversely, a heuristic that optimizes only the workload distribution may overload NoC links, compromising reliability. The second significant contribution of the Thesis is a set of dynamic, distributed mapping heuristics that trade off communication volume (NoC links) against workload distribution (CPU usage). Results for execution time, communication volume, energy consumption, power traces, and temperature distribution in large MPSoCs (144 processors) confirm the tradeoff hypothesis. Trading off workload and communication volume improves system reliability by reducing hotspot regions, without compromising system performance.
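The tradeoff described in the abstract can be illustrated with a small sketch. This is not the heuristic proposed in the Thesis, which this record does not specify. It is a minimal example assuming a 2D-mesh NoC with XY routing, where a candidate processing element is scored by a weighted sum of normalized communication cost (hops times volume) and CPU load. The names `PE`, `hops`, `mapping_cost`, `choose_pe`, and the weight `alpha` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PE:
    """Hypothetical processing element: mesh coordinates and CPU load in [0, 1]."""
    x: int
    y: int
    load: float


def hops(a: PE, b: PE) -> int:
    """Manhattan distance, i.e. the hop count under XY routing on a 2D mesh."""
    return abs(a.x - b.x) + abs(a.y - b.y)


def mapping_cost(candidate, peers, alpha, max_hops):
    """Weighted cost of placing a new task on `candidate`.

    peers: list of (PE, volume) pairs the new task communicates with.
    alpha: assumed weight; 1.0 considers only communication, 0.0 only load.
    max_hops: largest possible hop count in the mesh, used for normalization.
    """
    total_volume = sum(volume for _, volume in peers)
    if total_volume == 0 or max_hops == 0:
        comm = 0.0
    else:
        weighted = sum(hops(candidate, pe) * volume for pe, volume in peers)
        comm = weighted / (total_volume * max_hops)
    return alpha * comm + (1.0 - alpha) * candidate.load


def choose_pe(candidates, peers, alpha=0.5, max_hops=1):
    """Return the candidate PE with the lowest weighted cost."""
    return min(candidates, key=lambda c: mapping_cost(c, peers, alpha, max_hops))


# Example on an assumed 4x4 mesh (max_hops = 6): the task talks to a busy PE at (0, 0).
peers = [(PE(0, 0, 0.9), 100)]
near_busy = PE(0, 1, 0.9)
far_idle = PE(3, 3, 0.1)
comm_only = choose_pe([near_busy, far_idle], peers, alpha=1.0, max_hops=6)
load_only = choose_pe([near_busy, far_idle], peers, alpha=0.0, max_hops=6)
```

With `alpha=1.0` the sketch picks the nearby but busy element, and with `alpha=0.0` it picks the distant idle one; intermediate weights express the compromise between the two objectives. In a distributed setting, each local manager could apply such a function only to the elements it controls, which is again an assumption of this sketch.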

Item metadata

id	P_RS_1fb1fc53b1276c685800cf5154705f90
oai_identifier_str	oai:tede2.pucrs.br:tede/6317
network_acronym_str	P_RS
network_name_str	Biblioteca Digital de Teses e Dissertações da PUC_RS
repository_id_str
dc.title.por.fl_str_mv	Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
title	Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
spellingShingle	Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs Mandelli, Marcelo Grandi INFORMÁTICA MULTIPROCESSADORES ARQUITETURA DE COMPUTADOR CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
title_short	Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
title_full	Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
title_fullStr	Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
title_full_unstemmed	Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
title_sort	Exploration of runtime distributed mapping techniques for emerging large scale MPSoCs
author	Mandelli, Marcelo Grandi
author_facet	Mandelli, Marcelo Grandi
author_role	author
dc.contributor.advisor1.fl_str_mv	Moraes, Fernando Gehm
dc.contributor.advisor1ID.fl_str_mv	477.763.820-00
dc.contributor.advisor1Lattes.fl_str_mv	http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4782943Z2
dc.contributor.advisor-co1.fl_str_mv	Ost, Luciano Copello
dc.contributor.advisor-co1Lattes.fl_str_mv	http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4779836J4
dc.contributor.authorID.fl_str_mv	007.216.910-99
dc.contributor.authorLattes.fl_str_mv	http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4421755E8
dc.contributor.author.fl_str_mv	Mandelli, Marcelo Grandi
contributor_str_mv	Moraes, Fernando Gehm Ost, Luciano Copello
dc.subject.por.fl_str_mv	INFORMÁTICA MULTIPROCESSADORES ARQUITETURA DE COMPUTADOR
topic	INFORMÁTICA MULTIPROCESSADORES ARQUITETURA DE COMPUTADOR CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.cnpq.fl_str_mv	CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
publishDate	2015
dc.date.accessioned.fl_str_mv	2015-09-18T20:30:53Z
dc.date.issued.fl_str_mv	2015-07-13
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/doctoralThesis
format	doctoralThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://tede2.pucrs.br/tede2/handle/tede/6317
url	http://tede2.pucrs.br/tede2/handle/tede/6317
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.program.fl_str_mv	1974996533081274470
dc.relation.confidence.fl_str_mv	600 600 600
dc.relation.department.fl_str_mv	-3008542510401149144
dc.relation.cnpq.fl_str_mv	3671711205811204509
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Pontifícia Universidade Católica do Rio Grande do Sul
dc.publisher.program.fl_str_mv	Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv	PUCRS
dc.publisher.country.fl_str_mv	Brasil
dc.publisher.department.fl_str_mv	Faculdade de Informática
publisher.none.fl_str_mv	Pontifícia Universidade Católica do Rio Grande do Sul
dc.source.none.fl_str_mv	reponame:Biblioteca Digital de Teses e Dissertações da PUC_RS instname:Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS) instacron:PUC_RS
instname_str	Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
instacron_str	PUC_RS
institution	PUC_RS
reponame_str	Biblioteca Digital de Teses e Dissertações da PUC_RS
collection	Biblioteca Digital de Teses e Dissertações da PUC_RS
bitstream.url.fl_str_mv	http://tede2.pucrs.br/tede2/bitstream/tede/6317/4/475052+-+Texto+Completo.pdf.jpg http://tede2.pucrs.br/tede2/bitstream/tede/6317/3/475052+-+Texto+Completo.pdf.txt http://tede2.pucrs.br/tede2/bitstream/tede/6317/2/475052+-+Texto+Completo.pdf http://tede2.pucrs.br/tede2/bitstream/tede/6317/1/license.txt
bitstream.checksum.fl_str_mv	c6a77f4c86b2dd0ae8db55fef619d331 6a84ef809ebda22fb9d3ae4bf438f3d6 5d74943dc9ee311c90eb182fb022e539 5a9d6006225b368ef605ba16b4f6d1be
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Biblioteca Digital de Teses e Dissertações da PUC_RS - Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
repository.mail.fl_str_mv	biblioteca.central@pucrs.br
_version_	1799765315192619008
