MPSF: cloud scheduling framework for distributed workflow execution.
Main author: | Gonzalez, Nelson Mimura |
---|---|
Publication date: | 2016 |
Document type: | Thesis |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da USP |
Full text: | http://www.teses.usp.br/teses/disponiveis/3/3141/tde-03032017-083914/ |
Abstract: | Cloud computing is a distributed computing paradigm that gained prominence due to its on-demand, elastic, and dynamic resource provisioning. These characteristics are highly desirable for the execution of workflows, in particular scientific workflows that require large amounts of computing resources and handle large-scale data. A central question is how to manage the resources of one or more cloud infrastructures to execute workflows while optimizing resource utilization and minimizing the total duration of task execution (makespan). The more complex the infrastructure and the tasks to be executed, the higher the risk of incorrectly estimating the amount of resources to assign to each task, leading to both performance and monetary costs. Inherently more complex scenarios, such as hybrid clouds and multiclouds, are rarely considered by existing resource management solutions. Moreover, a thorough review of related work revealed that most solutions do not address data-intensive workflows, a characteristic that is increasingly evident in modern scientific workflows. To address these gaps, this work presents MPSF, the Multiphase Proactive Scheduling Framework, a cloud resource management solution based on multiple scheduling phases that continuously assess the system to optimize resource utilization and task distribution. MPSF defines models to describe and characterize workflows and resources. It also defines performance and reliability models to improve load distribution among nodes and to mitigate the effects of performance fluctuations and potential failures in the system. Finally, MPSF defines a framework and an architecture that integrate all these components into a solution that can be implemented and tested in real applications. Experimental results show that MPSF predicts the duration of workflows and workflow phases with much better accuracy and provides performance gains compared to greedy approaches. |
id |
USP_95834c1cd5487461ee3625a388cebf78 |
---|---|
oai_identifier_str |
oai:teses.usp.br:tde-03032017-083914 |
network_acronym_str |
USP |
network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
repository_id_str |
2721 |
spelling |
Title: MPSF: cloud scheduling framework for distributed workflow execution.
Title (Portuguese): MPSF: um arcabouço para escalonamento em computação em nuvem para execução distribuída de fluxos de trabalho.
Topics: Cloud computing; Computação em nuvem; Gerenciamento de recursos; Resource management
Description (translated from the Portuguese abstract): Cloud computing represents a distributed computing paradigm that gained prominence due to aspects related to obtaining resources on demand in an elastic and dynamic manner. These characteristics are highly desirable for executing tasks related to scientific workflows, which require large amounts of computing resources and large data flows. One of the main questions in this regard is how to manage the resources of one or more cloud infrastructures for workflow execution so as to optimize the use of these resources and minimize the total execution time of the tasks. The more complex the infrastructure and the tasks to be executed, the greater the risk of incorrectly estimating the amount of resources allocated to each task, leading to losses not only in execution time but also financially. Inherently more complex scenarios, such as hybrid clouds and multiple clouds, are rarely considered by existing cloud resource management solutions. In addition, most solutions offer no clear mechanisms for handling data-intensive workflows, a characteristic increasingly prominent in modern workflows. Accordingly, this proposal presents MPSF, a resource management solution based on multiple management phases built on dynamic task allocation mechanisms. MPSF defines models to describe and characterize workflows and resources so as to support both simple and complex scenarios, such as hybrid clouds and integrated clouds. MPSF also defines performance and reliability models to better distribute load and to counter the effects of failures that may occur in the system. Finally, MPSF defines a framework and an architecture that integrate all these components into a solution that can be implemented and used in real scenarios. Experimental tests indicate that MPSF not only predicts task execution duration more accurately but also optimizes task execution, especially for tasks that demand high computing power and large amounts of data.
Publisher: Biblioteca Digital de Teses e Dissertações da USP; Contributor: Carvalho, Tereza Cristina Melo de Brito; Author: Gonzalez, Nelson Mimura; Date: 2016-12-16; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/doctoralThesis; application/pdf; http://www.teses.usp.br/teses/disponiveis/3/3141/tde-03032017-083914/; reponame:Biblioteca Digital de Teses e Dissertações da USP; instname:Universidade de São Paulo (USP); instacron:USP; Release the content for public access.; info:eu-repo/semantics/openAccess; eng; 2024-10-09T12:51:23Z; oai:teses.usp.br:tde-03032017-083914; Biblioteca Digital de Teses e Dissertações; http://www.teses.usp.br/; PUB; http://www.teses.usp.br/cgi-bin/mtd2br.pl; virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br; opendoar:2721; 2024-10-09T12:51:23; Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP); false |
dc.title.none.fl_str_mv |
MPSF: cloud scheduling framework for distributed workflow execution. MPSF: um arcabouço para escalonamento em computação em nuvem para execução distribuída de fluxos de trabalho. |
title |
MPSF: cloud scheduling framework for distributed workflow execution. |
spellingShingle |
MPSF: cloud scheduling framework for distributed workflow execution. Gonzalez, Nelson Mimura; Cloud computing; Computação em nuvem; Gerenciamento de recursos; Resource management |
title_short |
MPSF: cloud scheduling framework for distributed workflow execution. |
title_full |
MPSF: cloud scheduling framework for distributed workflow execution. |
title_fullStr |
MPSF: cloud scheduling framework for distributed workflow execution. |
title_full_unstemmed |
MPSF: cloud scheduling framework for distributed workflow execution. |
title_sort |
MPSF: cloud scheduling framework for distributed workflow execution. |
author |
Gonzalez, Nelson Mimura |
author_facet |
Gonzalez, Nelson Mimura |
author_role |
author |
dc.contributor.none.fl_str_mv |
Carvalho, Tereza Cristina Melo de Brito |
dc.contributor.author.fl_str_mv |
Gonzalez, Nelson Mimura |
dc.subject.por.fl_str_mv |
Cloud computing; Computação em nuvem; Gerenciamento de recursos; Resource management |
topic |
Cloud computing; Computação em nuvem; Gerenciamento de recursos; Resource management |
description |
Cloud computing is a distributed computing paradigm that gained prominence due to its on-demand, elastic, and dynamic resource provisioning. These characteristics are highly desirable for the execution of workflows, in particular scientific workflows that require large amounts of computing resources and handle large-scale data. A central question is how to manage the resources of one or more cloud infrastructures to execute workflows while optimizing resource utilization and minimizing the total duration of task execution (makespan). The more complex the infrastructure and the tasks to be executed, the higher the risk of incorrectly estimating the amount of resources to assign to each task, leading to both performance and monetary costs. Inherently more complex scenarios, such as hybrid clouds and multiclouds, are rarely considered by existing resource management solutions. Moreover, a thorough review of related work revealed that most solutions do not address data-intensive workflows, a characteristic that is increasingly evident in modern scientific workflows. To address these gaps, this work presents MPSF, the Multiphase Proactive Scheduling Framework, a cloud resource management solution based on multiple scheduling phases that continuously assess the system to optimize resource utilization and task distribution. MPSF defines models to describe and characterize workflows and resources. It also defines performance and reliability models to improve load distribution among nodes and to mitigate the effects of performance fluctuations and potential failures in the system. Finally, MPSF defines a framework and an architecture that integrate all these components into a solution that can be implemented and tested in real applications. Experimental results show that MPSF predicts the duration of workflows and workflow phases with much better accuracy and provides performance gains compared to greedy approaches. |
publishDate |
2016 |
dc.date.none.fl_str_mv |
2016-12-16 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://www.teses.usp.br/teses/disponiveis/3/3141/tde-03032017-083914/ |
url |
http://www.teses.usp.br/teses/disponiveis/3/3141/tde-03032017-083914/ |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
|
dc.rights.driver.fl_str_mv |
Release the content for public access. info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Release the content for public access. |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.coverage.none.fl_str_mv |
|
dc.publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
instname_str |
Universidade de São Paulo (USP) |
instacron_str |
USP |
institution |
USP |
reponame_str |
Biblioteca Digital de Teses e Dissertações da USP |
collection |
Biblioteca Digital de Teses e Dissertações da USP |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
repository.mail.fl_str_mv |
virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br |
_version_ |
1815256483616522240 |