Detalhes bibliográficos
Título da fonte: Repositório Institucional da UFMG
id UFMG_5dce1d9527a579b0f488cd7c70f96e84
oai_identifier_str oai:repositorio.ufmg.br:1843/61947
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
repository_id_str
reponame_str Repositório Institucional da UFMG
instacron_str UFMG
institution Universidade Federal de Minas Gerais (UFMG)
instname_str Universidade Federal de Minas Gerais (UFMG)
spelling 2023-12-12T20:27:30Z2023-12-12T20:27:30Z20191019http://dx.doi.org/10.1186/s13174-019-0118-71869-0238http://hdl.handle.net/1843/61947http://orcid.org/0000-0002-1480-0039https://orcid.org/0000-0003-0865-1417High-performance computing (HPC) and massive data processing (Big Data) are two trends that are beginning to converge. In that process, aspects of hardware architectures, systems support and programming paradigms are being revisited from both perspectives. This paper presents our experience on this path of convergence with the proposal of a framework that addresses some of the programming issues derived from such integration. Our contribution is the development of an integrated environment that integretes (i) COMPSs, a programming framework for the development and execution of parallel applications for distributed infrastructures; (ii) Lemonade, a data mining and analysis tool; and (iii) HDFS, the most widely used distributed file system for Big Data systems. To validate our framework, we used Lemonade to create COMPSs applications that access data through HDFS, and compared them with equivalent applications built with Spark, a popular Big Data framework. The results show that the HDFS integration benefits COMPSs by simplifying data access and by rearranging data transfer, reducing execution time. The integration with Lemonade facilitates COMPSs’s use and may help its popularization in the Data Science community, by providing efficient algorithm implementations for experts from the data domain that want to develop applications with a higher level abstraction.A computação de alto desempenho (HPC) e o processamento massivo de dados (Big Data) são duas tendências que estão começando a convergir. Nesse processo, aspectos de arquiteturas de hardware, suporte de sistemas e paradigmas de programação estão sendo revisitados de ambas as perspectivas. Este artigo apresenta a nossa experiência neste caminho de convergência com a proposta de um quadro que aborda algumas das questões de programação derivadas dessa integração. Nossa contribuição é o desenvolvimento de um ambiente integrado que integre (i) COMPSs, um framework de programação para o desenvolvimento e execução de aplicações paralelas para infraestruturas distribuídas; (ii) Lemonade, ferramenta de mineração e análise de dados; e (iii) HDFS, o sistema de arquivos distribuídos mais utilizado para sistemas de Big Data. Para validar nossa estrutura, usamos Lemonade para criar aplicativos COMPSs que acessam dados por meio de HDFS e os comparamos com aplicativos equivalentes construídos com Spark, uma estrutura popular de Big Data. Os resultados mostram que a integração do HDFS beneficia os COMPSs ao simplificar o acesso aos dados e ao reorganizar a transferência de dados, reduzindo o tempo de execução. A integração com o Lemonade facilita o uso de COMPSs e pode ajudar na sua popularização na comunidade de Data Science, ao fornecer implementações eficientes de algoritmos para especialistas do domínio de dados que desejam desenvolver aplicações com maior nível de abstração.CNPq - Conselho Nacional de Desenvolvimento Científico e TecnológicoFAPEMIG - Fundação de Amparo à Pesquisa do Estado de Minas GeraisCAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível SuperiorengUniversidade Federal de Minas GeraisUFMGBrasilICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃOJournal of Internet Services and ApplicationsProgramaçãoComputação de alto desempenhoBig dataProcessamento de dadosCOMPSsHigh-performance computingBig dataHDFSLemonadeUpgrading a high performance computing environment for massive data processingAtualizando um ambiente de computação de alto desempenho para processamento massivo de dadosinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articlehttps://jisajournal.springeropen.com/articles/10.1186/s13174-019-0118-7Lucas Miguel Simões PonceWalter Dos SantosWagner Meira Jr.Dorgival GuedesDaniele LezziRosa M. Badiaapplication/pdfinfo:eu-repo/semantics/openAccessreponame:Repositório Institucional da UFMGinstname:Universidade Federal de Minas Gerais (UFMG)instacron:UFMGLICENSELicense.txtLicense.txttext/plain; charset=utf-82042https://repositorio.ufmg.br/bitstream/1843/61947/1/License.txtfa505098d172de0bc8864fc1287ffe22MD51ORIGINALUpgrading a high performance computing environment for massive data processing.pdfUpgrading a high performance computing environment for massive data processing.pdfapplication/pdf20846488https://repositorio.ufmg.br/bitstream/1843/61947/2/Upgrading%20a%20high%20performance%20computing%20environment%20for%20massive%20data%20processing.pdfed2509be153e265a3864d337775a1309MD521843/619472023-12-12 17:27:30.646oai:repositorio.ufmg.br:1843/61947TElDRU7vv71BIERFIERJU1RSSUJVSe+/ve+/vU8gTu+/vU8tRVhDTFVTSVZBIERPIFJFUE9TSVTvv71SSU8gSU5TVElUVUNJT05BTCBEQSBVRk1HCiAKCkNvbSBhIGFwcmVzZW50Ye+/ve+/vW8gZGVzdGEgbGljZW7vv71hLCB2b2Pvv70gKG8gYXV0b3IgKGVzKSBvdSBvIHRpdHVsYXIgZG9zIGRpcmVpdG9zIGRlIGF1dG9yKSBjb25jZWRlIGFvIFJlcG9zaXTvv71yaW8gSW5zdGl0dWNpb25hbCBkYSBVRk1HIChSSS1VRk1HKSBvIGRpcmVpdG8gbu+/vW8gZXhjbHVzaXZvIGUgaXJyZXZvZ++/vXZlbCBkZSByZXByb2R1emlyIGUvb3UgZGlzdHJpYnVpciBhIHN1YSBwdWJsaWNh77+977+9byAoaW5jbHVpbmRvIG8gcmVzdW1vKSBwb3IgdG9kbyBvIG11bmRvIG5vIGZvcm1hdG8gaW1wcmVzc28gZSBlbGV0cu+/vW5pY28gZSBlbSBxdWFscXVlciBtZWlvLCBpbmNsdWluZG8gb3MgZm9ybWF0b3Mg77+9dWRpbyBvdSB277+9ZGVvLgoKVm9j77+9IGRlY2xhcmEgcXVlIGNvbmhlY2UgYSBwb2zvv710aWNhIGRlIGNvcHlyaWdodCBkYSBlZGl0b3JhIGRvIHNldSBkb2N1bWVudG8gZSBxdWUgY29uaGVjZSBlIGFjZWl0YSBhcyBEaXJldHJpemVzIGRvIFJJLVVGTUcuCgpWb2Pvv70gY29uY29yZGEgcXVlIG8gUmVwb3NpdO+/vXJpbyBJbnN0aXR1Y2lvbmFsIGRhIFVGTUcgcG9kZSwgc2VtIGFsdGVyYXIgbyBjb250Ze+/vWRvLCB0cmFuc3BvciBhIHN1YSBwdWJsaWNh77+977+9byBwYXJhIHF1YWxxdWVyIG1laW8gb3UgZm9ybWF0byBwYXJhIGZpbnMgZGUgcHJlc2VydmHvv73vv71vLgoKVm9j77+9IHRhbWLvv71tIGNvbmNvcmRhIHF1ZSBvIFJlcG9zaXTvv71yaW8gSW5zdGl0dWNpb25hbCBkYSBVRk1HIHBvZGUgbWFudGVyIG1haXMgZGUgdW1hIGPvv71waWEgZGUgc3VhIHB1YmxpY2Hvv73vv71vIHBhcmEgZmlucyBkZSBzZWd1cmFu77+9YSwgYmFjay11cCBlIHByZXNlcnZh77+977+9by4KClZvY++/vSBkZWNsYXJhIHF1ZSBhIHN1YSBwdWJsaWNh77+977+9byDvv70gb3JpZ2luYWwgZSBxdWUgdm9j77+9IHRlbSBvIHBvZGVyIGRlIGNvbmNlZGVyIG9zIGRpcmVpdG9zIGNvbnRpZG9zIG5lc3RhIGxpY2Vu77+9YS4gVm9j77+9IHRhbWLvv71tIGRlY2xhcmEgcXVlIG8gZGVw77+9c2l0byBkZSBzdWEgcHVibGljYe+/ve+/vW8gbu+/vW8sIHF1ZSBzZWphIGRlIHNldSBjb25oZWNpbWVudG8sIGluZnJpbmdlIGRpcmVpdG9zIGF1dG9yYWlzIGRlIG5pbmd177+9bS4KCkNhc28gYSBzdWEgcHVibGljYe+/ve+/vW8gY29udGVuaGEgbWF0ZXJpYWwgcXVlIHZvY++/vSBu77+9byBwb3NzdWkgYSB0aXR1bGFyaWRhZGUgZG9zIGRpcmVpdG9zIGF1dG9yYWlzLCB2b2Pvv70gZGVjbGFyYSBxdWUgb2J0ZXZlIGEgcGVybWlzc++/vW8gaXJyZXN0cml0YSBkbyBkZXRlbnRvciBkb3MgZGlyZWl0b3MgYXV0b3JhaXMgcGFyYSBjb25jZWRlciBhbyBSZXBvc2l077+9cmlvIEluc3RpdHVjaW9uYWwgZGEgVUZNRyBvcyBkaXJlaXRvcyBhcHJlc2VudGFkb3MgbmVzdGEgbGljZW7vv71hLCBlIHF1ZSBlc3NlIG1hdGVyaWFsIGRlIHByb3ByaWVkYWRlIGRlIHRlcmNlaXJvcyBlc3Tvv70gY2xhcmFtZW50ZSBpZGVudGlmaWNhZG8gZSByZWNvbmhlY2lkbyBubyB0ZXh0byBvdSBubyBjb250Ze+/vWRvIGRhIHB1YmxpY2Hvv73vv71vIG9yYSBkZXBvc2l0YWRhLgoKQ0FTTyBBIFBVQkxJQ0Hvv73vv71PIE9SQSBERVBPU0lUQURBIFRFTkhBIFNJRE8gUkVTVUxUQURPIERFIFVNIFBBVFJPQ++/vU5JTyBPVSBBUE9JTyBERSBVTUEgQUfvv71OQ0lBIERFIEZPTUVOVE8gT1UgT1VUUk8gT1JHQU5JU01PLCBWT0Pvv70gREVDTEFSQSBRVUUgUkVTUEVJVE9VIFRPRE9TIEUgUVVBSVNRVUVSIERJUkVJVE9TIERFIFJFVklT77+9TyBDT01PIFRBTULvv71NIEFTIERFTUFJUyBPQlJJR0Hvv73vv71FUyBFWElHSURBUyBQT1IgQ09OVFJBVE8gT1UgQUNPUkRPLgoKTyBSZXBvc2l077+9cmlvIEluc3RpdHVjaW9uYWwgZGEgVUZNRyBzZSBjb21wcm9tZXRlIGEgaWRlbnRpZmljYXIgY2xhcmFtZW50ZSBvIHNldSBub21lKHMpIG91IG8ocykgbm9tZXMocykgZG8ocykgZGV0ZW50b3IoZXMpIGRvcyBkaXJlaXRvcyBhdXRvcmFpcyBkYSBwdWJsaWNh77+977+9bywgZSBu77+9byBmYXLvv70gcXVhbHF1ZXIgYWx0ZXJh77+977+9bywgYWzvv71tIGRhcXVlbGFzIGNvbmNlZGlkYXMgcG9yIGVzdGEgbGljZW7vv71hLgo=Repositório InstitucionalPUBhttps://repositorio.ufmg.br/oaiopendoar:2023-12-12T20:27:30Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)false
_version_ 1813547988266516480