Exploring distributed computing tools through data mining tasks
Main author: | Rahman, Anishur |
---|---|
Publication date: | 2014 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10400.22/6182 |
Abstract: | Harnessing idle CPU cycles, storage space, and other resources of networked computers so that they can work together is a feature common to all major grid computing research projects. Most university computer labs are nowadays equipped with powerful desktop PCs. Most of the time, these machines sit idle or do not use their computing power to the full. However, complex problems and the analysis of very large amounts of data require vast computational resources. For such problems, one may run the analysis algorithms on very powerful and expensive computers, which reduces the number of users who can afford such data analysis tasks. Instead of relying on single expensive machines, distributed computing systems offer the possibility of using a set of much less expensive machines to do the same task. The BOINC and Condor projects have been used successfully in real scientific research around the world at low cost. The main goal of this work is to explore both distributed computing tools, Condor and BOINC, and to use them to harness idle PC resources so that academic researchers can use them in their research work. In this thesis, data mining tasks were performed with several machine learning algorithms in a distributed computing environment. |
id |
RCAP_3cfe4d0cb1b1fa289ba61d2b7b2f0ead |
---|---|
oai_identifier_str |
oai:recipp.ipp.pt:10400.22/6182 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Exploring distributed computing tools through data mining tasks |
author |
Rahman, Anishur |
author_facet |
Rahman, Anishur |
author_role |
author |
dc.contributor.none.fl_str_mv |
Oliveira, Paulo Jorge; Repositório Científico do Instituto Politécnico do Porto |
dc.contributor.author.fl_str_mv |
Rahman, Anishur |
dc.subject.por.fl_str_mv |
Distributed computing; Condor; BOINC; Data mining task; Computação distribuída (distributed computing); Tarefa de mineração de dados (data mining task) |
publishDate |
2014 |
dc.date.none.fl_str_mv |
2014; 2014-01-01T00:00:00Z; 2015-06-02T08:14:51Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.22/6182; TID:201819546 |
url |
http://hdl.handle.net/10400.22/6182 |
identifier_str_mv |
TID:201819546 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799131362127511552 |