A pipelined data-parallel algorithm for ILP
Autor(a) principal: | |
---|---|
Data de Publicação: | 2006 |
Outros Autores: | , , |
Tipo de documento: | Livro |
Idioma: | eng |
Título da fonte: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Texto Completo: | https://hdl.handle.net/10216/74815 |
Resumo: | The amount of data collected and stored in databases is growing considerably for almost all areas of human activity. Processing this amount of data is very expensive, both humanly and computationally. This justifies the increased interest both on the automatic discovery of useful knowledge from databases, and on using parallel processing for this task. Multi Relational Data Mining (MRDM) techniques, such as Inductive Logic Programming (ILP), can learn rules from relational databases consisting of multiple tables. However current ILP systems are designed to run in main memory and can have long running times. We propose a pipelined data-parallel algorithm for ILP. The algorithm was implemented and evaluated on a commodity PC cluster with 8 processors. The results show that our algorithm yields excellent speedups, while preserving the quality of learning. |
id |
RCAP_22570c888fd556d1494c7a39c422bfaf |
---|---|
oai_identifier_str |
oai:repositorio-aberto.up.pt:10216/74815 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
A pipelined data-parallel algorithm for ILPCiências da computação e da informaçãoComputer and information sciencesThe amount of data collected and stored in databases is growing considerably for almost all areas of human activity. Processing this amount of data is very expensive, both humanly and computationally. This justifies the increased interest both on the automatic discovery of useful knowledge from databases, and on using parallel processing for this task. Multi Relational Data Mining (MRDM) techniques, such as Inductive Logic Programming (ILP), can learn rules from relational databases consisting of multiple tables. However current ILP systems are designed to run in main memory and can have long running times. We propose a pipelined data-parallel algorithm for ILP. The algorithm was implemented and evaluated on a commodity PC cluster with 8 processors. The results show that our algorithm yields excellent speedups, while preserving the quality of learning.20062006-01-01T00:00:00Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/bookapplication/pdfhttps://hdl.handle.net/10216/74815eng10.1109/clustr.2005.347059Nuno A. FonsecaFernando SilvaVitor Santos CostaRui Camachoinfo:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2023-11-29T15:54:17Zoai:repositorio-aberto.up.pt:10216/74815Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-20T00:34:58.862019Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse |
dc.title.none.fl_str_mv |
A pipelined data-parallel algorithm for ILP |
title |
A pipelined data-parallel algorithm for ILP |
spellingShingle |
A pipelined data-parallel algorithm for ILP Nuno A. Fonseca Ciências da computação e da informação Computer and information sciences |
title_short |
A pipelined data-parallel algorithm for ILP |
title_full |
A pipelined data-parallel algorithm for ILP |
title_fullStr |
A pipelined data-parallel algorithm for ILP |
title_full_unstemmed |
A pipelined data-parallel algorithm for ILP |
title_sort |
A pipelined data-parallel algorithm for ILP |
author |
Nuno A. Fonseca |
author_facet |
Nuno A. Fonseca Fernando Silva Vitor Santos Costa Rui Camacho |
author_role |
author |
author2 |
Fernando Silva Vitor Santos Costa Rui Camacho |
author2_role |
author author author |
dc.contributor.author.fl_str_mv |
Nuno A. Fonseca Fernando Silva Vitor Santos Costa Rui Camacho |
dc.subject.por.fl_str_mv |
Ciências da computação e da informação Computer and information sciences |
topic |
Ciências da computação e da informação Computer and information sciences |
description |
The amount of data collected and stored in databases is growing considerably for almost all areas of human activity. Processing this amount of data is very expensive, both humanly and computationally. This justifies the increased interest both on the automatic discovery of useful knowledge from databases, and on using parallel processing for this task. Multi Relational Data Mining (MRDM) techniques, such as Inductive Logic Programming (ILP), can learn rules from relational databases consisting of multiple tables. However current ILP systems are designed to run in main memory and can have long running times. We propose a pipelined data-parallel algorithm for ILP. The algorithm was implemented and evaluated on a commodity PC cluster with 8 processors. The results show that our algorithm yields excellent speedups, while preserving the quality of learning. |
publishDate |
2006 |
dc.date.none.fl_str_mv |
2006 2006-01-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/book |
format |
book |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10216/74815 |
url |
https://hdl.handle.net/10216/74815 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
10.1109/clustr.2005.347059 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799136258990014464 |