Using quantitative information for efficient association rule generation
Main author: | Pôssas, Bruno |
---|---|
Publication date: | 2000 |
Other authors: | Meira Jr., Wagner; Carvalho, Márcio; Resende, Rodolfo |
Document type: | Article |
Language: | eng |
Source title: | Journal of the Brazilian Computer Society |
Full text: | http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0104-65002000000200005 |
Abstract: | A solution to the problem of mining association rules in customer transactions was introduced by Agrawal, Imielinski and Swami in 1993. Their approach has since been extended in several directions, such as adding to or replacing the confidence and support measures, or accounting for quantitative attributes. In this paper we present an algorithm that can be used in the context of several of the extensions provided in the literature while preserving its performance, as illustrated by a case study. Our approach targets two of the most computationally demanding phases of association rule generation: enumerating the candidate sets and verifying which of them are frequent. The cost of these phases is reduced by pruning candidate sets early, based on additional quantitative information about the transactions. In summary, we explore certain multidimensional properties of the data, which allows us to combine this additional information into a pruning criterion. On synthetically generated data, our strategy reduced the number of candidate sets examined by the algorithm by up to 15%. It also reduced the execution time significantly, on the order of 23%. |
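
The two phases named in the abstract, candidate enumeration and frequency verification, can be illustrated with a minimal sketch of the classic level-wise (Apriori-style) baseline. This is not the algorithm presented in the paper: the abstract does not specify the quantitative pruning criterion, so it is not shown here. The function names `frequent_itemsets` and `confidence`, the absolute-count `min_support` parameter, and the sample baskets are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for itemsets seen in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Phase 1, candidate enumeration: join frequent (k-1)-itemsets and
        # keep a k-itemset only if all of its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Phase 2, verification: scan the transactions to count each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent, from stored support counts."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical sample data, not taken from the paper.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = frequent_itemsets(baskets, min_support=2)
```

In this sketch every candidate that survives the subset check costs a full pass over the transactions, which is why discarding candidates before the verification phase, as the abstract proposes, lowers both the candidate count and the running time.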
id |
UFRGS-28_2f433eebae39b51a60ee88b9aa2ed77a |
---|---|
oai_identifier_str |
oai:scielo:S0104-65002000000200005 |
network_acronym_str |
UFRGS-28 |
network_name_str |
Journal of the Brazilian Computer Society |
repository_id_str |
|
spelling |
Using quantitative information for efficient association rule generation; Data mining; association rules; algorithms; knowledge discovery in databases; The solution of the mining association rules problem in customer transactions was introduced by Agrawal, Imielinski and Swami in 1993. Their approach was extended in several directions such as adding or replacing the confidence and support by other measures, or how to also account for quantitative attributes. In this paper we present an algorithm that can be used in the context of several of the extensions provided in the literature while preserving its performance, as illustrated by a case study. Our approach is targeted at two of the most computationally demanding phases in the process of generating association rules: the enumeration of the candidate sets and the verification of which of them are frequent. The minimization of the cost of these phases is achieved by pruning early candidate sets based on additional quantitative information about the transactions. In summary, we explore certain multidimensional properties of the data allowing us to combine this additional information as a pruning criterion. Based on synthetically generated data, our strategy reduced the number of candidate sets examined by the algorithm up to 15%. Furthermore, it also reduced the execution time significantly, in the order of 23%.; Sociedade Brasileira de Computação; 2000-01-01; info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; text/html; http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0104-65002000000200005; Journal of the Brazilian Computer Society v.7 n.1 2000; reponame:Journal of the Brazilian Computer Society; instname:Sociedade Brasileira de Computação (SBC); instacron:UFRGS; 10.1590/S0104-65002000000200005; info:eu-repo/semantics/openAccess; Pôssas, Bruno; Meira Jr., Wagner; Carvalho, Márcio; Resende, Rodolfo; eng; 2001-01-31T00:00:00Z; oai:scielo:S0104-65002000000200005; Journal; https://journal-bcs.springeropen.com/; PUB; https://old.scielo.br/oai/scielo-oai.php; jbcs@icmc.sc.usp.br; 1678-4804; 0104-6500; opendoar:; 2001-01-31T00:00; Journal of the Brazilian Computer Society - Sociedade Brasileira de Computação (SBC); false |
dc.title.none.fl_str_mv |
Using quantitative information for efficient association rule generation |
author |
Pôssas, Bruno |
author_facet |
Pôssas, Bruno; Meira Jr., Wagner; Carvalho, Márcio; Resende, Rodolfo |
author_role |
author |
author2 |
Meira Jr., Wagner; Carvalho, Márcio; Resende, Rodolfo |
author2_role |
author author author |
dc.contributor.author.fl_str_mv |
Pôssas, Bruno; Meira Jr., Wagner; Carvalho, Márcio; Resende, Rodolfo |
dc.subject.por.fl_str_mv |
Data mining; association rules; algorithms; knowledge discovery in databases |
topic |
Data mining; association rules; algorithms; knowledge discovery in databases |
publishDate |
2000 |
dc.date.none.fl_str_mv |
2000-01-01 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0104-65002000000200005 |
url |
http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0104-65002000000200005 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
10.1590/S0104-65002000000200005 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
text/html |
dc.publisher.none.fl_str_mv |
Sociedade Brasileira de Computação |
publisher.none.fl_str_mv |
Sociedade Brasileira de Computação |
dc.source.none.fl_str_mv |
Journal of the Brazilian Computer Society v.7 n.1 2000; reponame:Journal of the Brazilian Computer Society; instname:Sociedade Brasileira de Computação (SBC); instacron:UFRGS |
instname_str |
Sociedade Brasileira de Computação (SBC) |
instacron_str |
UFRGS |
institution |
UFRGS |
reponame_str |
Journal of the Brazilian Computer Society |
collection |
Journal of the Brazilian Computer Society |
repository.name.fl_str_mv |
Journal of the Brazilian Computer Society - Sociedade Brasileira de Computação (SBC) |
repository.mail.fl_str_mv |
jbcs@icmc.sc.usp.br |
_version_ |
1754734669548486656 |