Using quantitative information for efficient association rule generation

Bibliographic details
Main author: Pôssas, Bruno
Publication date: 2000
Other authors: Meira Jr., Wagner; Carvalho, Márcio; Resende, Rodolfo
Document type: Article
Language: eng
Source title: Journal of the Brazilian Computer Society
Full text: http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0104-65002000000200005
Abstract: A solution to the problem of mining association rules in customer transactions was introduced by Agrawal, Imielinski and Swami in 1993. Their approach has been extended in several directions, such as adding to or replacing the confidence and support measures, or accounting for quantitative attributes. In this paper we present an algorithm that can be used with several of the extensions proposed in the literature while preserving its performance, as illustrated by a case study. Our approach targets two of the most computationally demanding phases of association rule generation: enumerating the candidate sets and verifying which of them are frequent. We reduce the cost of these phases by pruning candidate sets early, based on additional quantitative information about the transactions. In summary, we exploit certain multidimensional properties of the data that allow us to use this additional information as a pruning criterion. On synthetically generated data, our strategy reduced the number of candidate sets examined by the algorithm by up to 15%. It also reduced execution time significantly, on the order of 23%.
id UFRGS-28_2f433eebae39b51a60ee88b9aa2ed77a
oai_identifier_str oai:scielo:S0104-65002000000200005
network_acronym_str UFRGS-28
ISSN: 1678-4804, 0104-6500
Journal website: https://journal-bcs.springeropen.com/
OAI endpoint: https://old.scielo.br/oai/scielo-oai.php
Record timestamp: 2001-01-31T00:00:00Z
Keywords: Data mining; association rules; algorithms; knowledge discovery in databases
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.relation.none.fl_str_mv 10.1590/S0104-65002000000200005
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv text/html
dc.publisher.none.fl_str_mv Sociedade Brasileira de Computação
dc.source.none.fl_str_mv Journal of the Brazilian Computer Society v.7 n.1 2000
reponame:Journal of the Brazilian Computer Society
instname:Sociedade Brasileira de Computação (SBC)
instacron:UFRGS
repository.mail.fl_str_mv jbcs@icmc.sc.usp.br