Multidimensional cyclic graph approach: representing a data cube without common sub-graphs.

Bibliographic details
Main author: Lima, Joubert de Castro
Publication date: 2011
Other authors: Hirata, Celso Massaki
Document type: Article
Language: eng
Source title: Repositório Institucional da UFOP
Full text: http://www.repositorio.ufop.br/handle/123456789/4375
https://doi.org/10.1016/j.ins.2010.05.012
Abstract: We present a new full cube computation technique and a cube storage representation approach, called the multidimensional cyclic graph (MCG) approach. The data cube relational operator has exponential complexity, so its materialization requires both a huge amount of memory and a substantial amount of time. Reducing the size of data cubes, without loss of generality, thus becomes a fundamental problem. Previous approaches, such as Dwarf, Star and MDAG, have substantially reduced the cube size using graph representations. In general, they eliminate prefix redundancy and some suffix redundancy from a data cube. The MCG differs significantly from previous approaches in that it completely eliminates prefix and suffix redundancies from a data cube. A data cube can be viewed as a set of sub-graphs. In general, redundant sub-graphs are quite common in a data cube, but eliminating them is a hard problem. The Dwarf, Star and MDAG approaches eliminate only some specific common sub-graphs. The MCG approach efficiently eliminates all common sub-graphs from the entire cube, based on an exact sub-graph matching solution. We propose a matching function to guarantee one-to-one mapping between sub-graphs. The function is computed incrementally, in a top-down fashion, and its computation uses a minimal amount of information to generate unique results. In addition, it can be computed for any measurement type: distributive, algebraic or holistic. MCG performance analysis demonstrates that MCG is 20–40% faster than the Dwarf, Star and MDAG approaches when computing sparse data cubes. Dense data cubes have a small number of aggregations, which leaves little room for runtime and memory consumption optimization; the MCG approach is therefore not useful for computing such dense cubes. The compact representation of sparse data cubes enables the MCG approach to reduce memory consumption by 70–90% when compared to the original Star approach, proposed in [33]. In the same scenarios, the improved Star approach, proposed in [34], reduces memory consumption by only 10–30%, Dwarf by 30–50% and MDAG by 40–60%, when compared to the original Star approach. The MCG is the first approach that uses an exact sub-graph matching function to reduce cube size, avoiding unnecessary aggregation and thereby improving cube computation runtime.
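The redundancy the abstract describes can be illustrated with a small Python sketch. It is not the MCG algorithm and not the authors' matching function (which the abstract says is computed incrementally and top-down); it is a generic bottom-up structural-signature pass, used here only as a stand-in to show that a sparse full cube contains many identical sub-graphs. All names (`full_cube`, `build_trie`, `count_nodes`, `collect_signatures`) and the two-row input are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

ALL = "*"  # marker for an aggregated ("all values") dimension


def full_cube(rows, n_dims):
    """Compute SUM for every cell of all 2**n_dims group-bys (illustrative only)."""
    cube = defaultdict(int)
    for *dims, measure in rows:
        for k in range(n_dims + 1):
            for kept in combinations(range(n_dims), k):
                cell = tuple(dims[i] if i in kept else ALL for i in range(n_dims))
                cube[cell] += measure
    return dict(cube)


def build_trie(cube):
    """Store each cell as a root-to-leaf path; the last level maps to aggregates."""
    root = {}
    for cell, value in cube.items():
        node = root
        for label in cell[:-1]:
            node = node.setdefault(label, {})
        node[cell[-1]] = value
    return root


def count_nodes(node):
    """Count internal (dict) nodes of the uncompressed trie."""
    if not isinstance(node, dict):
        return 0
    return 1 + sum(count_nodes(child) for child in node.values())


def collect_signatures(node, seen):
    """Return a canonical signature of a sub-graph and record it in `seen`.

    Two nodes with equal signatures root identical sub-graphs and could share
    one stored copy. This is an assumed stand-in, not the paper's function.
    """
    if not isinstance(node, dict):
        return node
    sig = tuple(sorted((label, collect_signatures(child, seen))
                       for label, child in node.items()))
    seen.add(sig)
    return sig


# Hypothetical sparse relation: three dimensions plus one measure.
rows = [("a1", "b1", "c1", 10), ("a2", "b2", "c2", 20)]
cube = full_cube(rows, 3)
trie = build_trie(cube)
distinct = set()
collect_signatures(trie, distinct)
print(len(cube), count_nodes(trie), len(distinct))
```

On this two-row input the sketch yields 15 cube cells, 11 trie nodes, and 7 distinct sub-graphs, which shows in miniature why sharing common sub-graphs pays off on sparse data.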
id UFOP_6e7c3b08c99ea8e0e67ed968aac4d426
oai_identifier_str oai:repositorio.ufop.br:123456789/4375
network_acronym_str UFOP
network_name_str Repositório Institucional da UFOP
repository_id_str 3233
spelling The journal Information Sciences grants permission to deposit the article in the Repositório Institucional da UFOP. License number: 3552521449494.
dc.title.none.fl_str_mv Multidimensional cyclic graph approach: representing a data cube without common sub-graphs.
author Lima, Joubert de Castro
author_facet Lima, Joubert de Castro
Hirata, Celso Massaki
author_role author
author2 Hirata, Celso Massaki
author2_role author
dc.contributor.author.fl_str_mv Lima, Joubert de Castro
Hirata, Celso Massaki
dc.subject.por.fl_str_mv Data warehouse
Online analytical processing
Data cube
publishDate 2011
dc.date.none.fl_str_mv 2011
2015-01-26T11:25:29Z
2015-01-26T11:25:29Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv LIMA, J. de C.; HIRATA, C. M. Multidimensional cyclic graph approach: representing a data cube without common sub-graphs. Information Sciences, v. 181, p. 2626-2655, 2011. Available at: <http://www.sciencedirect.com/science/article/pii/S002002551000215X>. Accessed: 22 Jan. 2015.
0020-0255
http://www.repositorio.ufop.br/handle/123456789/4375
https://doi.org/10.1016/j.ins.2010.05.012
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFOP
instname:Universidade Federal de Ouro Preto (UFOP)
instacron:UFOP
instname_str Universidade Federal de Ouro Preto (UFOP)
instacron_str UFOP
institution UFOP
reponame_str Repositório Institucional da UFOP
collection Repositório Institucional da UFOP
repository.name.fl_str_mv Repositório Institucional da UFOP - Universidade Federal de Ouro Preto (UFOP)
repository.mail.fl_str_mv repositorio@ufop.edu.br
_version_ 1813002832601677824