3iCubing: An Interval Inverted Index Approach to Data Cubes

Bibliographic details
Main author: Domingues, Marco
Publication date: 2022
Other authors: Silva, Rodrigo Rocha; Bernardino, Jorge
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: http://hdl.handle.net/10316/100598
https://doi.org/10.1109/ACCESS.2022.3142449
Abstract: The growing volume of data used for analysis is problematic because the memory needed to store and process it keeps increasing. The interval inverted index representation was developed to reduce the memory required to store data, and Frag-Cubing is one of the most popular data cubing algorithms. In this paper, we propose two new data cubing algorithms: 3iCubing and M3iCubing. 3iCubing is a Frag-Cubing-based algorithm that uses the interval inverted index representation, while M3iCubing uses both a normal and an interval inverted index representation. The algorithms were compared on synthetic and real data sets in indexing and querying operations, in terms of both runtime and memory. The experimental evaluation shows that 3iCubing can considerably reduce the memory needed to index a data set, using around 25% less memory than Frag-Cubing. Moreover, the results show that the memory savings of the interval inverted index representation depend on data skewness, with positive results on highly skewed and real-world data sets.
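The abstract does not spell out the data structure, so the following is only a minimal sketch of the general idea, assuming that an interval inverted index replaces each value's list of tuple IDs with (start, end) runs of consecutive IDs. All function names are hypothetical and are not taken from the paper or from Frag-Cubing.

```python
from typing import Dict, List, Tuple


def build_inverted_index(column: List[str]) -> Dict[str, List[int]]:
    """Plain inverted index: attribute value -> list of tuple IDs."""
    index: Dict[str, List[int]] = {}
    for tid, value in enumerate(column):
        index.setdefault(value, []).append(tid)
    return index


def to_intervals(tids: List[int]) -> List[Tuple[int, int]]:
    """Collapse a sorted list of tuple IDs into inclusive (start, end) runs."""
    intervals: List[Tuple[int, int]] = []
    for tid in tids:
        if intervals and tid == intervals[-1][1] + 1:
            # Extend the current run by one tuple ID.
            intervals[-1] = (intervals[-1][0], tid)
        else:
            # Start a new run.
            intervals.append((tid, tid))
    return intervals


def build_interval_index(column: List[str]) -> Dict[str, List[Tuple[int, int]]]:
    """Interval inverted index (assumed form): value -> list of ID runs."""
    return {v: to_intervals(t) for v, t in build_inverted_index(column).items()}


def plain_size(index: Dict[str, List[int]]) -> int:
    """Number of integers stored by the plain index."""
    return sum(len(tids) for tids in index.values())


def interval_size(index: Dict[str, List[Tuple[int, int]]]) -> int:
    """Number of integers stored by the interval index (two per run)."""
    return sum(2 * len(runs) for runs in index.values())


if __name__ == "__main__":
    skewed = ["a"] * 6 + ["b"] + ["a"] * 3   # long runs of one value
    alternating = ["a", "b", "a", "b"]        # no runs at all
    for name, col in (("skewed", skewed), ("alternating", alternating)):
        plain = plain_size(build_inverted_index(col))
        interval = interval_size(build_interval_index(col))
        print(name, "plain:", plain, "interval:", interval)
```

In this toy setting the skewed column needs 6 integers as intervals against 10 as plain lists, while the alternating column needs 8 against 4. This is consistent with the abstract's remark that the savings depend on the data, and it may be why M3iCubing combines both representations.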
id RCAP_c99366c5a0eee2d9215d8fc2d7791b83
oai_identifier_str oai:estudogeral.uc.pt:10316/100598
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str 7160
topic Big data
data cube
inverted index
OLAP
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.relation.none.fl_str_mv 2169-3536
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
_version_ 1799134074951958528