3iCubing: An Interval Inverted Index Approach to Data Cubes
Main author: | Domingues, Marco |
---|---|
Publication date: | 2022 |
Other authors: | Silva, Rodrigo Rocha; Bernardino, Jorge |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10316/100598 https://doi.org/10.1109/ACCESS.2022.3142449 |
Abstract: | The increase in the amounts of information used to analyze data is problematic since the memory necessary to store and process it is getting quite big. The interval inverted index representation was developed to reduce the required memory to store data, and Frag-Cubing is one of the most popular algorithms. In this paper, we propose two new data cubing algorithms: 3iCubing and M3iCubing. 3iCubing is a Frag-Cubing-based algorithm that uses the interval inverted index representation, while M3iCubing uses both a normal and interval inverted index data representation. The algorithms were compared using synthetic and real data sets in indexation and querying operations, both runtime and memory-wise. The experimental evaluation shows that 3iCubing can considerably reduce the memory needed to index a data set, reducing around 25% of the memory used by Frag-Cubing. Moreover, the results show that the interval inverted index representation is dependent on the data skewness to reduce the memory consumption, having positive results with highly skewed and real-world data sets. |
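
The abstract names the interval inverted index but does not describe its layout. As a rough illustration only, the sketch below assumes that a plain inverted index maps each (dimension, value) pair to a sorted list of tuple IDs, and that the interval variant stores each run of consecutive IDs as one inclusive (start, end) pair. All function names and the sample rows are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_inverted_index(rows):
    """Map (dimension position, value) -> sorted list of tuple IDs (TIDs)."""
    index = defaultdict(list)
    for tid, row in enumerate(rows):
        for dim, value in enumerate(row):
            index[(dim, value)].append(tid)
    return dict(index)


def to_intervals(tids):
    """Collapse a sorted TID list into inclusive (start, end) runs."""
    intervals = []
    for tid in tids:
        if intervals and tid == intervals[-1][1] + 1:
            intervals[-1][1] = tid          # extend the current run
        else:
            intervals.append([tid, tid])    # start a new run
    return [tuple(iv) for iv in intervals]


def intersect_intervals(a, b):
    """Intersect two sorted, disjoint interval lists (a two-dimension point query)."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        # advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def count_tids(intervals):
    """Number of tuples covered by an interval list (a COUNT measure)."""
    return sum(hi - lo + 1 for lo, hi in intervals)


# Hypothetical two-dimensional fact table, one tuple per row.
rows = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "y"), ("a", "y"), ("a", "x")]
index = build_inverted_index(rows)
interval_index = {key: to_intervals(tids) for key, tids in index.items()}

# Point query: first dimension = "a" AND second dimension = "x".
hits = intersect_intervals(interval_index[(0, "a")], interval_index[(1, "x")])
print(hits, count_tids(hits))
```

Under this assumption, a value whose tuple IDs form long consecutive runs needs only a few pairs, while a value scattered across the table needs one pair per ID, which is two integers where the plain list needs one. That is consistent with the abstract's remark that the memory savings depend on data skewness.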
id | RCAP_c99366c5a0eee2d9215d8fc2d7791b83 |
---|---|
oai_identifier_str | oai:estudogeral.uc.pt:10316/100598 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str | 7160 |
dc.title.none.fl_str_mv | 3iCubing: An Interval Inverted Index Approach to Data Cubes |
title | 3iCubing: An Interval Inverted Index Approach to Data Cubes |
author | Domingues, Marco |
author_facet | Domingues, Marco; Silva, Rodrigo Rocha; Bernardino, Jorge |
author_role | author |
author2 | Silva, Rodrigo Rocha; Bernardino, Jorge |
author2_role | author; author |
dc.contributor.author.fl_str_mv | Domingues, Marco; Silva, Rodrigo Rocha; Bernardino, Jorge |
dc.subject.por.fl_str_mv | Big data; data cube; inverted index; OLAP |
topic | Big data; data cube; inverted index; OLAP |
publishDate | 2022 |
dc.date.none.fl_str_mv | 2022 |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://hdl.handle.net/10316/100598 https://doi.org/10.1109/ACCESS.2022.3142449 |
url | http://hdl.handle.net/10316/100598 https://doi.org/10.1109/ACCESS.2022.3142449 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | 2169-3536 |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.source.none.fl_str_mv | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
institution | RCAAP |
reponame_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ | 1799134074951958528 |