Logical big data integration and near real-time data analytics
Main author: | Silva, Bruno |
---|---|
Publication date: | 2023 |
Other authors: | Moreira, José; de C. Costa, Rogério Luís |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10773/37907 |
Abstract: | In the context of decision-making, there is a growing demand for near real-time data that traditional solutions, like data warehousing based on long-running ETL processes, cannot fully meet. On the other hand, existing logical data integration solutions are challenging because users must focus on data location and distribution details rather than on data analytics and decision-making. EasyBDI is an open-source system that provides logical integration of data and high-level business-oriented abstractions. It uses schema matching, integration, and mapping techniques to automatically identify partitioned data and propose a global schema. Users can then specify star schemas based on global entities and submit analytical queries to retrieve data from distributed data sources without knowing the organization and other technical details of the underlying systems. This work presents the algorithms and methods for global schema creation and query execution. Experimental results show that the overhead imposed by logical integration layers is relatively small compared to the execution times of distributed queries. |
id |
RCAP_27c9d3cadb5999a231af21b1f6b44823 |
---|---|
oai_identifier_str |
oai:ria.ua.pt:10773/37907 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Logical big data integration and near real-time data analytics; Big data integration; Distributed databases; Near real-time OLAP; In the context of decision-making, there is a growing demand for near real-time data that traditional solutions, like data warehousing based on long-running ETL processes, cannot fully meet. On the other hand, existing logical data integration solutions are challenging because users must focus on data location and distribution details rather than on data analytics and decision-making. EasyBDI is an open-source system that provides logical integration of data and high-level business-oriented abstractions. It uses schema matching, integration, and mapping techniques to automatically identify partitioned data and propose a global schema. Users can then specify star schemas based on global entities and submit analytical queries to retrieve data from distributed data sources without knowing the organization and other technical details of the underlying systems. This work presents the algorithms and methods for global schema creation and query execution. Experimental results show that the overhead imposed by logical integration layers is relatively small compared to the execution times of distributed queries.; Elsevier; 2025-05-12T00:00:00Z; 2023-05-12T00:00:00Z; 2023-05-12; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; application/pdf; http://hdl.handle.net/10773/37907; eng; 0169-023X; 10.1016/j.datak.2023.102185; Silva, Bruno; Moreira, José; de C. Costa, Rogério Luís; info:eu-repo/semantics/embargoedAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2024-05-06T04:46:29Z; oai:ria.ua.pt:10773/37907; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; mluisa.alvim@gmail.com; opendoar:7160; 2024-05-06T04:46:29; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Logical big data integration and near real-time data analytics |
title |
Logical big data integration and near real-time data analytics |
spellingShingle |
Logical big data integration and near real-time data analytics; Silva, Bruno; Big data integration; Distributed databases; Near real-time OLAP |
title_short |
Logical big data integration and near real-time data analytics |
title_full |
Logical big data integration and near real-time data analytics |
title_fullStr |
Logical big data integration and near real-time data analytics |
title_full_unstemmed |
Logical big data integration and near real-time data analytics |
title_sort |
Logical big data integration and near real-time data analytics |
author |
Silva, Bruno |
author_facet |
Silva, Bruno; Moreira, José; de C. Costa, Rogério Luís |
author_role |
author |
author2 |
Moreira, José; de C. Costa, Rogério Luís |
author2_role |
author; author |
dc.contributor.author.fl_str_mv |
Silva, Bruno; Moreira, José; de C. Costa, Rogério Luís |
dc.subject.por.fl_str_mv |
Big data integration; Distributed databases; Near real-time OLAP |
topic |
Big data integration; Distributed databases; Near real-time OLAP |
description |
In the context of decision-making, there is a growing demand for near real-time data that traditional solutions, like data warehousing based on long-running ETL processes, cannot fully meet. On the other hand, existing logical data integration solutions are challenging because users must focus on data location and distribution details rather than on data analytics and decision-making. EasyBDI is an open-source system that provides logical integration of data and high-level business-oriented abstractions. It uses schema matching, integration, and mapping techniques to automatically identify partitioned data and propose a global schema. Users can then specify star schemas based on global entities and submit analytical queries to retrieve data from distributed data sources without knowing the organization and other technical details of the underlying systems. This work presents the algorithms and methods for global schema creation and query execution. Experimental results show that the overhead imposed by logical integration layers is relatively small compared to the execution times of distributed queries. |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-05-12T00:00:00Z; 2023-05-12; 2025-05-12T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10773/37907 |
url |
http://hdl.handle.net/10773/37907 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
0169-023X; 10.1016/j.datak.2023.102185 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/embargoedAccess |
eu_rights_str_mv |
embargoedAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Elsevier |
publisher.none.fl_str_mv |
Elsevier |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
mluisa.alvim@gmail.com |
_version_ |
1817543860394393600 |