Extending the Relational Algebra with the Mapper Operator

Bibliographic details
Main author: Carreira, Paulo J.F.
Publication date: 2005
Other authors: Lopes, Antónia; Galhardas, Helena; Pereira, João
Document type: Report
Language: por
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10451/14193
Abstract: Application scenarios such as legacy data migration, Extract-Transform-Load (ETL) processes, and data cleaning require the transformation of input tuples into output tuples. Traditional approaches implement these data transformations either as Persistent Stored Modules (PSM) executed by an RDBMS or as transformation code written with a commercial ETL tool. Neither of these is easily maintainable or optimizable. A third approach consists of combining SQL queries with external code written in a programming language. However, this solution is not expressive enough to specify an important class of data transformations: those that produce several output tuples for a single input tuple. In this paper, we propose the data mapper operator as an extension to the relational algebra to address this class of data transformations. Furthermore, we supply a set of algebraic rewriting rules for optimizing expressions that combine standard relational operators with mappers. Finally, experimental results report the benefits brought by some of the proposed semantic optimizations.
id RCAP_e414c3443f39519dafb88d2390870e37
oai_identifier_str oai:repositorio.ul.pt:10451/14193
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
author Carreira, Paulo J.F.
author_facet Carreira, Paulo J.F.
Lopes, Antónia
Galhardas, Helena
Pereira, João
author_role author
author2 Lopes, Antónia
Galhardas, Helena
Pereira, João
author2_role author
author
author
dc.contributor.none.fl_str_mv Repositório da Universidade de Lisboa
dc.contributor.author.fl_str_mv Carreira, Paulo J.F.
Lopes, Antónia
Galhardas, Helena
Pereira, João
publishDate 2005
dc.date.none.fl_str_mv 2005-01
2005-01-01T00:00:00Z
2009-02-10T13:11:39Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/report
format report
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10451/14193
url http://hdl.handle.net/10451/14193
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Department of Informatics, University of Lisbon
publisher.none.fl_str_mv Department of Informatics, University of Lisbon
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799134258619482113