Explicitly Involving the User in a Data Cleaning Process

Bibliographic details
Main author: Galhardas, Helena
Publication date: 2010
Other authors: Lopes, Antónia; Santos, Emanuel
Document type: Report
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10451/14171
Abstract: Reviewed by Mário Silva
id RCAP_b9107ed88280070c85195a6360146666
oai_identifier_str oai:repositorio.ul.pt:10451/14171
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Explicitly Involving the User in a Data Cleaning Process
Support for User Involvement in Data Cleaning Applications (original title)
Data Cleaning
User feedback
Data Transformation
Reviewed by Mário Silva
Data cleaning and Extract-Transform-Load processes are usually modeled as graphs of data transformations. These graphs typically involve a large number of data transformations, and must handle large amounts of data. The involvement of the users responsible for executing the corresponding programs over real data is important to tune data transformations and to manually correct data items that cannot be treated automatically. In this paper, we extend the notion of data cleaning graph in order to better support the user involvement in data cleaning processes. We propose that data cleaning graphs include: (i) data quality constraints to help users to identify the points of the graph and the records that need their attention and (ii) manual data repairs for representing the way users can provide the feedback required to manually clean some data items. We provide preliminary experimental results that show, for a real-world data cleaning process, the significant gains obtained with our approach in terms of the quality of the data produced and the cost incurred by users in data visualization and updating tasks.
Repositório da Universidade de Lisboa
Galhardas, Helena
Lopes, Antónia
Santos, Emanuel
2010-07-23T16:37:01Z
2010-07-23T16:37:01Z
2010-07-23
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/report
application/pdf
http://hdl.handle.net/10451/14171
eng
info:eu-repo/semantics/openAccess
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
2023-11-08T15:59:49Z
oai:repositorio.ul.pt:10451/14171
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
opendoar:7160
2024-03-19T21:36:00.625742
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
false
dc.title.none.fl_str_mv Explicitly Involving the User in a Data Cleaning Process
Support for User Involvement in Data Cleaning Applications (original title)
title Explicitly Involving the User in a Data Cleaning Process
spellingShingle Explicitly Involving the User in a Data Cleaning Process
Galhardas, Helena
Data Cleaning
User feedback
Data Transformation
author Galhardas, Helena
author_facet Galhardas, Helena
Lopes, Antónia
Santos, Emanuel
author_role author
author2 Lopes, Antónia
Santos, Emanuel
author2_role author
author
dc.contributor.none.fl_str_mv Repositório da Universidade de Lisboa
dc.contributor.author.fl_str_mv Galhardas, Helena
Lopes, Antónia
Santos, Emanuel
dc.subject.por.fl_str_mv Data Cleaning
User feedback
Data Transformation
topic Data Cleaning
User feedback
Data Transformation
description Reviewed by Mário Silva
publishDate 2010
dc.date.none.fl_str_mv 2010-07-23T16:37:01Z
2010-07-23T16:37:01Z
2010-07-23
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/report
format report
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10451/14171
url http://hdl.handle.net/10451/14171
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799134258592219136