Explicitly Involving the User in a Data Cleaning Process
Main author: | Galhardas, Helena |
---|---|
Publication date: | 2010 |
Other authors: | Lopes, Antónia; Santos, Emanuel |
Document type: | Report |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10451/14171 |
Abstract: | Reviewed by Mário Silva |
id |
RCAP_b9107ed88280070c85195a6360146666 |
---|---|
oai_identifier_str |
oai:repositorio.ul.pt:10451/14171 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Explicitly Involving the User in a Data Cleaning Process; Support for User Involvement in Data Cleaning Applications (original title); Data Cleaning; User feedback; Data Transformation; Reviewed by Mário Silva
Data cleaning and Extract-Transform-Load processes are usually modeled as graphs of data transformations. These graphs typically involve a large number of data transformations and must handle large amounts of data. The involvement of the users responsible for executing the corresponding programs over real data is important for tuning data transformations and for manually correcting data items that cannot be treated automatically. In this paper, we extend the notion of a data cleaning graph to better support user involvement in data cleaning processes. We propose that data cleaning graphs include: (i) data quality constraints, which help users identify the points of the graph and the records that need their attention, and (ii) manual data repairs, which represent the way users can provide the feedback required to manually clean some data items. We provide preliminary experimental results that show, for a real-world data cleaning process, the significant gains obtained with our approach in terms of the quality of the data produced and the cost incurred by users in data visualization and updating tasks.
Repositório da Universidade de Lisboa; Galhardas, Helena; Lopes, Antónia; Santos, Emanuel; 2010-07-23T16:37:01Z; 2010-07-23T16:37:01Z; 2010-07-23; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/report; application/pdf; http://hdl.handle.net/10451/14171; eng; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2023-11-08T15:59:49Z; oai:repositorio.ul.pt:10451/14171; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T21:36:00.625742; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Explicitly Involving the User in a Data Cleaning Process; Support for User Involvement in Data Cleaning Applications (original title) |
title |
Explicitly Involving the User in a Data Cleaning Process |
spellingShingle |
Explicitly Involving the User in a Data Cleaning Process; Galhardas, Helena; Data Cleaning; User feedback; Data Transformation |
title_short |
Explicitly Involving the User in a Data Cleaning Process |
title_full |
Explicitly Involving the User in a Data Cleaning Process |
title_fullStr |
Explicitly Involving the User in a Data Cleaning Process |
title_full_unstemmed |
Explicitly Involving the User in a Data Cleaning Process |
title_sort |
Explicitly Involving the User in a Data Cleaning Process |
author |
Galhardas, Helena |
author_facet |
Galhardas, Helena; Lopes, Antónia; Santos, Emanuel |
author_role |
author |
author2 |
Lopes, Antónia; Santos, Emanuel |
author2_role |
author; author |
dc.contributor.none.fl_str_mv |
Repositório da Universidade de Lisboa |
dc.contributor.author.fl_str_mv |
Galhardas, Helena; Lopes, Antónia; Santos, Emanuel |
dc.subject.por.fl_str_mv |
Data Cleaning; User feedback; Data Transformation |
topic |
Data Cleaning; User feedback; Data Transformation |
description |
Reviewed by Mário Silva |
publishDate |
2010 |
dc.date.none.fl_str_mv |
2010-07-23T16:37:01Z; 2010-07-23T16:37:01Z; 2010-07-23 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/report |
format |
report |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10451/14171 |
url |
http://hdl.handle.net/10451/14171 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799134258592219136 |