Data quality assessment in healthcare data: A case study

Bibliographic details
Main author: Monteiro, Stephanie Cardoso
Publication date: 2023
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10071/31061
Abstract: Reliable data is essential for monitoring and evaluating business activities. In critical domains such as healthcare, data quality has a crucial impact on delivering faster and more accurate services. In epidemiological scenarios such as the COVID-19 pandemic, data can play an essential role in supporting the social responses carried out by the primary decision-makers. To that end, sharing data and having an integrated view of it make it possible to identify the best approaches and the critical signals that could lead to better treatments and diagnoses. Nevertheless, dealing with data extraction from several sources is not an easy task and can raise considerable challenges related to data accessibility, representation, and interpretation. Several data quality problems can occur and, when not adequately addressed, can call the decision-making support into question. The contribution of this thesis is a data quality assessment of a subset of data from a Portuguese hospital, used in the context of integration into a common shared repository within the scope of a European project. A deep data profiling analysis of the source database was conducted, identifying its main characteristics and, afterwards, its main issues. Each issue was then mapped to the data quality dimension it violates, and rules were defined as guidelines to address these issues and prevent future ones. To classify the quality of the source data, a methodology was proposed that evaluates the data at two levels, the quality rules level and the data quality dimensions level, and then calculates the data quality score. The final results are discussed and evaluated in this work.
id RCAP_569bcb62806b48b8533cdf905520040c
oai_identifier_str oai:repositorio.iscte-iul.pt:10071/31061
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
topic Healthcare data
Data quality
Data integration and data quality assessment
Dados de saúde
Qualidade de dados
Integração de dados e avaliação da qualidade dos dados
publishDate 2023
dc.date.none.fl_str_mv 2023-12-05T00:00:00Z
2023-12-05
2023-12
2024-02-16T13:08:46Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10071/31061
TID:203493095
url http://hdl.handle.net/10071/31061
identifier_str_mv TID:203493095
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv mluisa.alvim@gmail.com
_version_ 1817546424650301440