Exploring data lakehouse as data infrastructure for ambient assisted living

Bibliographic details
Main author: Cunha, Diogo Guilherme Rocha
Publication date: 2023
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10773/41714
Abstract: Over the past decade, a data explosion has generated 30,000 gigabytes of data every second. Within this data-rich landscape, new data infrastructures such as data lakes and, notably, data lakehouses have emerged. The data lakehouse combines the agility of data lakes with the structured querying capabilities of data warehouses. One of our primary objectives is to conduct a comparative analysis and gain a deeper understanding of the distinctions between these concepts (data warehouse, data lake, and data lakehouse). Data lakehouse solutions offer a promising, technology-agnostic approach to handling data from gathering to information extraction and visualization. One relevant context today is Ambient Assisted Living (AAL) systems, which are increasingly essential due to aging populations. AAL environments generate vast amounts of data from various sources, making traditional data management systems inadequate. This dissertation explores implementing a data lakehouse architecture to address the technical and privacy concerns associated with integrating sensor data for context-dependent AAL objectives. As a proof-of-concept scenario, we used smart mirrors, a challenging monitoring solution with potential privacy and resource issues, since it involves real-time video processing to extract health-related measures. The deployed system illustrates the data lakehouse’s ability to cover the scenario requirements while following typical data lakehouse architecture blueprints and patterns using open-source solutions. Although only a proof of concept, it provided caregivers with tools for informed decision-making through user-friendly dashboards. The system development process also allowed us to highlight some issues and concerns that must be taken into consideration when applying data lakehouse solutions to an AAL-like scenario.
id RCAP_c808bbb56daa596e02aa844d2c1591e1
oai_identifier_str oai:ria.ua.pt:10773/41714
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Exploring data lakehouse as data infrastructure for ambient assisted living
title Exploring data lakehouse as data infrastructure for ambient assisted living
spellingShingle Exploring data lakehouse as data infrastructure for ambient assisted living
Cunha, Diogo Guilherme Rocha
Data lake
Data lakehouse
Smart mirror
Smart home
Ambient assisted living
title_short Exploring data lakehouse as data infrastructure for ambient assisted living
title_full Exploring data lakehouse as data infrastructure for ambient assisted living
title_fullStr Exploring data lakehouse as data infrastructure for ambient assisted living
title_full_unstemmed Exploring data lakehouse as data infrastructure for ambient assisted living
title_sort Exploring data lakehouse as data infrastructure for ambient assisted living
author Cunha, Diogo Guilherme Rocha
author_facet Cunha, Diogo Guilherme Rocha
author_role author
dc.contributor.author.fl_str_mv Cunha, Diogo Guilherme Rocha
dc.subject.por.fl_str_mv Data lake
Data lakehouse
Smart mirror
Smart home
Ambient assisted living
topic Data lake
Data lakehouse
Smart mirror
Smart home
Ambient assisted living
publishDate 2023
dc.date.none.fl_str_mv 2023-11-29T00:00:00Z
2023-11-29
2024-04-26T09:24:18Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10773/41714
url http://hdl.handle.net/10773/41714
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv mluisa.alvim@gmail.com
_version_ 1817543904755449856