Flexible Time Series Matching for Clinical and Behavioral Data

Bibliographic details
Main author: Matias, Pedro António Correia
Publication date: 2021
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: http://hdl.handle.net/10362/157934
Abstract: Time series data have become widely used by the research community in recent decades, following a massive increase in their availability. This growth, however, demands better analysis techniques, which in the medical domain would help specialists evaluate their patients' condition. One of the key tasks in time series analysis is pattern recognition (segmentation and classification). Traditional methods typically perform subsequence matching, using a pattern template and a similarity metric to search for similar sequences throughout a time series. However, real-world data is noisy and variable (morphological distortions), which makes exact template-based matching a rudimentary approach. To increase flexibility and generalize pattern-searching tasks across domains, this dissertation proposes two Deep Learning-based frameworks for pattern segmentation and anomaly detection. For pattern segmentation, a Convolution/Deconvolution Neural Network is proposed that learns to distinguish, point by point, desired sub-patterns from background content within a time series. The framework was validated in two use cases: electrocardiogram (ECG) signals and inertial sensor-based human activity (IMU) signals. It outperformed two conventional matching techniques, detecting the targeted cycles even in noise-corrupted or extremely distorted signals, without using any reference template or hand-coded similarity scores. For anomaly detection, the proposed unsupervised framework uses the reconstruction ability of Variational Autoencoders and a local similarity score to identify unlabeled abnormalities. The proposal was validated on two public ECG datasets (MIT-BIH Arrhythmia and ECG5000), performing cardiac arrhythmia identification. Results indicated competitiveness relative to recent techniques, achieving detection AUC scores of 98.84% (ECG5000) and 93.32% (MIT-BIH Arrhythmia).
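For orientation, the traditional baseline mentioned in the abstract (a pattern template plus a similarity metric, slid along the series) can be sketched as follows. This is only an illustrative sketch, not the dissertation's method or the specific baselines it compares against: the z-normalized Euclidean distance, the threshold value, and the function names `znorm`, `distance_profile`, and `find_matches` are all assumptions introduced here.

```python
import math


def znorm(xs):
    """Scale xs to zero mean and unit variance (all zeros if xs is flat)."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in xs]


def distance_profile(series, template):
    """Z-normalized Euclidean distance from the template to every window."""
    m = len(template)
    t = znorm(template)
    profile = []
    for start in range(len(series) - m + 1):
        w = znorm(series[start:start + m])
        profile.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(w, t))))
    return profile


def find_matches(series, template, threshold):
    """Start indices of non-overlapping windows within the threshold.

    Candidates are taken greedily, best (smallest distance) first.
    """
    m = len(template)
    profile = distance_profile(series, template)
    candidates = sorted((d, i) for i, d in enumerate(profile) if d <= threshold)
    chosen = []
    for _, i in candidates:
        if all(abs(i - j) >= m for j in chosen):
            chosen.append(i)
    return sorted(chosen)


# Hypothetical toy data: one amplitude-scaled copy and one exact copy
# of the template, separated by flat background.
template = [0.0, 1.0, 3.0, 1.0, 0.0]
series = (
    [0.0] * 4
    + [0.0, 2.0, 6.0, 2.0, 0.0]
    + [0.0] * 4
    + [0.0, 1.0, 3.0, 1.0, 0.0]
    + [0.0] * 3
)
matches = find_matches(series, template, threshold=0.5)
```

Z-normalization makes the match insensitive to amplitude and offset, so the scaled copy is still found. A stretched or otherwise morphologically distorted copy would not be, which is the kind of rigidity the abstract attributes to template-based matching.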
id RCAP_8c8808f1d04a993b9ee5ffe445e41b5d
oai_identifier_str oai:run.unl.pt:10362/157934
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str 7160
dc.title.none.fl_str_mv Flexible Time Series Matching for Clinical and Behavioral Data
title Flexible Time Series Matching for Clinical and Behavioral Data
author Matias, Pedro António Correia
author_facet Matias, Pedro António Correia
author_role author
dc.contributor.none.fl_str_mv Gamboa, Hugo
Carreiro, André
RUN
dc.contributor.author.fl_str_mv Matias, Pedro António Correia
dc.subject.por.fl_str_mv Time Series
Pattern Segmentation
Anomaly Detection
Deep Learning
Domínio/Área Científica::Engenharia e Tecnologia::Outras Engenharias e Tecnologias
topic Time Series
Pattern Segmentation
Anomaly Detection
Deep Learning
Domínio/Área Científica::Engenharia e Tecnologia::Outras Engenharias e Tecnologias
publishDate 2021
dc.date.none.fl_str_mv 2021-02
2021-02-01T00:00:00Z
2023-09-18T13:36:34Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/157934
url http://hdl.handle.net/10362/157934
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799138152967831552