Imputação de dados faltosos em séries Temporais Univariadas utilizando meta-aprendizado baseado em Rede Neural LSTM Híbrida

Bibliographic details
Main author: ALMEIDA, Mauricio Morais
Publication date: 2023
Document type: Dissertation (master's thesis)
Language: Portuguese (por)
Source title: Biblioteca Digital de Teses e Dissertações da UFMA
Full text: https://tedebc.ufma.br/jspui/handle/tede/tede/4710
Abstract: Time series are data collected at regular intervals, describing the average of an event over time. For this reason, among others, time series have been gaining importance in areas such as business, natural, and medical applications. One of the main challenges with time series is data loss, and several approaches exist for recovering lost values by imputation in univariate time series. To contribute to this field, this study proposes a new method for imputing missing values based on meta-learning. Initially, ten classical techniques were selected to impute time series data, and a metadata set was built from their errors, with each series labeled with one of ten classes according to the technique that produced the lowest error. In addition to these ten techniques, a new imputation technique was proposed that uses the Pix2Pix GAN to impute from images of time series. Furthermore, a new network architecture called HybridLSTM was proposed to recommend, from the labeled metadata, the best imputation technique for a given series. The HybridLSTM network suggested the best imputation techniques based on the characteristics of the series, outperforming classical techniques such as linear interpolation and Akima interpolation in several instances. The proposed imputation technique was evaluated on nine different datasets and achieved an average ASMAPE of 9.51%, with a maximum of 22.75% and a minimum of 3.73%. The results also indicate that imputing data through windowing, applying various techniques to small slices of a time series, is a promising approach, opening space for further research such as imputing missing data in time series through images and GANs.
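The labeling step described in the abstract (impute each window with several techniques, then label it with the technique that gives the lowest error) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: it uses three toy techniques rather than the ten in the dissertation, a plain symmetric MAPE as a stand-in for ASMAPE (whose exact definition is not given in this record), and hypothetical function names. It does not cover the HybridLSTM recommender or the Pix2Pix imputer.

```python
import numpy as np

def smape(y_true, y_pred):
    """Symmetric MAPE in percent. Stand-in for ASMAPE (assumption)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = np.abs(y_true) + np.abs(y_pred)
    safe = np.where(denom == 0, 1.0, denom)  # both values zero -> error 0
    return 100.0 * np.mean(2.0 * np.abs(y_true - y_pred) / safe)

def impute_linear(x):
    """Fill NaNs by linear interpolation between observed points."""
    x = np.asarray(x, dtype=float).copy()
    idx = np.arange(len(x))
    miss = np.isnan(x)
    x[miss] = np.interp(idx[miss], idx[~miss], x[~miss])
    return x

def impute_mean(x):
    """Fill NaNs with the mean of the observed values."""
    x = np.asarray(x, dtype=float).copy()
    x[np.isnan(x)] = np.nanmean(x)
    return x

def impute_locf(x):
    """Fill NaNs with the last observed value (first observed for leading gaps)."""
    x = np.asarray(x, dtype=float).copy()
    observed = ~np.isnan(x)
    last = np.where(observed, np.arange(len(x)), 0)
    last = np.maximum.accumulate(last)  # index of most recent observation
    filled = x[last]
    filled[np.isnan(filled)] = x[observed][0]
    return filled

# Hypothetical candidate set; the dissertation uses ten classical techniques.
TECHNIQUES = {"linear": impute_linear, "mean": impute_mean, "locf": impute_locf}

def label_window(window, mask):
    """Hide the masked points, impute with each technique, return (best, errors)."""
    corrupted = window.copy()
    corrupted[mask] = np.nan
    errors = {name: smape(window[mask], fn(corrupted)[mask])
              for name, fn in TECHNIQUES.items()}
    return min(errors, key=errors.get), errors

def build_metadata(series, window_size, missing_rate, rng):
    """Slice a complete series into windows and label each with its best technique."""
    series = np.asarray(series, dtype=float)
    rows = []
    for start in range(0, len(series) - window_size + 1, window_size):
        window = series[start:start + window_size]
        mask = rng.random(window_size) < missing_rate
        mask[0] = mask[-1] = False        # keep both endpoints observed
        if not mask.any():
            mask[window_size // 2] = True  # guarantee at least one gap
        label, errors = label_window(window, mask)
        rows.append((start, label, errors))
    return rows
```

Each returned row pairs a window's start index with its winning technique and the per-technique errors. In a pipeline of the kind the abstract describes, such labels would serve as training targets for the recommender network.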
id UFMA_517fb9dc3d7ed268cd4e2aac7f78adf7
oai_identifier_str oai:tede2:tede/4710
network_acronym_str UFMA
network_name_str Biblioteca Digital de Teses e Dissertações da UFMA
repository_id_str 2131
spelling Submitted by Jonathan Sousa de Almeida (jonathan.sousa@ufma.br) on 2023-05-23T12:19:13Z. No. of bitstreams: 1. MAURICIOMORAISALMEIDA.pdf: 2620688 bytes, checksum: 3e37f0afed42725d098f3aa8317effdf (MD5). Made available in DSpace on 2023-05-23T12:19:13Z (GMT). Previous issue date: 2023-05-05. Funding agency: CAPES. license.txt: text/plain; charset=utf-8, 2255 bytes, checksum: 97eeade1fce43278e63fe063657f8083 (MD5).
dc.title.por.fl_str_mv Imputação de dados faltosos em séries Temporais Univariadas utilizando meta-aprendizado baseado em Rede Neural LSTM Híbrida
dc.title.alternative.eng.fl_str_mv Imputation of Missing Data in Univariate Time Series using Meta-Learning based on Hybrid LSTM Neural Network
title Imputação de dados faltosos em séries Temporais Univariadas utilizando meta-aprendizado baseado em Rede Neural LSTM Híbrida
author ALMEIDA, Mauricio Morais
author_facet ALMEIDA, Mauricio Morais
author_role author
dc.contributor.advisor1.fl_str_mv ALMEIDA, João Dallyson Sousa de
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/6047330108382641
dc.contributor.advisor-co1.fl_str_mv QUINTANILHA, Darlan Bruno Pontes
dc.contributor.advisor-co1Lattes.fl_str_mv http://lattes.cnpq.br/4222253532775153
dc.contributor.referee1.fl_str_mv ALMEIDA, João Dallyson Sousa de
dc.contributor.referee1Lattes.fl_str_mv http://lattes.cnpq.br/6047330108382641
dc.contributor.referee2.fl_str_mv QUINTANILHA, Darlan Bruno Pontes
dc.contributor.referee2Lattes.fl_str_mv http://lattes.cnpq.br/4222253532775153
dc.contributor.referee3.fl_str_mv DINIZ, João Otávio Bandeira
dc.contributor.referee3Lattes.fl_str_mv http://lattes.cnpq.br/6165165599787140
dc.contributor.referee4.fl_str_mv SERRA, Ginalber Luiz de Oliveira
dc.contributor.referee4Lattes.fl_str_mv http://lattes.cnpq.br/0831092299374520
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/5687872061960566
dc.contributor.author.fl_str_mv ALMEIDA, Mauricio Morais
contributor_str_mv ALMEIDA, João Dallyson Sousa de
QUINTANILHA, Darlan Bruno Pontes
ALMEIDA, João Dallyson Sousa de
QUINTANILHA, Darlan Bruno Pontes
DINIZ, João Otávio Bandeira
SERRA, Ginalber Luiz de Oliveira
dc.subject.por.fl_str_mv Séries Temporais;
imputação de dados;
meta-aprendizado;
Pix2Pix;
HybridLSTM.
dc.subject.eng.fl_str_mv Time series;
Convolutional Neural Networks;
Time Series Image;
Meta-Learning;
Imputation.
dc.subject.cnpq.fl_str_mv Ciência da Computação
publishDate 2023
dc.date.accessioned.fl_str_mv 2023-05-23T12:19:13Z
dc.date.issued.fl_str_mv 2023-05-05
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv ALMEIDA, Mauricio Morais. Imputação de dados faltosos em séries Temporais Univariadas utilizando meta-aprendizado baseado em Rede Neural LSTM Híbrida. 2023. 96 f. Dissertação (Programa de Pós-Graduação em Ciência da Computação/CCET) - Universidade Federal do Maranhão, São Luís, 2023.
dc.identifier.uri.fl_str_mv https://tedebc.ufma.br/jspui/handle/tede/tede/4710
identifier_str_mv ALMEIDA, Mauricio Morais. Imputação de dados faltosos em séries Temporais Univariadas utilizando meta-aprendizado baseado em Rede Neural LSTM Híbrida. 2023. 96 f. Dissertação (Programa de Pós-Graduação em Ciência da Computação/CCET) - Universidade Federal do Maranhão, São Luís, 2023.
url https://tedebc.ufma.br/jspui/handle/tede/tede/4710
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal do Maranhão
dc.publisher.program.fl_str_mv PROGRAMA DE PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO/CCET
dc.publisher.initials.fl_str_mv UFMA
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv DEPARTAMENTO DE INFORMÁTICA/CCET
publisher.none.fl_str_mv Universidade Federal do Maranhão
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da UFMA
instname:Universidade Federal do Maranhão (UFMA)
instacron:UFMA
instname_str Universidade Federal do Maranhão (UFMA)
instacron_str UFMA
institution UFMA
reponame_str Biblioteca Digital de Teses e Dissertações da UFMA
collection Biblioteca Digital de Teses e Dissertações da UFMA
bitstream.url.fl_str_mv http://tedebc.ufma.br:8080/bitstream/tede/4710/2/MAURICIOMORAISALMEIDA.pdf
http://tedebc.ufma.br:8080/bitstream/tede/4710/1/license.txt
bitstream.checksum.fl_str_mv 3e37f0afed42725d098f3aa8317effdf
97eeade1fce43278e63fe063657f8083
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da UFMA - Universidade Federal do Maranhão (UFMA)
repository.mail.fl_str_mv repositorio@ufma.br||repositorio@ufma.br
_version_ 1800303819676975104