Imputação de dados faltosos em séries Temporais Univariadas utilizando meta-aprendizado baseado em Rede Neural LSTM Híbrida
Main author: | ALMEIDA, Mauricio Morais |
---|---|
Publication date: | 2023 |
Document type: | Dissertation |
Language: | por |
Source title: | Biblioteca Digital de Teses e Dissertações da UFMA |
Full text: | https://tedebc.ufma.br/jspui/handle/tede/tede/4710 |
Abstract: | Time series are data collected at regular intervals over time, describing the average of an event over time. For this reason, among others, time series have been gaining importance in various areas, such as business, natural, and medical applications. One of the main challenges involving time series is data loss, and several approaches exist for recovering the lost data by imputing missing values in univariate time series. To contribute to the field of time series imputation, this study proposes a new method of imputing missing values based on meta-learning. First, ten classical techniques were selected to impute time series data, and a metadata set was built from the resulting errors, with each series labeled into one of ten classes according to the technique that produced the lowest error. In addition to these ten techniques, a new imputation technique using the Pix2Pix GAN was proposed, which imputes from images of time series. Furthermore, a new network architecture called HybridLSTM was proposed to recommend, from the labeled metadata, the best imputation technique for a given series. The HybridLSTM network was shown to suggest the best imputation techniques based on the characteristics of the series, outperforming classical techniques such as linear interpolation and Akima interpolation in several instances. The proposed imputation technique was evaluated on nine different datasets and achieved an average ASMAPE of 9.51%, with a maximum of 22.75% and a minimum of 3.73%. The study also showed that imputing data through windowing, applying various techniques to small slices of a time series, is a promising approach, opening space for further research such as imputing missing data in time series through images and GANs. |
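
The labeling step described in the abstract (impute with several techniques, then label each series with the technique that gives the lowest error) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three stand-in techniques (`impute_linear`, `impute_locf`, `impute_mean`), the `label_series` helper, and the plain sMAPE error are hypothetical and are not the ten techniques, the ASMAPE metric, or the code used in the dissertation.

```python
# Hypothetical sketch of meta-label construction; not the dissertation's code.
# Missing values are represented as None; each series is assumed to have
# at least one observed value.

def impute_linear(series):
    """Fill gaps by linear interpolation between the nearest observed points."""
    out = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:          # leading gap: copy the first observed value
            out[i] = series[right]
        elif right is None:       # trailing gap: copy the last observed value
            out[i] = series[left]
        else:
            w = (i - left) / (right - left)
            out[i] = series[left] + w * (series[right] - series[left])
    return out

def impute_locf(series):
    """Last observation carried forward (a leading gap takes the first observed value)."""
    out = list(series)
    last = next(v for v in series if v is not None)
    for i, v in enumerate(out):
        if v is None:
            out[i] = last
        else:
            last = v
    return out

def impute_mean(series):
    """Replace every gap with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]

def smape(actual, predicted):
    """Symmetric MAPE in percent (a stand-in error, not the thesis's ASMAPE)."""
    terms = []
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        terms.append(0.0 if denom == 0 else 2 * abs(a - p) / denom)
    return 100 * sum(terms) / len(terms)

TECHNIQUES = {"linear": impute_linear, "locf": impute_locf, "mean": impute_mean}

def label_series(complete, missing_idx, techniques):
    """Mask the given positions, impute with each technique, return (best label, errors)."""
    idx = sorted(missing_idx)
    masked = [None if i in missing_idx else v for i, v in enumerate(complete)]
    truth = [complete[i] for i in idx]
    errors = {}
    for name, fn in techniques.items():
        filled = fn(masked)
        errors[name] = smape(truth, [filled[i] for i in idx])
    return min(errors, key=errors.get), errors
```

Calling `label_series(series, {2, 3}, TECHNIQUES)` on a complete series returns the name of the lowest-error technique together with the per-technique errors; a collection of such labels is the kind of metadata set on which a recommender could then be trained.
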
id |
UFMA_517fb9dc3d7ed268cd4e2aac7f78adf7 |
---|---|
oai_identifier_str |
oai:tede2:tede/4710 |
network_acronym_str |
UFMA |
network_name_str |
Biblioteca Digital de Teses e Dissertações da UFMA |
repository_id_str |
2131 |
spelling |
Submitted by Jonathan Sousa de Almeida (jonathan.sousa@ufma.br) on 2023-05-23T12:19:13Z. No. of bitstreams: 1. MAURICIOMORAISALMEIDA.pdf: 2620688 bytes, checksum: 3e37f0afed42725d098f3aa8317effdf (MD5). Made available in DSpace on 2023-05-23T12:19:13Z (GMT). Previous issue date: 2023-05-05. Listed agency: CAPES. |
dc.title.por.fl_str_mv |
Imputação de dados faltosos em séries Temporais Univariadas utilizando meta-aprendizado baseado em Rede Neural LSTM Híbrida |
dc.title.alternative.eng.fl_str_mv |
Imputation of Missing Data in Univariate Time Series using Meta-Learning based on Hybrid LSTM Neural Network |
title |
Imputação de dados faltosos em séries Temporais Univariadas utilizando meta-aprendizado baseado em Rede Neural LSTM Híbrida |
author |
ALMEIDA, Mauricio Morais |
author_facet |
ALMEIDA, Mauricio Morais |
author_role |
author |
dc.contributor.advisor1.fl_str_mv |
ALMEIDA, João Dallyson Sousa de |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/6047330108382641 |
dc.contributor.advisor-co1.fl_str_mv |
QUINTANILHA, Darlan Bruno Pontes |
dc.contributor.advisor-co1Lattes.fl_str_mv |
http://lattes.cnpq.br/4222253532775153 |
dc.contributor.referee1.fl_str_mv |
ALMEIDA, João Dallyson Sousa de |
dc.contributor.referee1Lattes.fl_str_mv |
http://lattes.cnpq.br/6047330108382641 |
dc.contributor.referee2.fl_str_mv |
QUINTANILHA, Darlan Bruno Pontes |
dc.contributor.referee2Lattes.fl_str_mv |
http://lattes.cnpq.br/4222253532775153 |
dc.contributor.referee3.fl_str_mv |
DINIZ, João Otávio Bandeira |
dc.contributor.referee3Lattes.fl_str_mv |
http://lattes.cnpq.br/6165165599787140 |
dc.contributor.referee4.fl_str_mv |
SERRA, Ginalber Luiz de Oliveira |
dc.contributor.referee4Lattes.fl_str_mv |
http://lattes.cnpq.br/0831092299374520 |
dc.contributor.authorLattes.fl_str_mv |
http://lattes.cnpq.br/5687872061960566 |
dc.contributor.author.fl_str_mv |
ALMEIDA, Mauricio Morais |
contributor_str_mv |
ALMEIDA, João Dallyson Sousa de QUINTANILHA, Darlan Bruno Pontes ALMEIDA, João Dallyson Sousa de QUINTANILHA, Darlan Bruno Pontes DINIZ, João Otávio Bandeira SERRA, Ginalber Luiz de Oliveira |
dc.subject.por.fl_str_mv |
Séries Temporais; imputação de dados; meta-aprendizado; Pix2Pix; HybridLSTM. |
topic |
Séries Temporais; imputação de dados; meta-aprendizado; Pix2Pix; HybridLSTM. Time series; Convolutional Neural Networks; Time Series Image; Meta-Learning; Imputation. Ciência da Computação |
dc.subject.eng.fl_str_mv |
Time series; Convolutional Neural Networks; Time Series Image; Meta-Learning; Imputation. |
dc.subject.cnpq.fl_str_mv |
Ciência da Computação |
publishDate |
2023 |
dc.date.accessioned.fl_str_mv |
2023-05-23T12:19:13Z |
dc.date.issued.fl_str_mv |
2023-05-05 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
ALMEIDA, Mauricio Morais. Imputação de dados faltosos em séries Temporais Univariadas utilizando meta-aprendizado baseado em Rede Neural LSTM Híbrida. 2023. 96 f. Dissertação (Programa de Pós-Graduação em Ciência da Computação/CCET) - Universidade Federal do Maranhão, São Luís, 2023. |
dc.identifier.uri.fl_str_mv |
https://tedebc.ufma.br/jspui/handle/tede/tede/4710 |
identifier_str_mv |
ALMEIDA, Mauricio Morais. Imputação de dados faltosos em séries Temporais Univariadas utilizando meta-aprendizado baseado em Rede Neural LSTM Híbrida. 2023. 96 f. Dissertação (Programa de Pós-Graduação em Ciência da Computação/CCET) - Universidade Federal do Maranhão, São Luís, 2023. |
url |
https://tedebc.ufma.br/jspui/handle/tede/tede/4710 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Universidade Federal do Maranhão |
dc.publisher.program.fl_str_mv |
PROGRAMA DE PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO/CCET |
dc.publisher.initials.fl_str_mv |
UFMA |
dc.publisher.country.fl_str_mv |
Brasil |
dc.publisher.department.fl_str_mv |
DEPARTAMENTO DE INFORMÁTICA/CCET |
publisher.none.fl_str_mv |
Universidade Federal do Maranhão |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da UFMA instname:Universidade Federal do Maranhão (UFMA) instacron:UFMA |
instname_str |
Universidade Federal do Maranhão (UFMA) |
instacron_str |
UFMA |
institution |
UFMA |
reponame_str |
Biblioteca Digital de Teses e Dissertações da UFMA |
collection |
Biblioteca Digital de Teses e Dissertações da UFMA |
bitstream.url.fl_str_mv |
http://tedebc.ufma.br:8080/bitstream/tede/4710/2/MAURICIOMORAISALMEIDA.pdf http://tedebc.ufma.br:8080/bitstream/tede/4710/1/license.txt |
bitstream.checksum.fl_str_mv |
3e37f0afed42725d098f3aa8317effdf 97eeade1fce43278e63fe063657f8083 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da UFMA - Universidade Federal do Maranhão (UFMA) |
repository.mail.fl_str_mv |
repositorio@ufma.br |
_version_ |
1800303819676975104 |