Time series forecasting with deep forest regression

Bibliographic details
Main author: ANDRADE, Renata Correia de
Publication date: 2020
Document type: Dissertation
Language: eng
Source title: Repositório Institucional da UFPE
Full text: https://repositorio.ufpe.br/handle/123456789/43505
Abstract: A time series is a collection of ordered observations, usually measured at regular intervals. Time series forecasting is the area of research that studies methods for predicting future values of a series. Forecasting methods range from statistical procedures, such as ARIMA, to more recent machine learning approaches. Deep neural networks (DNNs) have performed well on a wide range of tasks, including time series forecasting, for which they are considered state of the art. Most deep models today are neural networks, but despite their popularity and proven competitive performance compared with other machine learning algorithms, DNNs still face some limitations. Most notably, they usually require a large number of training examples, which may be unavailable for shorter time series, and they have a large number of hyper-parameters that need to be tuned to each dataset. Multi-grained cascade forest (gcForest) is a deep machine learning algorithm that was proposed for classification and that addresses these limitations of DNNs while replicating the features responsible for the success of this type of model. The goal of this dissertation is to adapt the original gcForest algorithm to regression problems so that it can be applied to time series forecasting. The influence of the two stages of gcForest, multi-grained scanning and cascade forest, is also investigated, as is the possibility of adding a further model at the end of the cascade forest structure, thereby changing how the final result is calculated. The changes to the algorithm are presented, and its performance is evaluated on four time series datasets according to three metrics: mean squared error, mean absolute error and mean absolute percentage error. Results show that gcForest achieves competitive performance on all four datasets when compared with traditional machine learning models.
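The three evaluation metrics named in the abstract have standard textbook definitions. The following is a minimal Python sketch of those definitions, not code from the dissertation; the function names and the sample values are illustrative assumptions.

```python
# Illustrative sketch only: standard definitions of the three metrics named in
# the abstract. Function names and sample data are assumptions, not taken from
# the dissertation.

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average absolute error relative to the actual value, as a percentage.

    Undefined when any actual value is zero (division by zero).
    """
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observations and forecasts, used only to exercise the functions.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

mse = mean_squared_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)
mape = mean_absolute_percentage_error(actual, predicted)
print(mse, mae, mape)
```

Mean squared error penalises large deviations more heavily, mean absolute error is expressed in the units of the series, and mean absolute percentage error is scale-free but breaks down when the series contains zeros.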
id UFPE_b802060797d5600e522e1e6b7a4679fa
oai_identifier_str oai:repositorio.ufpe.br:123456789/43505
network_acronym_str UFPE
network_name_str Repositório Institucional da UFPE
repository_id_str 2221
dc.title.pt_BR.fl_str_mv Time series forecasting with deep forest regression
title Time series forecasting with deep forest regression
spellingShingle Time series forecasting with deep forest regression
ANDRADE, Renata Correia de
Inteligência computacional
Regressão
Séries temporais
title_short Time series forecasting with deep forest regression
title_full Time series forecasting with deep forest regression
title_fullStr Time series forecasting with deep forest regression
title_full_unstemmed Time series forecasting with deep forest regression
title_sort Time series forecasting with deep forest regression
author ANDRADE, Renata Correia de
author_facet ANDRADE, Renata Correia de
author_role author
dc.contributor.authorLattes.pt_BR.fl_str_mv http://lattes.cnpq.br/7170312279134322
dc.contributor.advisorLattes.pt_BR.fl_str_mv http://lattes.cnpq.br/4610098557429398
dc.contributor.advisor-coLattes.pt_BR.fl_str_mv http://lattes.cnpq.br/8577312109146354
dc.contributor.author.fl_str_mv ANDRADE, Renata Correia de
dc.contributor.advisor1.fl_str_mv MATTOS NETO, Paulo Salgado Gomes de
dc.contributor.advisor-co1.fl_str_mv CAVALCANTI, George Darmiton da Cunha
contributor_str_mv MATTOS NETO, Paulo Salgado Gomes de
CAVALCANTI, George Darmiton da Cunha
dc.subject.por.fl_str_mv Inteligência computacional
Regressão
Séries temporais
topic Inteligência computacional
Regressão
Séries temporais
description A time series is a collection of ordered observations, usually measured at regular intervals. Time series forecasting is the area of research that studies methods for predicting future values of a series. Forecasting methods range from statistical procedures, such as ARIMA, to more recent machine learning approaches. Deep neural networks (DNNs) have performed well on a wide range of tasks, including time series forecasting, for which they are considered state of the art. Most deep models today are neural networks, but despite their popularity and proven competitive performance compared with other machine learning algorithms, DNNs still face some limitations. Most notably, they usually require a large number of training examples, which may be unavailable for shorter time series, and they have a large number of hyper-parameters that need to be tuned to each dataset. Multi-grained cascade forest (gcForest) is a deep machine learning algorithm that was proposed for classification and that addresses these limitations of DNNs while replicating the features responsible for the success of this type of model. The goal of this dissertation is to adapt the original gcForest algorithm to regression problems so that it can be applied to time series forecasting. The influence of the two stages of gcForest, multi-grained scanning and cascade forest, is also investigated, as is the possibility of adding a further model at the end of the cascade forest structure, thereby changing how the final result is calculated. The changes to the algorithm are presented, and its performance is evaluated on four time series datasets according to three metrics: mean squared error, mean absolute error and mean absolute percentage error. Results show that gcForest achieves competitive performance on all four datasets when compared with traditional machine learning models.
publishDate 2020
dc.date.issued.fl_str_mv 2020-11-30
dc.date.accessioned.fl_str_mv 2022-03-24T17:32:03Z
dc.date.available.fl_str_mv 2022-03-24T17:32:03Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv ANDRADE, Renata Correia de. Time series forecasting with deep forest regression. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2021.
dc.identifier.uri.fl_str_mv https://repositorio.ufpe.br/handle/123456789/43505
identifier_str_mv ANDRADE, Renata Correia de. Time series forecasting with deep forest regression. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2021.
url https://repositorio.ufpe.br/handle/123456789/43505
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de Pernambuco
dc.publisher.program.fl_str_mv Programa de Pos Graduacao em Ciencia da Computacao
dc.publisher.initials.fl_str_mv UFPE
dc.publisher.country.fl_str_mv Brasil
publisher.none.fl_str_mv Universidade Federal de Pernambuco
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFPE
instname:Universidade Federal de Pernambuco (UFPE)
instacron:UFPE
instname_str Universidade Federal de Pernambuco (UFPE)
instacron_str UFPE
institution UFPE
reponame_str Repositório Institucional da UFPE
collection Repositório Institucional da UFPE
bitstream.url.fl_str_mv https://repositorio.ufpe.br/bitstream/123456789/43505/1/DISSERTA%c3%87%c3%83O%20Renata%20Correia%20de%20Andrade.pdf
https://repositorio.ufpe.br/bitstream/123456789/43505/2/license_rdf
https://repositorio.ufpe.br/bitstream/123456789/43505/3/license.txt
https://repositorio.ufpe.br/bitstream/123456789/43505/4/DISSERTA%c3%87%c3%83O%20Renata%20Correia%20de%20Andrade.pdf.txt
https://repositorio.ufpe.br/bitstream/123456789/43505/5/DISSERTA%c3%87%c3%83O%20Renata%20Correia%20de%20Andrade.pdf.jpg
bitstream.checksum.fl_str_mv 1d9399e2124a927e090cfa36a6553da9
e39d27027a6cc9cb039ad269a5db8e34
6928b9260b07fb2755249a5ca9903395
086e232c99088ee67a624130c70c0f0a
f0c3a8f803e35408961bb27abebfcdcd
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv attena@ufpe.br
_version_ 1793515679577538560