Missing data in time series: analysis, model and software application

Bibliographic details
Main author: Minglino, Francesca
Publication date: 2018
Document type: Dissertation
Language: por
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10071/19502
Abstract: Missing data in univariate time series are a recurring problem that causes bias and leads to inefficient analyses. Most existing statistical methods that address missingness do not consider the characteristics of the time series when imputing the missing values and, above all, do not support imputation in a univariate time series context. Moreover, only a few methods can be applied to all missing data patterns. Finally, no intuitive procedure for handling missingness exists in the literature. This research presents an algorithm intended to fill these gaps; its main purpose is to provide reliable imputations of the missing values, i.e. values not far from the true ones. To this end, the reliability and robustness of the algorithm were tested through simulation campaigns. Its innovative feature is the combination of ARMA models, used to impute the missing values through a forecast and backcast approach, with the Expectation-Maximization algorithm, used to achieve convergence of the parameters. The approach was evaluated with the RMSE and MAPE metrics, which showed that the algorithm can be used with good reliability in almost every model setting tested. However, one of the main limitations of the procedure is that nonconvergence of the algorithm could lead to biased imputations. The algorithm can be applied step by step by an ordinary analyst, in a more intuitive way than most existing approaches.
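The general idea in the abstract (impute, re-estimate, repeat until the parameters settle, then score with RMSE and MAPE) can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes a plain AR(1) model in place of a general ARMA model, averages a one-step forecast and a one-step backcast for each gap, and stops when the lag-1 coefficient changes by less than a tolerance. The function names `fit_ar1`, `impute_ar1`, `rmse` and `mape` are hypothetical and chosen only for illustration.

```python
import math


def fit_ar1(x):
    """Moment estimates of the mean and lag-1 coefficient of an AR(1) model."""
    n = len(x)
    mu = sum(x) / n
    d = [v - mu for v in x]
    den = sum(v * v for v in d)
    num = sum(d[t] * d[t - 1] for t in range(1, n))
    phi = num / den if den > 0 else 0.0
    return mu, phi


def impute_ar1(series, tol=1e-8, max_iter=200):
    """Fill None entries by alternating imputation and re-estimation.

    Assumption: each gap is filled with the average of a one-step forecast
    (from the left neighbour) and a one-step backcast (from the right one).
    Returns the filled series, the last coefficient, and a convergence flag.
    """
    missing = [i for i, v in enumerate(series) if v is None]
    observed = [v for v in series if v is not None]
    start = sum(observed) / len(observed)  # initial guess: observed mean
    x = [start if v is None else float(v) for v in series]
    phi_old, phi, converged = None, 0.0, False
    for _ in range(max_iter):
        mu, phi = fit_ar1(x)  # re-estimate parameters on the completed series
        for i in missing:     # re-impute the gaps with the new parameters
            estimates = []
            if i > 0:
                estimates.append(mu + phi * (x[i - 1] - mu))  # forecast
            if i < len(x) - 1:
                estimates.append(mu + phi * (x[i + 1] - mu))  # backcast
            if estimates:
                x[i] = sum(estimates) / len(estimates)
        if phi_old is not None and abs(phi - phi_old) < tol:
            converged = True
            break
        phi_old = phi
    return x, phi, converged


def rmse(true, pred):
    """Root mean squared error between true and imputed values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(true, pred)) / len(true))


def mape(true, pred):
    """Mean absolute percentage error, in percent (true values must be nonzero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(true, pred)) / len(true)
```

In a simulation setting, one would delete known values from a generated series, call `impute_ar1`, and compare the imputed values with the deleted ones using `rmse` and `mape`. The returned convergence flag matters because, as the abstract notes, nonconvergence may lead to biased imputations.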
id RCAP_092c9a1b975bdf8863430858dac912ab
oai_identifier_str oai:repositorio.iscte-iul.pt:10071/19502
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Missing data in time series: analysis, model and software application
author Minglino, Francesca
author_facet Minglino, Francesca
author_role author
dc.contributor.author.fl_str_mv Minglino, Francesca
dc.subject.por.fl_str_mv Missing data
Univariate time series
ARMA models
Expectation-maximization algorithm
topic Missing data
Univariate time series
ARMA models
Expectation-maximization algorithm
publishDate 2018
dc.date.none.fl_str_mv 2018-11-19T00:00:00Z
2018-11-19
2018-09
2020-01-20T15:49:25Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10071/19502
TID:202169910
url http://hdl.handle.net/10071/19502
identifier_str_mv TID:202169910
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
application/octet-stream
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação