Missing data in time series: analysis, model and software application
Main author: | Minglino, Francesca |
---|---|
Publication date: | 2018 |
Document type: | Dissertation |
Language: | por |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10071/19502 |
Abstract: | Missing data are a recurring problem in univariate time series, causing bias and inefficient analyses. Most existing statistical methods for handling missingness ignore the characteristics of the time series when imputing the missing values and, above all, do not support imputation in a univariate time series context. Moreover, only a few methods can be applied to all missing data patterns, and no intuitive procedure for dealing with missingness exists in the literature. This research presents an algorithm intended to fill these gaps; its main purpose is to produce reliable imputations of the missing values, i.e. values close to the true ones. The reliability and robustness of the algorithm were tested through simulation campaigns. Its innovative feature is the combination of ARMA models, used to impute the missing values through forecasting and backcasting, with the Expectation-Maximization algorithm, used to bring the parameter estimates to convergence. The approach was evaluated with the RMSE and MAPE metrics, which showed that the algorithm can be used with good reliability in almost all of the tested model settings. However, one of the main limitations of the procedure is that nonconvergence of the algorithm may lead to biased imputations. The algorithm can be applied step by step by an ordinary analyst, in a more intuitive way than most existing approaches. |
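
The idea summarised in the abstract (impute gaps with forecasts and backcasts from a fitted model, re-estimate the parameters, repeat until they stabilise, then score with RMSE and MAPE) can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes a plain AR(1) model fitted by least squares instead of a general ARMA model, it simply averages the one-step forecast and backcast, and every function name (`fit_ar1`, `impute_ar1`, `rmse`, `mape`) is hypothetical.

```python
import numpy as np

def fit_ar1(x):
    """Least-squares estimate of the mean and AR(1) coefficient of a complete series."""
    mu = x.mean()
    d = x - mu
    phi = float(d[1:] @ d[:-1]) / float(d[:-1] @ d[:-1])
    return mu, phi

def impute_ar1(y, tol=1e-8, max_iter=500):
    """Fill NaNs in y by alternating parameter estimation and imputation.

    Assumption: AR(1) stands in for the ARMA models named in the abstract,
    and each gap is set to the average of its forecast and backcast.
    Returns (filled series, converged flag).
    """
    y = np.asarray(y, dtype=float)
    miss = np.isnan(y)
    x = y.copy()
    x[miss] = np.nanmean(y)  # crude starting values for the gaps
    for _ in range(max_iter):
        mu, phi = fit_ar1(x)  # re-estimate parameters on the completed series
        prev = x.copy()
        for t in np.flatnonzero(miss):
            est = []
            if t > 0:
                est.append(mu + phi * (x[t - 1] - mu))  # forecast from the past
            if t < len(x) - 1:
                est.append(mu + phi * (x[t + 1] - mu))  # backcast from the future
            x[t] = np.mean(est)
        if np.max(np.abs(x - prev)) < tol:  # imputations have stabilised
            return x, True
    return x, False  # nonconvergence: treat the imputations with caution

def rmse(true, pred):
    """Root mean squared error."""
    true, pred = np.asarray(true, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((true - pred) ** 2)))

def mape(true, pred):
    """Mean absolute percentage error, in percent (undefined if any true value is 0)."""
    true, pred = np.asarray(true, float), np.asarray(pred, float)
    return float(np.mean(np.abs((true - pred) / true)) * 100.0)
```

In a simulation setting the true values at the deleted positions are known, so `rmse(true[idx], filled[idx])` and `mape(true[idx], filled[idx])` can be compared across model settings. The `converged` flag matters because, as the abstract notes, nonconvergence may lead to biased imputations.
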
id |
RCAP_092c9a1b975bdf8863430858dac912ab |
---|---|
oai_identifier_str |
oai:repositorio.iscte-iul.pt:10071/19502 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Missing data in time series: analysis, model and software application |
author |
Minglino, Francesca |
author_facet |
Minglino, Francesca |
author_role |
author |
dc.contributor.author.fl_str_mv |
Minglino, Francesca |
dc.subject.por.fl_str_mv |
Missing data; Univariate time series; ARMA models; Expectation-maximization algorithm |
topic |
Missing data; Univariate time series; ARMA models; Expectation-maximization algorithm |
publishDate |
2018 |
dc.date.none.fl_str_mv |
2018-11-19T00:00:00Z; 2018-11-19; 2018-09; 2020-01-20T15:49:25Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10071/19502; TID:202169910 |
url |
http://hdl.handle.net/10071/19502 |
identifier_str_mv |
TID:202169910 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf; application/octet-stream |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799134806946086912 |