Multiple imputation and maximum likelihood principal component analysis of incomplete multivariate data from a study of the ageing of port

Bibliographic details
Main author: Ho, P.
Publication date: 2001
Other authors: Silva, M. C. M.; Hogg, T. A.
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10400.14/6855
Abstract: A multivariate data matrix containing a number of missing values was obtained from a study of the changes in colour and phenolic composition during the ageing of port. Two approaches were taken in the analysis of the data. The first involved multiple imputation (MI) followed by principal components analysis (PCA). The second used maximum likelihood principal component analysis (MLPCA). Multiple imputation allows missing-value uncertainty to be incorporated into the analysis of the data. Initial estimates of the missing values were first calculated using the Expectation Maximization (EM) algorithm, followed by Data Augmentation (DA) to generate five imputed data matrices. Each complete data matrix was subsequently analysed by PCA, and the principal component (PC) scores and loadings were then averaged to give an estimation of errors. The first three PCs accounted for 93.3% of the explained variance. Changes in colour and monomeric anthocyanin composition were explained on PC1 (79.63% explained variance), phenolic composition and hue mainly on PC2 (8.61% explained variance), and phenolic composition and the formation of polymeric pigment on PC3 (5.04% explained variance). In MLPCA, estimates of measurement uncertainty are incorporated in the decomposition step, with missing values being assigned large measurement uncertainties. PC scores on the first two PCs after multiple imputation and PCA (MI+PCA) were comparable to maximum likelihood scores on the first two PCs extracted by MLPCA.
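The MI+PCA workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the EM and Data Augmentation steps are replaced by a much simpler stand-in (one draw per missing value from a conditional multivariate normal fitted to a mean-filled matrix), the spread across imputations is summarised by a standard deviation, and the function names `draw_imputation`, `pca` and `mi_pca` are hypothetical.

```python
import numpy as np


def draw_imputation(X, rng):
    """Return a copy of X with each NaN replaced by one random draw.

    Stand-in for EM + Data Augmentation (assumption): the mean and covariance
    are estimated once from a mean-filled matrix, and the missing entries of a
    row are drawn from the normal distribution conditional on its observed ones.
    Assumes every column has at least one observed value.
    """
    miss = np.isnan(X)
    filled = np.where(miss, np.nanmean(X, axis=0), X)
    mu = filled.mean(axis=0)
    cov = np.cov(filled, rowvar=False)
    out = filled.copy()
    for i in np.flatnonzero(miss.any(axis=1)):
        m, o = miss[i], ~miss[i]
        if o.any():
            B = cov[np.ix_(m, o)] @ np.linalg.pinv(cov[np.ix_(o, o)])
            c_mu = mu[m] + B @ (X[i, o] - mu[o])
            c_cov = cov[np.ix_(m, m)] - B @ cov[np.ix_(o, m)]
        else:  # nothing observed in this row: fall back to the marginal
            c_mu, c_cov = mu[m], cov[np.ix_(m, m)]
        c_cov = (c_cov + c_cov.T) / 2  # enforce symmetry against round-off
        out[i, m] = rng.multivariate_normal(c_mu, c_cov, check_valid="ignore")
    return out


def pca(X, n_components):
    """PCA of the column-centred matrix via SVD: scores, loadings, variance ratio."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    explained = (s**2 / np.sum(s**2))[:n_components]
    return scores, loadings, explained


def mi_pca(X, n_imputations=5, n_components=2, seed=0):
    """Impute several times, run PCA on each matrix, then average the results."""
    rng = np.random.default_rng(seed)
    all_scores, all_loadings = [], []
    for _ in range(n_imputations):
        scores, loadings, _ = pca(draw_imputation(X, rng), n_components)
        if all_loadings:
            # PC signs are arbitrary, so align each PC with the first imputation.
            signs = np.sign(np.sum(loadings * all_loadings[0], axis=0))
            signs[signs == 0] = 1.0
            scores, loadings = scores * signs, loadings * signs
        all_scores.append(scores)
        all_loadings.append(loadings)
    S, L = np.stack(all_scores), np.stack(all_loadings)
    # Mean across imputations, plus the between-imputation spread.
    return S.mean(axis=0), S.std(axis=0, ddof=1), L.mean(axis=0), L.std(axis=0, ddof=1)
```

Calling `mi_pca(X)` on a samples-by-variables array with `NaN` marking the missing values returns the averaged scores and loadings together with their between-imputation standard deviations. The sign alignment step is a design choice of this sketch: without it, PCs that flip sign between imputations would partly cancel when averaged.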
id RCAP_9850f327d5c975b054829af854581300
oai_identifier_str oai:repositorio.ucp.pt:10400.14/6855
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Multiple imputation and maximum likelihood principal component analysis of incomplete multivariate data from a study of the ageing of port
author Ho, P.
author_facet Ho, P.
Silva, M. C. M.
Hogg, T. A.
author_role author
author2 Silva, M. C. M.
Hogg, T. A.
author2_role author
author
dc.contributor.none.fl_str_mv Veritati - Repositório Institucional da Universidade Católica Portuguesa
dc.contributor.author.fl_str_mv Ho, P.
Silva, M. C. M.
Hogg, T. A.
dc.subject.por.fl_str_mv Missing values
Principal components analysis
Multiple imputation
Maximum likelihood principal components analysis
Port
Ageing
Colour
Phenolic composition
publishDate 2001
dc.date.none.fl_str_mv 2001
2001-01-01T00:00:00Z
2011-10-22T17:05:55Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.14/6855
url http://hdl.handle.net/10400.14/6855
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv HO, P.; SILVA, M. C. M.; HOGG, T. A. - Multiple imputation and maximum likelihood principal component analysis of incomplete multivariate data from a study of the ageing of port. Chemometrics and Intelligent Laboratory Systems. ISSN 0169-7439. Vol. 55, no. 1-2 (2001), p. 1-11
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Elsevier
publisher.none.fl_str_mv Elsevier
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799131724434636800