A comparison of regression methods based on dimensional reduction for genomic prediction.
Main author: | COSTA, J. A. da |
---|---|
Publication date: | 2021 |
Other authors: | AZEVEDO, C. F.; NASCIMENTO, M.; SILVA, F. F. e; RESENDE, M. D. V. de; NASCIMENTO, A. C. C. |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
Full text: | http://www.alice.cnptia.embrapa.br/alice/handle/doc/1139234 https://doi.org/10.4238/gmr18877 |
Abstract: | multicollinearity and high dimensionality problems, making it impossible to obtain stable estimates through the traditional method of estimation based on ordinary least squares. To overcome such challenges, dimensionality reduction methods have been proposed because of their simple theory and easy application. We compared three dimensionality reduction methods: Principal Components Regression (PCR), Partial Least Squares (PLS), and Independent Components Regression (ICR). An important step for dimensionality reduction and prediction is selecting the number of components, as it affects the linear combinations of the explanatory variables. The linear combinations are inserted into the model to predict the response based on a reduced number of parameters. We examined the criteria for the selection of the number of components. The dimensionality reduction methods were applied to genomic and phenotype data. We evaluated 370 accessions of Asian rice, Oryza sativa, which were genotyped for 36,901 SNP markers used to predict the genomic values for the number of panicles per plant trait. This data set presented multicollinearity and high dimensionality. The computational time for each method was also recorded. Among the methods, PCR and ICR gave the highest accuracy values, with ICR standing out for providing the least biased estimates of genomic values. However, ICR required more computational time than the other methods. |
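
The abstract's central step, choosing how many components to keep before regressing the response on them, can be illustrated with a minimal sketch of principal components regression. This is not the authors' code: the function names `pcr_fit_predict` and `select_n_components`, the 5-fold cross-validated mean squared error criterion, the candidate component counts, and the simulated 0/1/2 marker matrix are all assumptions made for illustration, and the data are random rather than the rice data set described above.

```python
import numpy as np


def pcr_fit_predict(X_train, y_train, X_test, n_components):
    """Principal components regression: regress y on the leading PC scores of X."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    # Thin SVD of the centered marker matrix; rows of Vt are the loadings.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:n_components].T              # (markers x k) loading matrix
    scores = Xc @ V_k                      # (n x k) component scores
    # Score columns are orthogonal with squared norms s**2, so the
    # least-squares coefficients can be computed component by component.
    gamma = (scores.T @ (y_train - y_mean)) / (s[:n_components] ** 2)
    return y_mean + (X_test - x_mean) @ V_k @ gamma


def select_n_components(X, y, candidates, n_folds=5, seed=0):
    """Pick the candidate with the lowest k-fold cross-validated MSE."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    mse = {}
    for k in candidates:
        errors = []
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
            pred = pcr_fit_predict(X[train_idx], y[train_idx], X[test_idx], k)
            errors.append(np.mean((y[test_idx] - pred) ** 2))
        mse[k] = float(np.mean(errors))
    best_k = min(mse, key=mse.get)
    return best_k, mse


# Simulated example (hypothetical data): 60 individuals, 300 markers coded 0/1/2,
# so there are more predictors than observations, as in genomic prediction.
rng = np.random.default_rng(42)
X = rng.integers(0, 3, size=(60, 300)).astype(float)
beta = np.zeros(300)
beta[:10] = rng.normal(size=10)
y = X @ beta + rng.normal(scale=1.0, size=60)

candidates = [1, 5, 10, 20, 30]
best_k, cv_mse = select_n_components(X, y, candidates)
print("selected number of components:", best_k)
```

Cross-validated error is only one possible selection criterion; the record does not state which criteria the study examined. PLS and ICR differ from this sketch in how the linear combinations are constructed, not in the final regression on a reduced set of components.
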
id | EMBR_bc922f9d0ea63104134572be3f4945dd |
---|---|
oai_identifier_str | oai:www.alice.cnptia.embrapa.br:doc/1139234 |
network_acronym_str | EMBR |
network_name_str | Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
repository_id_str | 2154 |
dc.title.none.fl_str_mv | A comparison of regression methods based on dimensional reduction for genomic prediction. |
author | COSTA, J. A. da |
author_facet | COSTA, J. A. da; AZEVEDO, C. F.; NASCIMENTO, M.; SILVA, F. F. e; RESENDE, M. D. V. de; NASCIMENTO, A. C. C. |
author_role | author |
author2 | AZEVEDO, C. F.; NASCIMENTO, M.; SILVA, F. F. e; RESENDE, M. D. V. de; NASCIMENTO, A. C. C. |
author2_role | author author author author author |
dc.contributor.none.fl_str_mv | JAQUICELE APARECIDA DA COSTA, UFV; CAMILA FERREIRA AZEVEDO, UFV; MOYSÉS NASCIMENTO, UFV; FABYANO FONSECA E SILVA, UFV; MARCOS DEON VILELA DE RESENDE, CNPCa; ANA CAROLINA CAMPANA NASCIMENTO, UFV. |
dc.contributor.author.fl_str_mv | COSTA, J. A. da; AZEVEDO, C. F.; NASCIMENTO, M.; SILVA, F. F. e; RESENDE, M. D. V. de; NASCIMENTO, A. C. C. |
dc.subject.por.fl_str_mv | Regression analysis; Genomics |
topic | Regression analysis; Genomics |
publishDate | 2021 |
dc.date.none.fl_str_mv | 2021 2022-01-21T14:30:04Z 2022-01-21T14:30:04Z 2022-01-21 |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/article |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | Genetics and Molecular Research, v. 20, n. 2, p. 1-15, 2021. http://www.alice.cnptia.embrapa.br/alice/handle/doc/1139234 https://doi.org/10.4238/gmr18877 |
identifier_str_mv | Genetics and Molecular Research, v. 20, n. 2, p. 1-15, 2021. |
url | http://www.alice.cnptia.embrapa.br/alice/handle/doc/1139234 https://doi.org/10.4238/gmr18877 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.source.none.fl_str_mv | reponame:Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) instname:Empresa Brasileira de Pesquisa Agropecuária (Embrapa) instacron:EMBRAPA |
instname_str | Empresa Brasileira de Pesquisa Agropecuária (Embrapa) |
instacron_str | EMBRAPA |
institution | EMBRAPA |
reponame_str | Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
collection | Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
repository.name.fl_str_mv | Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) - Empresa Brasileira de Pesquisa Agropecuária (Embrapa) |
repository.mail.fl_str_mv | cg-riaa@embrapa.br |
_version_ | 1794503516828467200 |