Tournaments between markers as a strategy to enhance genomic predictions.

Bibliographic details
Main author: FERREIRA FILHO, D.
Publication date: 2019
Other authors: BUENO FILHO, J. S. de S., REGITANO, L. C. de A., ALENCAR, M. M. de, ALVES, R. R., BAENA, M. M., MEIRELLES, S. L. C.
Document type: Article
Language: eng
Source title: Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
Full text: http://www.alice.cnptia.embrapa.br/alice/handle/doc/1110534
Abstract: Analysis of a large number of markers is crucial in both genome-wide association studies (GWAS) and genome-wide selection (GWS). However, two methodological issues restrict statistical analysis: high dimensionality (p>>n) and multicollinearity. Although there are methodologies that can fit models to high-dimensional data (e.g., the Bayesian Lasso), a major problem in these cases is that the model may predict well for the individuals used to fit it but poorly for other individuals, which restricts its applicability. This problem can be circumvented by applying a selection methodology that reduces the number of markers (while keeping the markers associated with the phenotypic trait) before fitting a model to predict genomic breeding values (GBVs). We revisit a tournament-based strategy between marker samples, where each sample has good statistical properties for estimation: n>p and low collinearity. The tournaments use multiple linear regression to eliminate markers. This method is adapted from previous work in the literature. We used simulated data as well as real data derived from a study with SNPs in beef cattle. Tournament strategies not only circumvent the p>>n issue but also minimize spurious associations. For real data, when we selected slightly more than 20 markers, we obtained correlations greater than 0.70 between predicted GBVs and phenotypes in the validation groups of a cross-validation scheme; when we selected a larger number of markers (more than 100), the correlations exceeded 0.90, showing the efficiency of the method in identifying relevant SNPs (or segregations) for both GWAS and GWS. The simulation study gave similar results.
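The abstract gives only the outline of the tournament idea, so the following is a minimal sketch of one way such a screening could look, not the authors' published procedure. It assumes markers are shuffled into small groups that satisfy n>p, that each group is fitted by ordinary multiple linear regression, and that the markers with the largest absolute t-statistics advance to the next round. The function name `tournament_select` and the parameters `group_size`, `keep_per_group`, and `target` are hypothetical, and the sketch does not enforce the low-collinearity condition on each group.

```python
import numpy as np


def tournament_select(X, y, group_size=20, keep_per_group=5, target=25, seed=0):
    """Illustrative tournament-style marker screening (an assumption-laden
    sketch, not the procedure from the paper).

    X: (n, p) genotype matrix, y: (n,) phenotype vector.
    Returns sorted column indices of the surviving markers.
    """
    n, p = X.shape
    if not keep_per_group < group_size < n - 1:
        raise ValueError("need keep_per_group < group_size < n - 1")
    rng = np.random.default_rng(seed)
    survivors = np.arange(p)
    while survivors.size > target:
        rng.shuffle(survivors)  # random draw of groups for this round
        winners = []
        for start in range(0, survivors.size, group_size):
            group = survivors[start:start + group_size]
            # Multiple linear regression (with intercept) on this small group.
            design = np.column_stack([np.ones(n), X[:, group]])
            beta, *_ = np.linalg.lstsq(design, y, rcond=None)
            resid = y - design @ beta
            sigma2 = resid @ resid / (n - design.shape[1])
            cov = sigma2 * np.linalg.pinv(design.T @ design)
            # Guard against tiny or negative variances from near-collinearity.
            se = np.sqrt(np.maximum(np.diag(cov)[1:], 1e-12))
            t_abs = np.abs(beta[1:]) / se
            k = min(keep_per_group, group.size)
            winners.extend(group[np.argsort(t_abs)[-k:]])  # top-k advance
        survivors = np.array(winners)
    return np.sort(survivors)
```

The surviving markers would then be used to fit the final prediction model, whose accuracy should be judged on validation individuals, as in the cross-validation scheme the abstract mentions.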
id EMBR_565d7ac4a5552795bf1943ff0a53b426
oai_identifier_str oai:www.alice.cnptia.embrapa.br:doc/1110534
network_acronym_str EMBR
network_name_str Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
repository_id_str 2154
dc.contributor.none.fl_str_mv Diógenes Ferreira Filho, UFRRJ; Júlio Sílvio de Sousa Bueno Filho, UFLA; Luciana Correia de Almeida Regitano, CPPSE; Mauricio Mello de Alencar, CPPSE; Rosiana Rodrigues Alves, CNPASA; Marielle Moura Baena, UFLA; Sarah Laguna Conceição Meirelles, UFLA.
dc.subject.por.fl_str_mv Genome-wide
GWAS
GWS
SNPs
Genomic Breeding Values
Genótipo
Genoma
Genotyping
publishDate 2019
dc.date.none.fl_str_mv 2019-07-10
2019
2021-01-29T00:53:03Z
2021-01-29T00:53:03Z
dc.type.driver.fl_str_mv info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv Plos One, v. 14, n. 7, e0219448, p. 1-17, 2019.
http://www.alice.cnptia.embrapa.br/alice/handle/doc/1110534
10.1371/journal.pone.0219448
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
instname:Empresa Brasileira de Pesquisa Agropecuária (Embrapa)
instacron:EMBRAPA
repository.name.fl_str_mv Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) - Empresa Brasileira de Pesquisa Agropecuária (Embrapa)
repository.mail.fl_str_mv cg-riaa@embrapa.br
_version_ 1794503501824393216