Using visual scores for genomic prediction of complex traits in breeding programs.

AZEVEDO, C. F.; FERRÃO, L. F. V.; BENEVENUTO, J.; RESENDE, M. D. V. de; NASCIMENTO, M.; NASCIMENTO, A. C. C.; MUNOZ, P. R.

Bibliographic details
Main author:	AZEVEDO, C. F.
Publication date:	2024
Other authors:	FERRÃO, L. F. V.; BENEVENUTO, J.; RESENDE, M. D. V. de; NASCIMENTO, M.; NASCIMENTO, A. C. C.; MUNOZ, P. R.
Document type:	Article
Language:	English
Source title:	Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
Full text:	http://www.alice.cnptia.embrapa.br/alice/handle/doc/1160409 https://doi.org/10.1007/s00122-023-04512-w
Abstract:	An approach for handling visual scores with potential errors and subjectivity in scores was evaluated in simulated and blueberry recurrent selection breeding schemes to assist breeders in their decision-making. Most genomic prediction methods are based on assumptions of normality due to their simplicity and ease of implementation. However, in plant and animal breeding, continuous traits are often visually scored as categorical traits and analyzed as a Gaussian variable, thus violating the normality assumption, which could affect the prediction of breeding values and the estimation of genetic parameters. In this study, we examined the main challenges of visual scores for genomic prediction and genetic parameter estimation using mixed models, Bayesian, and machine learning methods. We evaluated these approaches using simulated and real breeding data sets. Our contribution in this study is a five-fold demonstration: (i) collecting data using an intermediate number of categories (1-3 and 1-5) is the best strategy, even considering errors associated with visual scores; (ii) Linear Mixed Models and Bayesian Linear Regression are robust to the normality violation, but marginal gains can be achieved when using Bayesian Ordinal Regression Models (BORM) and Random Forest Classification; (iii) genetic parameters are better estimated using BORM; (iv) our conclusions using simulated data are also applicable to real data in autotetraploid blueberry; and (v) a comparison of continuous and categorical phenotypes found that investing in the evaluation of 600-1000 categorical data points with low error, when it is not feasible to collect continuous phenotypes, is a strategy for improving predictive abilities. Our findings suggest the best approaches for effectively using visual score traits to explore genetic information in breeding programs and highlight the importance of investing in the training of evaluator teams and in high-quality phenotyping.

Item metadata

id	EMBR_6a2f94db588495be14483e2b4e53fb58
oai_identifier_str	oai:www.alice.cnptia.embrapa.br:doc/1160409
network_acronym_str	EMBR
network_name_str	Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
repository_id_str	2154
dc.title.none.fl_str_mv	Using visual scores for genomic prediction of complex traits in breeding programs.
author	AZEVEDO, C. F.
author_facet	AZEVEDO, C. F.; FERRÃO, L. F. V.; BENEVENUTO, J.; RESENDE, M. D. V. de; NASCIMENTO, M.; NASCIMENTO, A. C. C.; MUNOZ, P. R.
author_role	author
author2	FERRÃO, L. F. V.; BENEVENUTO, J.; RESENDE, M. D. V. de; NASCIMENTO, M.; NASCIMENTO, A. C. C.; MUNOZ, P. R.
author2_role	author author author author author author
dc.contributor.none.fl_str_mv	CAMILA FERREIRA AZEVEDO, UNIVERSIDADE FEDERAL DE VIÇOSA; LUIS FELIPE VENTORIM FERRÃO, UNIVERSITY OF FLORIDA; JULIANA BENEVENUTO, UNIVERSITY OF FLORIDA; MARCOS DEON VILELA DE RESENDE, CNPCa; MOYSES NASCIMENTO, UNIVERSIDADE FEDERAL DE VIÇOSA; ANA CAROLINA CAMPANA NASCIMENTO, UNIVERSIDADE FEDERAL DE VIÇOSA; PATRICIO R. MUNOZ, UNIVERSITY OF FLORIDA.
dc.contributor.author.fl_str_mv	AZEVEDO, C. F.; FERRÃO, L. F. V.; BENEVENUTO, J.; RESENDE, M. D. V. de; NASCIMENTO, M.; NASCIMENTO, A. C. C.; MUNOZ, P. R.
dc.subject.por.fl_str_mv	Plant breeding; Animal breeding; Bayesian theory; Genome; Inheritance (genetics); Phenotype
topic	Plant breeding; Animal breeding; Bayesian theory; Genome; Inheritance (genetics); Phenotype
publishDate	2024
dc.date.none.fl_str_mv	2024-01-03T13:32:22Z 2024-01-03T13:32:22Z 2024-01-03 2024
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	Theoretical and Applied Genetics, v. 137, n. 1, 2024. http://www.alice.cnptia.embrapa.br/alice/handle/doc/1160409 https://doi.org/10.1007/s00122-023-04512-w
identifier_str_mv	Theoretical and Applied Genetics, v. 137, n. 1, 2024.
url	http://www.alice.cnptia.embrapa.br/alice/handle/doc/1160409 https://doi.org/10.1007/s00122-023-04512-w
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	16 p.
dc.source.none.fl_str_mv	reponame:Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) instname:Empresa Brasileira de Pesquisa Agropecuária (Embrapa) instacron:EMBRAPA
instname_str	Empresa Brasileira de Pesquisa Agropecuária (Embrapa)
instacron_str	EMBRAPA
institution	EMBRAPA
reponame_str	Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
collection	Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
repository.name.fl_str_mv	Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) - Empresa Brasileira de Pesquisa Agropecuária (Embrapa)
repository.mail.fl_str_mv	cg-riaa@embrapa.br
_version_	1817695690955948032
