Genome-wide association and optimization of prediction accuracies on genomic selection models for Eucalyptus grandis W. Hill

Bibliographic details
Main author: Rocha, Lucas Fernandes [UNESP]
Publication date: 2022
Document type: Thesis
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://hdl.handle.net/11449/234813
Abstract: Molecular markers that are widely distributed throughout the genome offer a fundamental tool to optimize forest tree breeding programs. This study aimed to evaluate the genetic architecture of quantitative traits and to optimize genomic selection models for growth and wood-quality traits of Eucalyptus grandis. We evaluated an open-pollinated breeding population of 1,772 genotypes from 27 families, established in a randomized complete block design with 20 replicates each. Individuals were genotyped with the Illumina Infinium EuCHIP60K chip, and 12 phenotypic variables were evaluated, comprising growth traits (diameter at breast height, height, and volume, each measured at 3 and 6 years after planting) and wood-quality traits (pure cellulose yield, basic wood density, syringyl/guaiacyl ratio, soluble lignin, total solids, and total extractives). First, we performed a genome-wide association study (GWAS) using a single-trait model (FarmCPU) and a multi-trait mixed model (MTMM). Next, we searched for quantitative trait loci (QTLs) and their predicted functional effects using an annotation database for Eucalyptus. Subsequently, the prediction ability, coincidence of selection (CS), and selection gains (SG) of genomic selection models were analyzed using the Genomic Best Linear Unbiased Prediction (GBLUP) method. We tested approaches considering additive variance only, additive plus dominance variance, optimization of the training set, and multi-trait models. Finally, we analyzed the efficiency of using growth traits to increase the prediction ability of wood-quality traits with a multi-trait model combined with training-set optimization. After quality control, 21,254 informative SNPs remained, widely distributed across the 11 chromosomes and showing high linkage disequilibrium decay.
For the GWAS analysis, the FarmCPU model identified 43 and 38 small-effect markers significantly associated with growth and wood-quality traits, respectively. Pleiotropic SNPs were also identified with the MTMM model, both among growth traits (24) and among wood-quality traits (6). Gene ontology analysis identified genes involved in plant growth and related to water stress. For the genomic selection analysis, growth traits appeared to be more influenced by dominance than wood-quality traits, while GBLUP models were effective in predicting wood-quality traits. Although CS values appeared low, SG values were relatively high. Training-set optimization effectively selected the best genotypes to compose the training set. Additionally, the multi-trait model and the multi-trait model with training-set optimization also increased the prediction ability of the GBLUP models. Thus, information from growth traits can be used to effectively increase the prediction ability for wood-quality traits. Our study demonstrates the complex nature of quantitative traits, provides new evidence on the architecture of genes related to trait expression, and highlights the efficiency of genomic selection models for predicting phenotypic expression in E. grandis.
id UNSP_1f3502b9190adbf00a78c762f9cad4e3
oai_identifier_str oai:repositorio.unesp.br:11449/234813
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
spelling Genome-wide association and optimization of prediction accuracies on genomic selection models for Eucalyptus grandis W. Hill
Associação genômica ampla e otimização de acurácias de predição em modelos de seleção genômica para Eucalyptus grandis W. Hill
genome-wide association
genomic selection
multi-trait analysis
forest breeding
eucalypt
associação genômica ampla
seleção genômica
análise multi-trait
melhoramento florestal
eucalipto
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Universidade Estadual Paulista (Unesp)
Tambarussi, Evandro Vagner
Jamarillo, Juan Jose Acosta
Fritsche-Neto, Roberto
Benatti, Thiago Romanos
Universidade Estadual Paulista (Unesp)
Rocha, Lucas Fernandes [UNESP]
2022-05-20T20:22:06Z
2022-05-20T20:22:06Z
2022-03-29
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/doctoralThesis
application/pdf
http://hdl.handle.net/11449/234813
33004064082P6
eng
info:eu-repo/semantics/openAccess
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
2024-05-03T12:48:39Z
oai:repositorio.unesp.br:11449/234813
Repositório Institucional
PUB
http://repositorio.unesp.br/oai/request
repositoriounesp@unesp.br
opendoar:2946
2024-05-03T12:48:39
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
false
dc.title.none.fl_str_mv Genome-wide association and optimization of prediction accuracies on genomic selection models for Eucalyptus grandis W. Hill
Associação genômica ampla e otimização de acurácias de predição em modelos de seleção genômica para Eucalyptus grandis W. Hill
author Rocha, Lucas Fernandes [UNESP]
author_facet Rocha, Lucas Fernandes [UNESP]
author_role author
dc.contributor.none.fl_str_mv Tambarussi, Evandro Vagner
Jamarillo, Juan Jose Acosta
Fritsche-Neto, Roberto
Benatti, Thiago Romanos
Universidade Estadual Paulista (Unesp)
dc.contributor.author.fl_str_mv Rocha, Lucas Fernandes [UNESP]
dc.subject.por.fl_str_mv genome-wide association
genomic selection
multi-trait analysis
forest breeding
eucalypt
associação genômica ampla
seleção genômica
análise multi-trait
melhoramento florestal
eucalipto
publishDate 2022
dc.date.none.fl_str_mv 2022-05-20T20:22:06Z
2022-05-20T20:22:06Z
2022-03-29
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/11449/234813
33004064082P6
url http://hdl.handle.net/11449/234813
identifier_str_mv 33004064082P6
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Estadual Paulista (Unesp)
publisher.none.fl_str_mv Universidade Estadual Paulista (Unesp)
dc.source.none.fl_str_mv reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
instname_str Universidade Estadual Paulista (UNESP)
instacron_str UNESP
institution UNESP
reponame_str Repositório Institucional da UNESP
collection Repositório Institucional da UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
repository.mail.fl_str_mv repositoriounesp@unesp.br
_version_ 1826304669589176320