Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize

Bibliographic details
Main author: Gevartosky, Raysa
Publication date: 2021
Document type: Master's thesis
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da USP
Full text: https://www.teses.usp.br/teses/disponiveis/11/11137/tde-07012022-094055/
Abstract: The success of genomic prediction (GP) depends directly on how the training population is established. Incorporating high-quality envirotyping data increases the efficiency of GP models, especially for multi-environment trials, and provides a better explanation of the sources of variation. Envirotyping can therefore help in multi-trait multi-environment trials (MTMET) by improving predictive ability (PA), selecting information more assertively, and capturing relationships between environments and genotypes. In this study, we aimed to design optimized training sets for MTMET, reducing phenotyping labor through smaller but optimally selected populations while keeping predictive ability at satisfactory levels. To that end, we evaluated the predictive ability of five GP models, using the genomic best linear unbiased predictor (GBLUP) model with additive + dominance effects (M1) as the gold standard and then adding genotype by environment interaction (G × E) (M2), enviromic data (W) (M3), W + G × E (M4), and finally W + G × W (M5), where G × W denotes the genotype by enviromic interaction. We considered single-trait multi-environment trials (STMET) and MTMET for three traits: grain yield (GY), plant height (PH), and ear height (EH), with two datasets and two cross-validation schemes. We then built two kernels, one for the genotype by environment by trait interaction (GET) and one for the genotype by enviromic by trait interaction (GWT), and applied genetic algorithms to select the genotype:environment:trait combinations that represent 98% of the variation in the whole dataset; these combinations compose the optimized training set (OTS). Next, we performed GP with the OTS and assessed its PA and genetic gain per amount invested, comparing the benchmark scenario (MTMET CV2) with the OTS on both measures. In the best OTS scenario, which included the GWT kernel, PA was reduced by up to 60%. On the other hand, the number of plot:trait combinations to be phenotyped could be reduced by up to 98%. Furthermore, using an OTS based on enviromic data increased the response to selection per amount invested by 142%. Our results therefore suggest that optimization by genetic algorithms, combined with genomic and enviromic data, is efficient for designing optimized training sets for genomic prediction and improves the genetic gain per dollar invested. However, it is worth remembering that there are specific interactions within datasets that should not be ignored when using the proposed approach.
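The abstract does not state how the G × W kernel is assembled, so the following is only a minimal sketch of one common construction, under stated assumptions: a linear genomic kernel from centered marker codes, a linear enviromic kernel from centered environmental covariates, and their Kronecker product as the kernel for genotype by environment combinations. The toy data and all names (`markers`, `env_covariates`, `linear_kernel`, `kronecker`) are hypothetical and do not come from the thesis; in practice the covariates would usually also be scaled, and dominance and trait terms are omitted here.

```python
# Sketch only: one common way to build a genotype-by-enviromic kernel.
# All data and names below are hypothetical, not taken from the thesis.

def center_columns(matrix):
    """Subtract each column's mean from its entries."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in matrix]

def linear_kernel(matrix):
    """Return X X' from column-centered X, scaled so the mean diagonal is 1."""
    x = center_columns(matrix)
    k = [[sum(a * b for a, b in zip(r1, r2)) for r2 in x] for r1 in x]
    mean_diag = sum(k[i][i] for i in range(len(k))) / len(k)
    return [[value / mean_diag for value in row] for row in k]

def kronecker(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
        for i in range(len(a))
        for k in range(len(b))
    ]

# Toy inputs: 3 genotypes x 4 markers (coded 0/1/2) and
# 2 environments x 3 environmental covariates.
markers = [
    [0, 1, 2, 1],
    [2, 1, 0, 0],
    [1, 0, 1, 2],
]
env_covariates = [
    [24.1, 110.0, 6.2],
    [27.3, 85.0, 5.1],
]

g_kernel = linear_kernel(markers)          # 3 x 3 genomic relationship
w_kernel = linear_kernel(env_covariates)   # 2 x 2 enviromic relationship

# Rows and columns are ordered environment-major: (env 0, gen 0),
# (env 0, gen 1), ..., (env 1, gen 2).
gw_kernel = kronecker(w_kernel, g_kernel)  # 6 x 6 genotype-by-enviromic kernel
```

Under the same assumptions, a three-way kernel such as GWT could be sketched by taking one more Kronecker product with a trait covariance matrix, although the thesis itself should be consulted for the construction actually used.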
id USP_5ec0f65b4a725a59852d1d30e3d1d3b4
oai_identifier_str oai:teses.usp.br:tde-07012022-094055
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str 2721
dc.title.none.fl_str_mv Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize
Kernels com base ambiental otimizam a alocação de recursos com predição genômica multi-características e multi-ambientes para milho tropical
title Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize
author Gevartosky, Raysa
author_facet Gevartosky, Raysa
author_role author
dc.contributor.none.fl_str_mv Fritsche Neto, Roberto
dc.contributor.author.fl_str_mv Gevartosky, Raysa
dc.subject.por.fl_str_mv Capacidade preditiva
Caracterização ambiental
Envirotyping
Genomic selection
Optimized training set
População de treinamento otimizada
Predictive ability
Response to selection
Resposta à seleção
Seleção genômica
publishDate 2021
dc.date.none.fl_str_mv 2021-04-20
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://www.teses.usp.br/teses/disponiveis/11/11137/tde-07012022-094055/
url https://www.teses.usp.br/teses/disponiveis/11/11137/tde-07012022-094055/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Release the content for public access.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1809090468084973568