Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize
Main author: | Gevartosky, Raysa |
---|---|
Publication date: | 2021 |
Document type: | Master's thesis |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da USP |
Full text: | https://www.teses.usp.br/teses/disponiveis/11/11137/tde-07012022-094055/ |
Abstract: | Genomic prediction (GP) success depends directly on establishing a training population. Incorporating high-quality envirotyping data increases the efficiency of GP models, especially for multi-environment trials, and provides a better explanation of the sources of variation. Thus, it can help in multi-trait multi-environment trials (MTMET) by improving predictive ability (PA), selecting information more assertively, and capturing relationships between environments and genotypes. Therefore, in this study, we aimed to design optimized training sets for MTMET. Phenotyping labor is reduced by using smaller but optimally selected populations while keeping predictive ability at satisfactory levels. To that end, we evaluated the predictive ability of five GP models, using the genomic best linear unbiased prediction (GBLUP) model with additive + dominance effects (M1) as the gold standard and then adding genotype by environment interaction (G × E) (M2), enviromic data (W) (M3), W + G × E (M4), and finally W + G × W (M5), where G × W denotes the genotype by enviromic interaction. Moreover, we considered single-trait multi-environment trials (STMET) and MTMET for three traits: grain yield (GY), plant height (PH), and ear height (EH), with two datasets and two cross-validation schemes. Afterward, we built two kernels, for genotype by environment by trait interaction (GET) and genotype by enviromic by trait interaction (GWT), and applied genetic algorithms to select the genotype:environment:trait combinations that represent 98% of the variation in the whole dataset and compose the optimized training set (OTS). Then, we performed GP and assessed its PA and genetic gain per amount invested. Subsequently, we compared the benchmark scenario (MTMET CV2) with the OTS regarding PA and genetic gain per amount invested. Considering the best scenario for OTS, which included the GWT kernel, there was a reduction of up to 60% in PA. On the other hand, it was possible to reduce the number of plot:trait combinations to be phenotyped by up to 98%. Furthermore, using OTS based on enviromic data, it was possible to increase the response to selection per amount invested by 142%. Consequently, our results suggest that optimization by genetic algorithms combined with genomic and enviromic data is efficient in designing optimized training sets for genomic prediction and improves the genetic gain per dollar invested. However, it is worth remembering that there are specific interactions within datasets that should not be ignored when using the proposed approach. |
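The abstract names the GET and GWT kernels but this record does not say how they were assembled. As a rough illustration only, the sketch below uses one common construction for a balanced genotype × environment × trait layout: a Kronecker product of a trait kernel, an environment kernel, and a genomic kernel. Everything in it is an assumption and not the thesis's method: the simulated marker and envirotyping matrices, the helper `linear_kernel`, the identity trait kernel `T`, and the Kronecker ordering are all hypothetical.

```python
import numpy as np

def linear_kernel(X):
    """Linear kernel XX'/p on centered, scaled columns, rescaled to mean diagonal 1."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    sd = Xc.std(axis=0)
    sd[sd == 0] = 1.0              # constant columns contribute nothing
    Xs = Xc / sd
    K = Xs @ Xs.T / Xs.shape[1]
    return K / np.mean(np.diag(K))

rng = np.random.default_rng(0)
n_gen, n_env, n_trait = 4, 3, 2

# Hypothetical inputs: SNP dosages (0/1/2) and envirotyping covariates.
markers = rng.integers(0, 3, size=(n_gen, 50))
env_cov = rng.normal(size=(n_env, 10))

G = linear_kernel(markers)   # genomic relationship among genotypes
W = linear_kernel(env_cov)   # enviromic similarity among environments
E = np.eye(n_env)            # environments treated as unrelated (no envirotyping)
T = np.eye(n_trait)          # assumption: traits treated as unrelated

# Rows/columns are ordered trait-major, then environment, then genotype.
GET = np.kron(T, np.kron(E, G))
GWT = np.kron(T, np.kron(W, G))

print(GET.shape, GWT.shape)
```

Under these assumptions, each row of `GET` or `GWT` corresponds to one genotype:environment:trait combination, which is the unit a training-set optimizer would select. Replacing `E` with `W` is what lets information flow between environments with similar envirotyping profiles.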
id |
USP_5ec0f65b4a725a59852d1d30e3d1d3b4 |
---|---|
oai_identifier_str |
oai:teses.usp.br:tde-07012022-094055 |
network_acronym_str |
USP |
network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
repository_id_str |
2721 |
dc.title.none.fl_str_mv |
Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize / Kernels com base ambiental otimizam a alocação de recursos com predição genômica multi-características e multi-ambientes para milho tropical |
title |
Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize |
spellingShingle |
Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize Gevartosky, Raysa Capacidade preditiva Caracterização ambiental Envirotyping Genomic selection Optimized training set População de treinamento otimizada Predictive ability Response to selection Resposta à seleção Seleção genômica |
title_short |
Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize |
title_full |
Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize |
title_fullStr |
Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize |
title_full_unstemmed |
Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize |
title_sort |
Enviromic-based kernels optimize resource allocation with multi-trait multi-environment genomic prediction for tropical maize |
author |
Gevartosky, Raysa |
author_facet |
Gevartosky, Raysa |
author_role |
author |
dc.contributor.none.fl_str_mv |
Fritsche Neto, Roberto |
dc.contributor.author.fl_str_mv |
Gevartosky, Raysa |
dc.subject.por.fl_str_mv |
Capacidade preditiva; Caracterização ambiental; Envirotyping; Genomic selection; Optimized training set; População de treinamento otimizada; Predictive ability; Response to selection; Resposta à seleção; Seleção genômica |
topic |
Capacidade preditiva; Caracterização ambiental; Envirotyping; Genomic selection; Optimized training set; População de treinamento otimizada; Predictive ability; Response to selection; Resposta à seleção; Seleção genômica |
description |
Genomic prediction (GP) success depends directly on establishing a training population. Incorporating high-quality envirotyping data increases the efficiency of GP models, especially for multi-environment trials, and provides a better explanation of the sources of variation. Thus, it can help in multi-trait multi-environment trials (MTMET) by improving predictive ability (PA), selecting information more assertively, and capturing relationships between environments and genotypes. Therefore, in this study, we aimed to design optimized training sets for MTMET. Phenotyping labor is reduced by using smaller but optimally selected populations while keeping predictive ability at satisfactory levels. To that end, we evaluated the predictive ability of five GP models, using the genomic best linear unbiased prediction (GBLUP) model with additive + dominance effects (M1) as the gold standard and then adding genotype by environment interaction (G × E) (M2), enviromic data (W) (M3), W + G × E (M4), and finally W + G × W (M5), where G × W denotes the genotype by enviromic interaction. Moreover, we considered single-trait multi-environment trials (STMET) and MTMET for three traits: grain yield (GY), plant height (PH), and ear height (EH), with two datasets and two cross-validation schemes. Afterward, we built two kernels, for genotype by environment by trait interaction (GET) and genotype by enviromic by trait interaction (GWT), and applied genetic algorithms to select the genotype:environment:trait combinations that represent 98% of the variation in the whole dataset and compose the optimized training set (OTS). Then, we performed GP and assessed its PA and genetic gain per amount invested. Subsequently, we compared the benchmark scenario (MTMET CV2) with the OTS regarding PA and genetic gain per amount invested. Considering the best scenario for OTS, which included the GWT kernel, there was a reduction of up to 60% in PA. On the other hand, it was possible to reduce the number of plot:trait combinations to be phenotyped by up to 98%. Furthermore, using OTS based on enviromic data, it was possible to increase the response to selection per amount invested by 142%. Consequently, our results suggest that optimization by genetic algorithms combined with genomic and enviromic data is efficient in designing optimized training sets for genomic prediction and improves the genetic gain per dollar invested. However, it is worth remembering that there are specific interactions within datasets that should not be ignored when using the proposed approach. |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-04-20 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://www.teses.usp.br/teses/disponiveis/11/11137/tde-07012022-094055/ |
url |
https://www.teses.usp.br/teses/disponiveis/11/11137/tde-07012022-094055/ |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
|
dc.rights.driver.fl_str_mv |
Release the content for public access. info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Release the content for public access. |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.coverage.none.fl_str_mv |
|
dc.publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
instname_str |
Universidade de São Paulo (USP) |
instacron_str |
USP |
institution |
USP |
reponame_str |
Biblioteca Digital de Teses e Dissertações da USP |
collection |
Biblioteca Digital de Teses e Dissertações da USP |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
repository.mail.fl_str_mv |
virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br |
_version_ |
1815256738292563968 |