Modelo fatorial analítico bayesiano aplicado à experimentos multi-ambiente
Main author: | Nuvunga, Joel Jorge |
---|---|
Publication date: | 2017 |
Document type: | Thesis |
Language: | por |
Source title: | Repositório Institucional da UFLA |
Full text: | http://repositorio.ufla.br/jspui/handle/1/12273 |
Abstract: | One of the main challenges in plant breeding programs is the efficient study of the genotype-by-environment interaction (GEI). A significant GEI makes it harder for the breeder to select and recommend superior genotypes. Among the statistical procedures developed for this purpose, those based on mixed models with factor analysis, commonly called factor-analytic (FA) models, deserve special emphasis. The FA model is a parsimonious approach with advantages over classical methodologies, such as great flexibility in handling unbalanced data and heterogeneous variances. However, it has some problems: the computational cost of analyses with a large number of environments, and Heywood cases, which make the model unidentifiable. Moreover, the conventional biplot representation of the model carries no measure of uncertainty for the plotted scores that describe the GEI effect or the genotype (G) + GEI effects. This work describes general ways in which heterogeneity of the genetic and residual covariances can be modeled from the perspective of factor analysis in mixed models, using a spectral decomposition of the genetic effects within a Bayesian approach, unlike other procedures in the literature, in which the factor loadings are sampled directly. A further objective was to develop a procedure that incorporates inference into the biplot through the construction of credible regions for the genotypic and environmental scores. Spherical distributions were assumed as priors for the eigenvectors and truncated normal distributions for the singular values, together with scaled inverse chi-squared distributions for the residual variances and a non-informative prior for the genotype effects. This approach differs from the Bayesian methods presented so far, which assume the same constraints as the mixed-effects model. To illustrate the proposed method, we used simulated data and real data in which the response variable is the yield of spikes in t ha⁻¹. Samples for inference were obtained directly with the Gibbs sampler. Random unbalancing was imposed on the data at levels of 10%, 33% and 50% loss of genotypes within environments. According to the results, the FA analysis with two factor loadings showed higher predictive capacity than the competing models. With 10% and 33% unbalance the mean correlations were above 0.40, and with 50% the mean correlation was 0.46. Model performance was best at 50% unbalance, followed by 33% and then 10%. The analysis with the Bayesian FA model also proved robust under high levels of unbalance. Model selection proved not to be a trivial task for the real data and required additional criteria. In addition, the proposed model showed greater predictive capacity than the equivalent frequentist model, and its parameters were adequately estimated and identifiable without rotation of the factor loadings or imposition of constraints, which is a major advantage of the proposed method. |
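The spectral decomposition mentioned in the abstract can be illustrated outside the Bayesian setting with an ordinary singular value decomposition of a genotype-by-environment table. The sketch below is not the thesis's Gibbs sampler or its prior structure. It only shows, under stated assumptions, how two-factor genotypic and environmental scores for a biplot arise from the singular vectors and singular values. The function name `biplot_scores`, the toy data, and the symmetric scaling of the scores are illustrative assumptions.

```python
import numpy as np

def biplot_scores(gxe, n_factors=2):
    """Two-way scores from the SVD of a double-centered G x E table.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    gxe = np.asarray(gxe, dtype=float)
    # Remove the grand mean and both main effects, leaving the GEI part.
    gei = (gxe
           - gxe.mean(axis=1, keepdims=True)
           - gxe.mean(axis=0, keepdims=True)
           + gxe.mean())
    # Spectral (singular value) decomposition: gei = U diag(s) Vt.
    u, s, vt = np.linalg.svd(gei, full_matrices=False)
    k = n_factors
    root = np.sqrt(s[:k])               # symmetric scaling (an assumption)
    geno_scores = u[:, :k] * root       # one row per genotype
    env_scores = vt[:k, :].T * root     # one row per environment
    fit = geno_scores @ env_scores.T    # rank-k approximation of the GEI
    return geno_scores, env_scores, fit, gei

# Toy data (assumed): main effects plus a rank-2 interaction, no noise.
rng = np.random.default_rng(0)
n_gen, n_env = 8, 5
g = rng.normal(size=(n_gen, 1))
e = rng.normal(size=(1, n_env))
a = rng.normal(size=(n_gen, 2))
b = rng.normal(size=(n_env, 2))
gxe = 5.0 + g + e + a @ b.T

geno_scores, env_scores, fit, gei = biplot_scores(gxe)
```

Because the toy interaction has rank two and no noise, the two-factor fit reproduces the centered table up to rounding error. With real data the fit is only an approximation, and a Bayesian treatment such as the one described in the abstract would attach posterior uncertainty to each score.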
id |
UFLA_f8a484d24448966e1338a63cef50c9cf |
---|---|
oai_identifier_str |
oai:localhost:1/12273 |
network_acronym_str |
UFLA |
network_name_str |
Repositório Institucional da UFLA |
repository_id_str |
|
spelling |
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) |
dc.title.none.fl_str_mv |
Modelo fatorial analítico bayesiano aplicado à experimentos multi-ambiente; Bayesian factor analytic model applied to multi-environment trials |
title |
Modelo fatorial analítico bayesiano aplicado à experimentos multi-ambiente |
author |
Nuvunga, Joel Jorge |
author_facet |
Nuvunga, Joel Jorge |
author_role |
author |
dc.contributor.none.fl_str_mv |
Balestre, Márcio; Lima, Renato Ribeiro de; Bueno Filho, Júlio Sílvio de Souza; Safadi, Thelma; Lima, Renato Ribeiro de; Toledo, Fernando H. R. Barrozo; Silva, Alessandra Querino da |
dc.contributor.author.fl_str_mv |
Nuvunga, Joel Jorge |
dc.subject.por.fl_str_mv |
Plantas – Melhoramento genético – Métodos estatísticos; Interação genótipo-ambiente; Modelo fatorial analítico; Teoria bayesiana de decisão estatística; Plant breeding – Statistical methods; Genotype-environment interaction; Factor-analytic model; Bayesian statistical decision theory; Estatística; Genética Vegetal |
publishDate |
2017 |
dc.date.none.fl_str_mv |
2017-02-16T11:34:45Z; 2017-02-16T11:34:45Z; 2017-02-15; 2017-01-19 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
NUVUNGA, J. J. Modelo fatorial analítico bayesiano aplicado à experimentos multi-ambiente. 2017. 129 p. Tese (Doutorado em Estatística e Experimentação Agropecuária)-Universidade Federal de Lavras, Lavras, 2017. http://repositorio.ufla.br/jspui/handle/1/12273 |
identifier_str_mv |
NUVUNGA, J. J. Modelo fatorial analítico bayesiano aplicado à experimentos multi-ambiente. 2017. 129 p. Tese (Doutorado em Estatística e Experimentação Agropecuária)-Universidade Federal de Lavras, Lavras, 2017. |
url |
http://repositorio.ufla.br/jspui/handle/1/12273 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Universidade Federal de Lavras; Programa de Pós-Graduação em Estatística e Experimentação Agropecuária; UFLA; brasil; Departamento de Ciências Exatas |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFLA; instname:Universidade Federal de Lavras (UFLA); instacron:UFLA |
instname_str |
Universidade Federal de Lavras (UFLA) |
instacron_str |
UFLA |
institution |
UFLA |
reponame_str |
Repositório Institucional da UFLA |
collection |
Repositório Institucional da UFLA |
repository.name.fl_str_mv |
Repositório Institucional da UFLA - Universidade Federal de Lavras (UFLA) |
repository.mail.fl_str_mv |
nivaldo@ufla.br || repositorio.biblioteca@ufla.br |
_version_ |
1815439297444052992 |