Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms.

Bibliographic details
Main author: SOUSA, I. C. de
Publication date: 2020
Other authors: NASCIMENTO, M., SILVA, G. N., NASCIMENTO, A. C. C., CRUZ, C. D., SILVA, F. F. e, ALMEIDA, D. P. de, PESTANA, K. N., AZEVEDO, C. F., ZAMBOLIM, L., CAIXETA, E. T.
Document type: Article
Language: English
Source title: Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
Full text: http://www.alice.cnptia.embrapa.br/alice/handle/doc/1125524
http://dx.doi.org/10.1590/1678-992X-2020-0021
Abstract: Genomic selection (GS) emphasizes the simultaneous prediction of the genetic effects of thousands of markers scattered across the genome. Several statistical methodologies have been used in GS to predict genetic merit. In general, such methodologies require certain assumptions about the data, such as normally distributed phenotypic values. To circumvent non-normality of phenotypic values, the literature suggests the use of Bayesian Generalized Linear Regression (GBLASSO). Another alternative is machine-learning models, such as Artificial Neural Networks (ANN), Decision Trees (DT), and DT refinements such as Bagging, Random Forest, and Boosting. This study aimed to use DT and its refinements to predict resistance to orange rust in Arabica coffee. Additionally, DT and its refinements were used to assess the importance of markers related to the trait of interest. The results were compared with those from GBLASSO and ANN. Data on coffee rust resistance of 245 Arabica coffee plants genotyped for 137 markers were used. The DT refinements presented Apparent Error Rates equal to or lower than those obtained by DT, GBLASSO, and ANN. Moreover, the DT refinements were able to identify important markers for the trait of interest. Of the 14 most important markers identified by each methodology, 9.3 on average were in regions of quantitative trait loci (QTLs) related to disease resistance reported in the literature.
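The workflow the abstract describes (fit tree-based classifiers on marker genotypes, score them by Apparent Error Rate, rank markers by importance) can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation. The data are synthetic, the 0/1 genotype coding and the single "causal" marker are assumptions made for illustration, one-split decision stumps stand in for full decision trees, and importance is simply how often a marker is chosen across bootstrap replicates. Only the dimensions (245 plants, 137 markers) come from the abstract. The error rate is computed by resubstitution on the training plants, which is also an assumption, since the abstract does not say how it was estimated.

```python
import random
from collections import Counter


def apparent_error_rate(y_true, y_pred):
    """Share of plants whose predicted class differs from the observed class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_stump(X, y):
    """Fit a one-split tree: choose the marker whose genotype-to-class rule
    makes the fewest errors on the training sample."""
    best = None
    for j in range(len(X[0])):
        # Count classes within each genotype code observed at marker j.
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(row[j], Counter())[label] += 1
        rule = {g: c.most_common(1)[0][0] for g, c in groups.items()}
        errors = sum(rule[row[j]] != label for row, label in zip(X, y))
        if best is None or errors < best[0]:
            best = (errors, j, rule)
    default = Counter(y).most_common(1)[0][0]  # fallback for unseen genotypes
    return best[1], best[2], default


def predict_stump(stump, row):
    marker, rule, default = stump
    return rule.get(row[marker], default)


def bagging(X, y, n_estimators=25, seed=0):
    """Bagging: fit one stump per bootstrap resample of the plants."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Majority vote over the bagged stumps."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


def marker_importance(stumps):
    """Hypothetical importance score: how often each marker was selected."""
    return Counter(marker for marker, _, _ in stumps)


# Synthetic data (assumption): 245 plants x 137 markers coded 0/1.
# Marker index 5 determines the class, with every 10th label flipped as noise.
N_PLANTS, N_MARKERS, CAUSAL = 245, 137, 5
rng = random.Random(42)
X = [[rng.randint(0, 1) for _ in range(N_MARKERS)] for _ in range(N_PLANTS)]
y = [row[CAUSAL] if i % 10 else 1 - row[CAUSAL] for i, row in enumerate(X)]

stumps = bagging(X, y, n_estimators=25, seed=1)
y_pred = [predict_bagging(stumps, row) for row in X]
aer = apparent_error_rate(y, y_pred)
importance = marker_importance(stumps)

print(f"Apparent error rate: {aer:.3f}")
print("Most frequently selected markers:", importance.most_common(3))
```

Random Forest and Boosting follow the same pattern with different ways of building and combining the trees. On real genotype data an established library would normally replace these hand-written helpers.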
Author affiliations: Ithalo Coelho de Sousa, Universidade Federal de Viçosa; Moysés Nascimento, Universidade Federal de Viçosa; Gabi Nunes Silva, Universidade Federal de Rondônia; Ana Carolina Campana Nascimento, Universidade Federal de Viçosa; Cosme Damião Cruz, Universidade Federal de Viçosa; Fabyano Fonseca e Silva, Universidade Federal de Viçosa; Dênia Pires de Almeida, Universidade Federal de Viçosa; Kátia Nogueira Pestana, Embrapa Mandioca e Fruticultura; Camila Ferreira Azevedo, Universidade Federal de Viçosa; Laércio Zambolim, Universidade Federal de Viçosa; Eveline Teixeira Caixeta Moura, CNPCa.
Keywords: Statistical learning; Hemileia vastatrix; Plant breeding; Artificial intelligence
Record dates (dc.date): 2020-10-16T09:14:16Z; 2020-10-15; 2021
Status: published version
Citation: Scientia Agricola, v. 78, n. 4, e20200021, 2021.
Access rights: open access
Institution: Empresa Brasileira de Pesquisa Agropecuária (Embrapa)
Repository contact: cg-riaa@embrapa.br