Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms
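Main author: | Sousa, Ithalo Coelho de |
---|---|
Publication date: | 2021 |
Other authors: | Nascimento, Moysés; Silva, Gabi Nunes; Nascimento, Ana Carolina Campana; Cruz, Cosme Damião; Silva, Fabyano Fonseca e; Almeida, Dênia Pires de; Pestana, Kátia Nogueira; Azevedo, Camila Ferreira; Zambolim, Laércio; Caixeta, Eveline Teixeira |
Document type: | Article |
Language: | eng |
Source title: | Scientia Agrícola (Online) |
Full text: | http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0103-90162021000401102 |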
Abstract: | Genomic selection (GS) emphasizes the simultaneous prediction of the genetic effects of thousands of markers scattered across the genome. Several statistical methodologies have been used in GS for the prediction of genetic merit. In general, such methodologies require certain assumptions about the data, such as the normality of the distribution of phenotypic values. To circumvent the non-normality of phenotypic values, the literature suggests the use of Bayesian Generalized Linear Regression (GBLASSO). Another alternative is models based on machine learning, represented by methodologies such as Artificial Neural Networks (ANN), Decision Trees (DT), and refinements of DT such as Bagging, Random Forest, and Boosting. This study aimed to use DT and its refinements to predict resistance to orange rust in Arabica coffee. Additionally, DT and its refinements were used to identify the importance of markers related to the trait of interest. The results were compared with those from GBLASSO and ANN. Data on coffee rust resistance of 245 Arabica coffee plants genotyped for 137 markers were used. The DT refinements presented Apparent Error Rate values equal to or lower than those obtained by DT, GBLASSO, and ANN. Moreover, DT refinements were able to identify important markers for the trait of interest. Of the 14 most important markers analyzed in each methodology, 9.3 on average were in regions of quantitative trait loci (QTLs) related to disease resistance reported in the literature. |
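
The abstract compares methods by their Apparent Error Rate and uses tree ensembles to rank markers. The following is a minimal sketch of those two ideas, not the authors' implementation: it assumes the Apparent Error Rate is the fraction of plants misclassified on the same data used for fitting, it swaps in depth-1 trees (stumps) with simple bagging in place of full decision trees, and it runs on synthetic genotypes. The data, the causal marker index `CAUSAL_MARKER`, the 10% label noise, and all function names are hypothetical; only the dimensions (245 plants, 137 markers) come from the record.

```python
import random
from collections import Counter

N_PLANTS, N_MARKERS = 245, 137   # dimensions taken from the abstract
CAUSAL_MARKER = 10               # hypothetical: the one marker that drives resistance


def apparent_error_rate(y_true, y_pred):
    """Fraction of observations misclassified on the data used to fit the model."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_stump(X, y):
    """Depth-1 tree: choose the marker whose genotype classes best separate the labels."""
    default = Counter(y).most_common(1)[0][0]
    best = None
    for j in range(len(X[0])):
        counts = {}
        for row, label in zip(X, y):
            counts.setdefault(row[j], Counter())[label] += 1
        # Each genotype class predicts its own majority label.
        rule = {g: c.most_common(1)[0][0] for g, c in counts.items()}
        errors = sum(rule[row[j]] != label for row, label in zip(X, y))
        if best is None or errors < best[0]:
            best = (errors, j, rule)
    _, marker, rule = best
    return marker, rule, default


def predict_stump(stump, row):
    marker, rule, default = stump
    return rule.get(row[marker], default)


def fit_bagging(X, y, n_trees, rng):
    """Fit one stump per bootstrap resample of the plants."""
    n = len(X)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Majority vote over the bagged stumps."""
    return Counter(predict_stump(s, row) for s in stumps).most_common(1)[0][0]


# Synthetic data (assumption): genotypes coded 0/1/2, resistance set by one marker plus noise.
rng = random.Random(0)
X = [[rng.choice((0, 1, 2)) for _ in range(N_MARKERS)] for _ in range(N_PLANTS)]
y = []
for row in X:
    label = 1 if row[CAUSAL_MARKER] >= 1 else 0
    if rng.random() < 0.10:          # flip roughly 10% of labels
        label = 1 - label
    y.append(label)

stump = fit_stump(X, y)
bag = fit_bagging(X, y, n_trees=25, rng=rng)

aer_tree = apparent_error_rate(y, [predict_stump(stump, r) for r in X])
aer_bag = apparent_error_rate(y, [predict_bagging(bag, r) for r in X])

# Crude importance measure: how often each marker was chosen as the split.
importance = Counter(s[0] for s in bag)

print("AER single stump:", aer_tree)
print("AER bagged stumps:", aer_bag)
print("Most frequently selected markers:", importance.most_common(3))
```

Because the error is measured on the training data, it is optimistic; a held-out or cross-validated error would be the usual complement when comparing methods.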
id |
USP-18_4a66d42456180cb7550e237e256086e4 |
---|---|
oai_identifier_str |
oai:scielo:S0103-90162021000401102 |
network_acronym_str |
USP-18 |
network_name_str |
Scientia Agrícola (Online) |
repository_id_str |
|
spelling |
Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms; Hemileia vastatrix; statistical learning; plant breeding; artificial intelligence; ABSTRACT Genomic selection (GS) emphasizes the simultaneous prediction of the genetic effects of thousands of markers scattered across the genome. Several statistical methodologies have been used in GS for the prediction of genetic merit. In general, such methodologies require certain assumptions about the data, such as the normality of the distribution of phenotypic values. To circumvent the non-normality of phenotypic values, the literature suggests the use of Bayesian Generalized Linear Regression (GBLASSO). Another alternative is models based on machine learning, represented by methodologies such as Artificial Neural Networks (ANN), Decision Trees (DT), and refinements of DT such as Bagging, Random Forest, and Boosting. This study aimed to use DT and its refinements to predict resistance to orange rust in Arabica coffee. Additionally, DT and its refinements were used to identify the importance of markers related to the trait of interest. The results were compared with those from GBLASSO and ANN. Data on coffee rust resistance of 245 Arabica coffee plants genotyped for 137 markers were used. The DT refinements presented Apparent Error Rate values equal to or lower than those obtained by DT, GBLASSO, and ANN. Moreover, DT refinements were able to identify important markers for the trait of interest. Of the 14 most important markers analyzed in each methodology, 9.3 on average were in regions of quantitative trait loci (QTLs) related to disease resistance reported in the literature.; Escola Superior de Agricultura "Luiz de Queiroz"; 2021-01-01; info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; text/html; http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0103-90162021000401102; Scientia Agricola v.78 n.4 2021; reponame:Scientia Agrícola (Online); instname:Universidade de São Paulo (USP); instacron:USP; 10.1590/1678-992x-2020-0021; info:eu-repo/semantics/openAccess; Sousa, Ithalo Coelho de; Nascimento, Moysés; Silva, Gabi Nunes; Nascimento, Ana Carolina Campana; Cruz, Cosme Damião; Silva, Fabyano Fonseca e; Almeida, Dênia Pires de; Pestana, Kátia Nogueira; Azevedo, Camila Ferreira; Zambolim, Laércio; Caixeta, Eveline Teixeira; eng; 2020-07-06T00:00:00Z; oai:scielo:S0103-90162021000401102; Revista; http://revistas.usp.br/sa/index; PUB; https://old.scielo.br/oai/scielo-oai.php; scientia@usp.br||alleoni@usp.br; 1678-992X; 0103-9016; opendoar:2020-07-06T00:00; Scientia Agrícola (Online) - Universidade de São Paulo (USP); false |
dc.title.none.fl_str_mv |
Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms |
title |
Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms |
spellingShingle |
Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms; Sousa, Ithalo Coelho de; Hemileia vastatrix; statistical learning; plant breeding; artificial intelligence |
title_short |
Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms |
title_full |
Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms |
title_fullStr |
Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms |
title_full_unstemmed |
Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms |
title_sort |
Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms |
author |
Sousa, Ithalo Coelho de |
author_facet |
Sousa, Ithalo Coelho de; Nascimento, Moysés; Silva, Gabi Nunes; Nascimento, Ana Carolina Campana; Cruz, Cosme Damião; Silva, Fabyano Fonseca e; Almeida, Dênia Pires de; Pestana, Kátia Nogueira; Azevedo, Camila Ferreira; Zambolim, Laércio; Caixeta, Eveline Teixeira |
author_role |
author |
author2 |
Nascimento, Moysés; Silva, Gabi Nunes; Nascimento, Ana Carolina Campana; Cruz, Cosme Damião; Silva, Fabyano Fonseca e; Almeida, Dênia Pires de; Pestana, Kátia Nogueira; Azevedo, Camila Ferreira; Zambolim, Laércio; Caixeta, Eveline Teixeira |
author2_role |
author author author author author author author author author author |
dc.contributor.author.fl_str_mv |
Sousa, Ithalo Coelho de; Nascimento, Moysés; Silva, Gabi Nunes; Nascimento, Ana Carolina Campana; Cruz, Cosme Damião; Silva, Fabyano Fonseca e; Almeida, Dênia Pires de; Pestana, Kátia Nogueira; Azevedo, Camila Ferreira; Zambolim, Laércio; Caixeta, Eveline Teixeira |
dc.subject.por.fl_str_mv |
Hemileia vastatrix; statistical learning; plant breeding; artificial intelligence |
topic |
Hemileia vastatrix; statistical learning; plant breeding; artificial intelligence |
description |
ABSTRACT Genomic selection (GS) emphasizes the simultaneous prediction of the genetic effects of thousands of markers scattered across the genome. Several statistical methodologies have been used in GS for the prediction of genetic merit. In general, such methodologies require certain assumptions about the data, such as the normality of the distribution of phenotypic values. To circumvent the non-normality of phenotypic values, the literature suggests the use of Bayesian Generalized Linear Regression (GBLASSO). Another alternative is models based on machine learning, represented by methodologies such as Artificial Neural Networks (ANN), Decision Trees (DT), and refinements of DT such as Bagging, Random Forest, and Boosting. This study aimed to use DT and its refinements to predict resistance to orange rust in Arabica coffee. Additionally, DT and its refinements were used to identify the importance of markers related to the trait of interest. The results were compared with those from GBLASSO and ANN. Data on coffee rust resistance of 245 Arabica coffee plants genotyped for 137 markers were used. The DT refinements presented Apparent Error Rate values equal to or lower than those obtained by DT, GBLASSO, and ANN. Moreover, DT refinements were able to identify important markers for the trait of interest. Of the 14 most important markers analyzed in each methodology, 9.3 on average were in regions of quantitative trait loci (QTLs) related to disease resistance reported in the literature. |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-01-01 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0103-90162021000401102 |
url |
http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0103-90162021000401102 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
10.1590/1678-992x-2020-0021 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
text/html |
dc.publisher.none.fl_str_mv |
Escola Superior de Agricultura "Luiz de Queiroz" |
publisher.none.fl_str_mv |
Escola Superior de Agricultura "Luiz de Queiroz" |
dc.source.none.fl_str_mv |
Scientia Agricola v.78 n.4 2021; reponame:Scientia Agrícola (Online); instname:Universidade de São Paulo (USP); instacron:USP |
instname_str |
Universidade de São Paulo (USP) |
instacron_str |
USP |
institution |
USP |
reponame_str |
Scientia Agrícola (Online) |
collection |
Scientia Agrícola (Online) |
repository.name.fl_str_mv |
Scientia Agrícola (Online) - Universidade de São Paulo (USP) |
repository.mail.fl_str_mv |
scientia@usp.br||alleoni@usp.br |
_version_ |
1748936465661820928 |