The optimal number of partial least squares components in genomic selection for pork pH

Bibliographic Details
Main Author: Silveira,Fernanda Gomes da
Publication Date: 2017
Other Authors: Duarte,Darlene Ana Souza, Chaves,Lucas Monteiro, Silva,Fabyano Fonseca e, Filho,Ivan Carvalho, Duarte,Marcio de Souza, Lopes,Paulo Sávio, Guimarães,Simone Eliza Facioni
Format: Article
Language: eng
Source: Ciência Rural
Download full: http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0103-84782017000100652
Summary: ABSTRACT: The main application of genomic selection (GS) is the early identification of genetically superior animals for traits that are difficult to measure or evaluated late, such as meat pH (measured after slaughter). Because the number of markers in GS is generally larger than the number of genotyped animals and these markers are highly correlated owing to linkage disequilibrium, statistical methods based on dimensionality reduction have been proposed. Among them, the partial least squares (PLS) technique stands out because of its simplicity and high predictive accuracy. However, choosing the optimal number of components remains a relevant issue for PLS applications. Thus, we applied PLS (along with principal component and traditional multiple regression) techniques to GS for pork pH traits (with pH measured at 45 min and 24 h after slaughter) and also identified the optimal number of PLS components based on the degree-of-freedom (DoF) and cross-validation (CV) methods. The PLS method outperformed the principal component and traditional multiple regression techniques, enabling satisfactory predictions for pork pH traits using only genotypic data (low-density SNP panel). Furthermore, the SNP marker estimates from PLS revealed a relevant region on chromosome 4, which may affect these traits. The DoF and CV methods showed similar results for determining the optimal number of components in PLS analysis; thus, from the statistical viewpoint, the DoF method should be preferred because of its theoretical background (based on "statistical information theory"), whereas CV is an empirical method based on computational effort.
id UFSM-2_1c12d4eacf0ca87e3ccb57c21d406cd0
oai_identifier_str oai:scielo:S0103-84782017000100652
network_acronym_str UFSM-2
network_name_str Ciência rural (Online)
dc.subject.por.fl_str_mv SNP
genomic prediction
meat quality
dc.date.none.fl_str_mv 2017-01-01
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.relation.none.fl_str_mv 10.1590/0103-8478cr20151563
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv text/html
dc.publisher.none.fl_str_mv Universidade Federal de Santa Maria
dc.source.none.fl_str_mv Ciência Rural v.47 n.1 2017
reponame:Ciência Rural
instname:Universidade Federal de Santa Maria (UFSM)
instacron:UFSM