Imputation accuracy to whole-genome sequence in Nellore cattle
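Main author: | Fernandes Júnior, Gerardo A. [UNESP] |
---|---|
Publication date: | 2021 |
Other authors: | Carvalheiro, Roberto [UNESP]; de Oliveira, Henrique N. [UNESP]; Sargolzaei, Mehdi; Costilla, Roy; Ventura, Ricardo V.; Fonseca, Larissa F. S. [UNESP]; Neves, Haroldo H. R.; Hayes, Ben J.; de Albuquerque, Lucia G. [UNESP] |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1186/s12711-021-00622-5 http://hdl.handle.net/11449/208502 |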
Abstract: | Background: A cost-effective strategy to explore the complete DNA sequence in animals for genetic evaluation purposes is to sequence key ancestors of a population and then use imputation to infer marker genotypes that were not originally reported in a target population of animals genotyped with single nucleotide polymorphism (SNP) panels. The feasibility of this process relies on the accuracy of genotype imputation in that population, particularly for potential causal mutations, which may be at low frequency and located either within genes or in regulatory regions. The objective of the present study was to investigate imputation accuracy to the sequence level in a Nellore beef cattle population, including for variants in annotation classes that are more likely to be functional. Methods: Information from 151 key sequenced Nellore sires was used to assess the imputation accuracy from the bovine HD BeadChip SNP panel (~777k) to whole-genome sequence. The sires were chosen to optimize the imputation accuracy of a genotypic database comprising about 10,000 genotyped Nellore animals. Genotype imputation was performed using two computational approaches: FImpute3 and Minimac4 (after using Eagle for phasing). Imputation accuracy was evaluated using a fivefold cross-validation scheme and measured by the squared correlation between observed and imputed genotypes, calculated by individual and by SNP. SNPs were classified into a range of annotations, and the accuracy of imputation within each annotation class was also evaluated. Results: High average imputation accuracies per animal were achieved using both FImpute3 (0.94) and Minimac4 (0.95). On average, common variants (minor allele frequency (MAF) > 0.03) were more accurately imputed by Minimac4, and low-frequency variants (MAF ≤ 0.03) were more accurately imputed by FImpute3. The inherent Minimac4 Rsq imputation quality statistic appears to be a good indicator of the empirical Minimac4 imputation accuracy. Both programs provided high average SNP-wise imputation accuracy for all classes of biological annotations. Conclusions: Our results indicate that imputation to whole-genome sequence is feasible in Nellore beef cattle, since high imputation accuracies per individual are expected. SNP-wise imputation accuracy is software-dependent, especially for rare variants. The accuracy of imputation appears to be relatively independent of annotation classification. |
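The accuracy measure described in the abstract (squared correlation between observed and imputed genotypes, computed by individual and by SNP, with variants split at MAF 0.03) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 0/1/2 allele-count coding, and the toy matrices are assumptions made only for illustration.

```python
# Hypothetical helpers; genotypes are assumed to be coded as 0/1/2 copies
# of one allele, with rows = animals and columns = SNPs.

def squared_correlation(observed, imputed):
    """Squared Pearson correlation; None when it is undefined."""
    n = len(observed)
    if n != len(imputed) or n < 2:
        return None
    mean_o = sum(observed) / n
    mean_i = sum(imputed) / n
    cov = sum((o - mean_o) * (i - mean_i) for o, i in zip(observed, imputed))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    if var_o == 0 or var_i == 0:
        return None  # no variation on one side, so no correlation exists
    return cov * cov / (var_o * var_i)


def accuracy_by_animal(obs, imp):
    """One squared correlation per row (animal), taken across SNPs."""
    return [squared_correlation(o, i) for o, i in zip(obs, imp)]


def accuracy_by_snp(obs, imp):
    """One squared correlation per column (SNP), taken across animals."""
    return [squared_correlation(o, i) for o, i in zip(zip(*obs), zip(*imp))]


def minor_allele_frequency(genotypes):
    """MAF of one SNP from 0/1/2 allele counts."""
    p = sum(genotypes) / (2 * len(genotypes))
    return min(p, 1 - p)


def maf_class(maf, threshold=0.03):
    """Split used in the abstract: common if MAF > 0.03, else low-frequency."""
    return "common" if maf > threshold else "low-frequency"


# Toy data (made up): 4 animals x 4 SNPs.
observed = [[0, 1, 2, 0], [1, 1, 0, 0], [2, 0, 1, 0], [0, 2, 1, 1]]
imputed = [[0, 1, 2, 0], [1, 1, 0, 0], [2, 0, 1, 0], [0, 1, 1, 0]]

per_animal = accuracy_by_animal(observed, imputed)
per_snp = accuracy_by_snp(observed, imputed)
```

In the toy data the last SNP is imputed as monomorphic, so its SNP-wise accuracy is undefined (`None`) rather than zero. How such variants are handled affects averages for rare variants, and the sketch makes no claim about how the study treated them.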
id |
UNSP_8f6e0a0e38d6e7cc43fed0ae247b0e55 |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/208502 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP); Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); School of Agricultural and Veterinarian Sciences UNESP; National Council for Scientific and Technological Development CNPq; Ontario Veterinary College UG; Select Sires Inc.; Queensland Alliance for Agriculture and Food Innovation UQ; School of Veterinary Medicine and Animal Science USP; GenSys Associated Consultants; School of Agricultural and Veterinarian Sciences UNESP; FAPESP: 2017/10630-2; Universidade Estadual Paulista (Unesp); CNPq; UG; Select Sires Inc.; UQ; Universidade de São Paulo (USP); GenSys Associated Consultants; Fernandes Júnior, Gerardo A. [UNESP]; Carvalheiro, Roberto [UNESP]; de Oliveira, Henrique N. [UNESP]; Sargolzaei, Mehdi; Costilla, Roy; Ventura, Ricardo V.; Fonseca, Larissa F. S. [UNESP]; Neves, Haroldo H. R.; Hayes, Ben J.; de Albuquerque, Lucia G. [UNESP]; 2021-06-25T11:13:09Z; 2021-06-25T11:13:09Z; 2021-12-01; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; http://dx.doi.org/10.1186/s12711-021-00622-5; Genetics Selection Evolution, v. 53, n. 1, 2021.; 1297-9686; 0999-193X; http://hdl.handle.net/11449/208502; 10.1186/s12711-021-00622-5; 2-s2.0-85102493654; Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; Genetics Selection Evolution; info:eu-repo/semantics/openAccess; 2024-06-07T18:39:17Z; oai:repositorio.unesp.br:11449/208502; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-08-05T13:57:49.911897; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
author |
Fernandes Júnior, Gerardo A. [UNESP] |
author_facet |
Fernandes Júnior, Gerardo A. [UNESP]; Carvalheiro, Roberto [UNESP]; de Oliveira, Henrique N. [UNESP]; Sargolzaei, Mehdi; Costilla, Roy; Ventura, Ricardo V.; Fonseca, Larissa F. S. [UNESP]; Neves, Haroldo H. R.; Hayes, Ben J.; de Albuquerque, Lucia G. [UNESP] |
author_role |
author |
author2 |
Carvalheiro, Roberto [UNESP]; de Oliveira, Henrique N. [UNESP]; Sargolzaei, Mehdi; Costilla, Roy; Ventura, Ricardo V.; Fonseca, Larissa F. S. [UNESP]; Neves, Haroldo H. R.; Hayes, Ben J.; de Albuquerque, Lucia G. [UNESP] |
author2_role |
author author author author author author author author author |
dc.contributor.none.fl_str_mv |
Universidade Estadual Paulista (Unesp); CNPq; UG; Select Sires Inc.; UQ; Universidade de São Paulo (USP); GenSys Associated Consultants |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-06-25T11:13:09Z 2021-06-25T11:13:09Z 2021-12-01 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1186/s12711-021-00622-5 Genetics Selection Evolution, v. 53, n. 1, 2021. 1297-9686 0999-193X http://hdl.handle.net/11449/208502 10.1186/s12711-021-00622-5 2-s2.0-85102493654 |
url |
http://dx.doi.org/10.1186/s12711-021-00622-5 http://hdl.handle.net/11449/208502 |
identifier_str_mv |
Genetics Selection Evolution, v. 53, n. 1, 2021. 1297-9686 0999-193X 10.1186/s12711-021-00622-5 2-s2.0-85102493654 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Genetics Selection Evolution |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
Scopus reponame:Repositório Institucional da UNESP instname:Universidade Estadual Paulista (UNESP) instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
_version_ |
1808128295593574400 |