Imputation accuracy to whole-genome sequence in Nellore cattle

Bibliographic details
Main author: Fernandes Júnior, Gerardo A. [UNESP]
Publication date: 2021
Other authors: Carvalheiro, Roberto [UNESP], de Oliveira, Henrique N. [UNESP], Sargolzaei, Mehdi, Costilla, Roy, Ventura, Ricardo V., Fonseca, Larissa F. S. [UNESP], Neves, Haroldo H. R., Hayes, Ben J., de Albuquerque, Lucia G. [UNESP]
Document type: Article
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1186/s12711-021-00622-5
http://hdl.handle.net/11449/208502
Abstract: Background: A cost-effective strategy to explore the complete DNA sequence in animals for genetic evaluation purposes is to sequence key ancestors of a population and then use imputation to infer marker genotypes that were not originally reported in a target population of animals genotyped with single nucleotide polymorphism (SNP) panels. The feasibility of this process relies on the accuracy of genotype imputation in that population, particularly for potential causal mutations, which may be at low frequency and located within either genes or regulatory regions. The objective of the present study was to investigate the accuracy of imputation to the sequence level in a Nellore beef cattle population, including the accuracy for variants in annotation classes that are more likely to be functional. Methods: Information from 151 key sequenced Nellore sires was used to assess the imputation accuracy from the bovine HD BeadChip SNP panel (~777k) to whole-genome sequence. The sires were chosen to optimize the imputation accuracy of a genotypic database composed of about 10,000 genotyped Nellore animals. Genotype imputation was performed using two computational approaches: FImpute3 and Minimac4 (after using Eagle for phasing). The accuracy of imputation was evaluated using a fivefold cross-validation scheme and measured by the squared correlation between observed and imputed genotypes, calculated by individual and by SNP. SNPs were classified into a range of annotations, and the accuracy of imputation within each annotation class was also evaluated. Results: High average imputation accuracies per animal were achieved using both FImpute3 (0.94) and Minimac4 (0.95). On average, common variants (minor allele frequency (MAF) > 0.03) were more accurately imputed by Minimac4, and low-frequency variants (MAF ≤ 0.03) were more accurately imputed by FImpute3. The inherent Minimac4 Rsq imputation quality statistic appears to be a good indicator of the empirical Minimac4 imputation accuracy. Both programs provided high average SNP-wise imputation accuracy for all classes of biological annotations. Conclusions: Our results indicate that imputation to whole-genome sequence is feasible in Nellore beef cattle, since high imputation accuracies per individual are expected. SNP-wise imputation accuracy is software-dependent, especially for rare variants. The accuracy of imputation appears to be relatively independent of annotation classification.
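The accuracy measure described in the abstract (squared correlation between observed and imputed genotypes, computed per individual and per SNP under fivefold cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the 0/1/2 allele-dosage coding, the random assignment of sires to folds, the toy genotype matrices, and all function names are assumptions made for the example, and neither FImpute3 nor Minimac4 is called.

```python
import random


def squared_correlation(observed, imputed):
    """Squared Pearson correlation between two equal-length genotype vectors."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_i = sum(imputed) / n
    cov = sum((o - mean_o) * (i - mean_i) for o, i in zip(observed, imputed))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    if var_o == 0 or var_i == 0:
        return None  # undefined when either vector is constant
    return cov * cov / (var_o * var_i)


def accuracy_by_animal(observed, imputed):
    """One r^2 per row (animal); rows are animals, columns are SNPs."""
    return [squared_correlation(o, i) for o, i in zip(observed, imputed)]


def accuracy_by_snp(observed, imputed):
    """One r^2 per column (SNP), obtained by transposing both matrices."""
    return [squared_correlation(o, i)
            for o, i in zip(zip(*observed), zip(*imputed))]


def minor_allele_frequency(genotypes):
    """MAF of one SNP from 0/1/2 dosages (assumed coding)."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def five_folds(n_animals, seed=0):
    """Randomly split animal indices into five validation folds (assumed scheme)."""
    indices = list(range(n_animals))
    random.Random(seed).shuffle(indices)
    return [indices[k::5] for k in range(5)]


# Toy data: 4 animals x 4 SNPs, with two imputation errors introduced by hand.
observed = [[0, 1, 2, 0], [1, 1, 2, 0], [2, 0, 1, 1], [0, 2, 0, 1]]
imputed = [[0, 1, 2, 0], [1, 1, 1, 0], [2, 0, 1, 1], [0, 2, 0, 0]]

print("r^2 by animal:", accuracy_by_animal(observed, imputed))
print("r^2 by SNP:   ", accuracy_by_snp(observed, imputed))
print("fold sizes for 151 sires:", [len(f) for f in five_folds(151)])
```

In a cross-validation run, the genotypes of the sires in one fold would be masked down to the HD panel, imputed from the remaining four folds, and compared with their sequence genotypes using the functions above. Grouping the per-SNP values by `minor_allele_frequency` (for example, above or at most 0.03) would give the kind of common versus low-frequency comparison the abstract reports.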
id UNSP_8f6e0a0e38d6e7cc43fed0ae247b0e55
oai_identifier_str oai:repositorio.unesp.br:11449/208502
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
Funding: Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP); Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). FAPESP: 2017/10630-2
Affiliations: School of Agricultural and Veterinarian Sciences UNESP; National Council for Scientific and Technological Development CNPq; Ontario Veterinary College UG; Select Sires Inc.; Queensland Alliance for Agriculture and Food Innovation UQ; School of Veterinary Medicine and Animal Science USP; GenSys Associated Consultants
OAI endpoint: http://repositorio.unesp.br/oai/request (opendoar:2946); record timestamps: 2024-06-07T18:39:17Z, 2024-08-05T13:57:49.911897
dc.contributor.none.fl_str_mv Universidade Estadual Paulista (Unesp)
CNPq
UG
Select Sires Inc.
UQ
Universidade de São Paulo (USP)
GenSys Associated Consultants
publishDate 2021
dc.date.none.fl_str_mv 2021-06-25T11:13:09Z
2021-06-25T11:13:09Z
2021-12-01
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1186/s12711-021-00622-5
Genetics Selection Evolution, v. 53, n. 1, 2021.
1297-9686
0999-193X
http://hdl.handle.net/11449/208502
10.1186/s12711-021-00622-5
2-s2.0-85102493654
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Genetics Selection Evolution
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
_version_ 1808128295593574400