Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome

Detalhes bibliográficos
Autor(a) principal: Hermisdorff, Isis da Costa
Data de Publicação: 2020
Outros Autores: Costa, Raphael Bermal, de Albuquerque, Lucia Galvão [UNESP], Pausch, Hubert, Kadri, Naveen Kumar
Tipo de documento: Artigo
Idioma: eng
Título da fonte: Repositório Institucional da UNESP
Texto Completo: http://dx.doi.org/10.1186/s12864-020-07184-8
http://hdl.handle.net/11449/206800
Resumo: Background: Imputation accuracy among other things depends on the size of the reference panel, the marker’s minor allele frequency (MAF), and the correct placement of single nucleotide polymorphism (SNP) on the reference genome assembly. Using high-density genotypes of 3938 Nellore cattle from Brazil, we investigated the accuracy of imputation from 50 K to 777 K SNP density using Minimac3, when map positions were determined according to the bovine genome assemblies UMD3.1 and ARS-UCD1.2. We assessed the effect of reference and target panel sizes on the pre-phasing based imputation quality using ten-fold cross-validation. Further, we compared the reliability of the model-based imputation quality score (Rsq) from Minimac3 to the empirical imputation accuracy. Results: The overall accuracy of imputation measured as the squared correlation between true and imputed allele dosages (R2dose) was almost identical using either the UMD3.1 or ARS-UCD1.2 genome assembly. When the size of the reference panel increased from 250 to 2000, R2dose increased from 0.845 to 0.917, and the number of polymorphic markers in the imputed data set increased from 586,701 to 618,660. Advantages in both accuracy and marker density were also observed when larger target panels were imputed, likely resulting from more accurate haplotype inference. Imputation accuracy increased from 0.903 to 0.913, and the marker density in the imputed data increased from 593,239 to 595,570 when haplotypes were inferred in 500 and 2900 target animals. The model-based imputation quality scores from Minimac3 (Rsq) were systematically higher than empirically estimated accuracies. However, both metrics were positively correlated and the correlation increased with the size of the reference panel and MAF of imputed variants. Conclusions: Accurate imputation of BovineHD BeadChip markers is possible in Nellore cattle using the new bovine reference genome assembly ARS-UCD1.2. The use of large reference and target panels improves the accuracy of the imputed genotypes and provides genotypes for more markers segregating at low frequency for downstream genomic analyses. The model-based imputation quality score from Minimac3 (Rsq) can be used to detect poorly imputed variants but its reliability depends on the size of the reference panel and MAF of the imputed variants.
id UNSP_18cba1fecd0aa746838dbd3e04cbd3cb
oai_identifier_str oai:repositorio.unesp.br:11449/206800
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
spelling Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genomeBackground: Imputation accuracy among other things depends on the size of the reference panel, the marker’s minor allele frequency (MAF), and the correct placement of single nucleotide polymorphism (SNP) on the reference genome assembly. Using high-density genotypes of 3938 Nellore cattle from Brazil, we investigated the accuracy of imputation from 50 K to 777 K SNP density using Minimac3, when map positions were determined according to the bovine genome assemblies UMD3.1 and ARS-UCD1.2. We assessed the effect of reference and target panel sizes on the pre-phasing based imputation quality using ten-fold cross-validation. Further, we compared the reliability of the model-based imputation quality score (Rsq) from Minimac3 to the empirical imputation accuracy. Results: The overall accuracy of imputation measured as the squared correlation between true and imputed allele dosages (R2dose) was almost identical using either the UMD3.1 or ARS-UCD1.2 genome assembly. When the size of the reference panel increased from 250 to 2000, R2dose increased from 0.845 to 0.917, and the number of polymorphic markers in the imputed data set increased from 586,701 to 618,660. Advantages in both accuracy and marker density were also observed when larger target panels were imputed, likely resulting from more accurate haplotype inference. Imputation accuracy increased from 0.903 to 0.913, and the marker density in the imputed data increased from 593,239 to 595,570 when haplotypes were inferred in 500 and 2900 target animals. The model-based imputation quality scores from Minimac3 (Rsq) were systematically higher than empirically estimated accuracies. However, both metrics were positively correlated and the correlation increased with the size of the reference panel and MAF of imputed variants. Conclusions: Accurate imputation of BovineHD BeadChip markers is possible in Nellore cattle using the new bovine reference genome assembly ARS-UCD1.2. The use of large reference and target panels improves the accuracy of the imputed genotypes and provides genotypes for more markers segregating at low frequency for downstream genomic analyses. The model-based imputation quality score from Minimac3 (Rsq) can be used to detect poorly imputed variants but its reliability depends on the size of the reference panel and MAF of the imputed variants.Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)School of Veterinary Medicine and Animal Science Federal University of Bahia (UFBA)Animal Genomics ETH ZurichAnimal Science Department School of Agricultural and Veterinary Sciences São Paulo State University (Unesp), JaboticabalAnimal Science Department School of Agricultural and Veterinary Sciences São Paulo State University (Unesp), JaboticabalFAPESP: 2009/16118–5Universidade Federal da Bahia (UFBA)ETH ZurichUniversidade Estadual Paulista (Unesp)Hermisdorff, Isis da CostaCosta, Raphael Bermalde Albuquerque, Lucia Galvão [UNESP]Pausch, HubertKadri, Naveen Kumar2021-06-25T10:44:04Z2021-06-25T10:44:04Z2020-12-01info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articlehttp://dx.doi.org/10.1186/s12864-020-07184-8BMC Genomics, v. 21, n. 1, 2020.1471-2164http://hdl.handle.net/11449/20680010.1186/s12864-020-07184-82-s2.0-85095688967Scopusreponame:Repositório Institucional da UNESPinstname:Universidade Estadual Paulista (UNESP)instacron:UNESPengBMC Genomicsinfo:eu-repo/semantics/openAccess2021-10-23T15:25:04Zoai:repositorio.unesp.br:11449/206800Repositório InstitucionalPUBhttp://repositorio.unesp.br/oai/requestopendoar:29462021-10-23T15:25:04Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)false
dc.title.none.fl_str_mv Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome
title Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome
spellingShingle Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome
Hermisdorff, Isis da Costa
title_short Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome
title_full Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome
title_fullStr Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome
title_full_unstemmed Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome
title_sort Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome
author Hermisdorff, Isis da Costa
author_facet Hermisdorff, Isis da Costa
Costa, Raphael Bermal
de Albuquerque, Lucia Galvão [UNESP]
Pausch, Hubert
Kadri, Naveen Kumar
author_role author
author2 Costa, Raphael Bermal
de Albuquerque, Lucia Galvão [UNESP]
Pausch, Hubert
Kadri, Naveen Kumar
author2_role author
author
author
author
dc.contributor.none.fl_str_mv Universidade Federal da Bahia (UFBA)
ETH Zurich
Universidade Estadual Paulista (Unesp)
dc.contributor.author.fl_str_mv Hermisdorff, Isis da Costa
Costa, Raphael Bermal
de Albuquerque, Lucia Galvão [UNESP]
Pausch, Hubert
Kadri, Naveen Kumar
description Background: Imputation accuracy among other things depends on the size of the reference panel, the marker’s minor allele frequency (MAF), and the correct placement of single nucleotide polymorphism (SNP) on the reference genome assembly. Using high-density genotypes of 3938 Nellore cattle from Brazil, we investigated the accuracy of imputation from 50 K to 777 K SNP density using Minimac3, when map positions were determined according to the bovine genome assemblies UMD3.1 and ARS-UCD1.2. We assessed the effect of reference and target panel sizes on the pre-phasing based imputation quality using ten-fold cross-validation. Further, we compared the reliability of the model-based imputation quality score (Rsq) from Minimac3 to the empirical imputation accuracy. Results: The overall accuracy of imputation measured as the squared correlation between true and imputed allele dosages (R2dose) was almost identical using either the UMD3.1 or ARS-UCD1.2 genome assembly. When the size of the reference panel increased from 250 to 2000, R2dose increased from 0.845 to 0.917, and the number of polymorphic markers in the imputed data set increased from 586,701 to 618,660. Advantages in both accuracy and marker density were also observed when larger target panels were imputed, likely resulting from more accurate haplotype inference. Imputation accuracy increased from 0.903 to 0.913, and the marker density in the imputed data increased from 593,239 to 595,570 when haplotypes were inferred in 500 and 2900 target animals. The model-based imputation quality scores from Minimac3 (Rsq) were systematically higher than empirically estimated accuracies. However, both metrics were positively correlated and the correlation increased with the size of the reference panel and MAF of imputed variants. Conclusions: Accurate imputation of BovineHD BeadChip markers is possible in Nellore cattle using the new bovine reference genome assembly ARS-UCD1.2. The use of large reference and target panels improves the accuracy of the imputed genotypes and provides genotypes for more markers segregating at low frequency for downstream genomic analyses. The model-based imputation quality score from Minimac3 (Rsq) can be used to detect poorly imputed variants but its reliability depends on the size of the reference panel and MAF of the imputed variants.
publishDate 2020
dc.date.none.fl_str_mv 2020-12-01
2021-06-25T10:44:04Z
2021-06-25T10:44:04Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1186/s12864-020-07184-8
BMC Genomics, v. 21, n. 1, 2020.
1471-2164
http://hdl.handle.net/11449/206800
10.1186/s12864-020-07184-8
2-s2.0-85095688967
url http://dx.doi.org/10.1186/s12864-020-07184-8
http://hdl.handle.net/11449/206800
identifier_str_mv BMC Genomics, v. 21, n. 1, 2020.
1471-2164
10.1186/s12864-020-07184-8
2-s2.0-85095688967
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv BMC Genomics
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
instname_str Universidade Estadual Paulista (UNESP)
instacron_str UNESP
institution UNESP
reponame_str Repositório Institucional da UNESP
collection Repositório Institucional da UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
repository.mail.fl_str_mv
_version_ 1799965460596260864