Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome
Autor(a) principal: | |
---|---|
Data de Publicação: | 2020 |
Outros Autores: | , , , |
Tipo de documento: | Artigo |
Idioma: | eng |
Título da fonte: | Repositório Institucional da UNESP |
Texto Completo: | http://dx.doi.org/10.1186/s12864-020-07184-8 http://hdl.handle.net/11449/206800 |
Resumo: | Background: Imputation accuracy among other things depends on the size of the reference panel, the marker’s minor allele frequency (MAF), and the correct placement of single nucleotide polymorphism (SNP) on the reference genome assembly. Using high-density genotypes of 3938 Nellore cattle from Brazil, we investigated the accuracy of imputation from 50 K to 777 K SNP density using Minimac3, when map positions were determined according to the bovine genome assemblies UMD3.1 and ARS-UCD1.2. We assessed the effect of reference and target panel sizes on the pre-phasing based imputation quality using ten-fold cross-validation. Further, we compared the reliability of the model-based imputation quality score (Rsq) from Minimac3 to the empirical imputation accuracy. Results: The overall accuracy of imputation measured as the squared correlation between true and imputed allele dosages (R2dose) was almost identical using either the UMD3.1 or ARS-UCD1.2 genome assembly. When the size of the reference panel increased from 250 to 2000, R2dose increased from 0.845 to 0.917, and the number of polymorphic markers in the imputed data set increased from 586,701 to 618,660. Advantages in both accuracy and marker density were also observed when larger target panels were imputed, likely resulting from more accurate haplotype inference. Imputation accuracy increased from 0.903 to 0.913, and the marker density in the imputed data increased from 593,239 to 595,570 when haplotypes were inferred in 500 and 2900 target animals. The model-based imputation quality scores from Minimac3 (Rsq) were systematically higher than empirically estimated accuracies. However, both metrics were positively correlated and the correlation increased with the size of the reference panel and MAF of imputed variants. Conclusions: Accurate imputation of BovineHD BeadChip markers is possible in Nellore cattle using the new bovine reference genome assembly ARS-UCD1.2. The use of large reference and target panels improves the accuracy of the imputed genotypes and provides genotypes for more markers segregating at low frequency for downstream genomic analyses. The model-based imputation quality score from Minimac3 (Rsq) can be used to detect poorly imputed variants but its reliability depends on the size of the reference panel and MAF of the imputed variants. |
id |
UNSP_18cba1fecd0aa746838dbd3e04cbd3cb |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/206800 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genomeBackground: Imputation accuracy among other things depends on the size of the reference panel, the marker’s minor allele frequency (MAF), and the correct placement of single nucleotide polymorphism (SNP) on the reference genome assembly. Using high-density genotypes of 3938 Nellore cattle from Brazil, we investigated the accuracy of imputation from 50 K to 777 K SNP density using Minimac3, when map positions were determined according to the bovine genome assemblies UMD3.1 and ARS-UCD1.2. We assessed the effect of reference and target panel sizes on the pre-phasing based imputation quality using ten-fold cross-validation. Further, we compared the reliability of the model-based imputation quality score (Rsq) from Minimac3 to the empirical imputation accuracy. Results: The overall accuracy of imputation measured as the squared correlation between true and imputed allele dosages (R2dose) was almost identical using either the UMD3.1 or ARS-UCD1.2 genome assembly. When the size of the reference panel increased from 250 to 2000, R2dose increased from 0.845 to 0.917, and the number of polymorphic markers in the imputed data set increased from 586,701 to 618,660. Advantages in both accuracy and marker density were also observed when larger target panels were imputed, likely resulting from more accurate haplotype inference. Imputation accuracy increased from 0.903 to 0.913, and the marker density in the imputed data increased from 593,239 to 595,570 when haplotypes were inferred in 500 and 2900 target animals. The model-based imputation quality scores from Minimac3 (Rsq) were systematically higher than empirically estimated accuracies. However, both metrics were positively correlated and the correlation increased with the size of the reference panel and MAF of imputed variants. Conclusions: Accurate imputation of BovineHD BeadChip markers is possible in Nellore cattle using the new bovine reference genome assembly ARS-UCD1.2. The use of large reference and target panels improves the accuracy of the imputed genotypes and provides genotypes for more markers segregating at low frequency for downstream genomic analyses. The model-based imputation quality score from Minimac3 (Rsq) can be used to detect poorly imputed variants but its reliability depends on the size of the reference panel and MAF of the imputed variants.Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)School of Veterinary Medicine and Animal Science Federal University of Bahia (UFBA)Animal Genomics ETH ZurichAnimal Science Department School of Agricultural and Veterinary Sciences São Paulo State University (Unesp), JaboticabalAnimal Science Department School of Agricultural and Veterinary Sciences São Paulo State University (Unesp), JaboticabalFAPESP: 2009/16118–5Universidade Federal da Bahia (UFBA)ETH ZurichUniversidade Estadual Paulista (Unesp)Hermisdorff, Isis da CostaCosta, Raphael Bermalde Albuquerque, Lucia Galvão [UNESP]Pausch, HubertKadri, Naveen Kumar2021-06-25T10:44:04Z2021-06-25T10:44:04Z2020-12-01info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articlehttp://dx.doi.org/10.1186/s12864-020-07184-8BMC Genomics, v. 21, n. 1, 2020.1471-2164http://hdl.handle.net/11449/20680010.1186/s12864-020-07184-82-s2.0-85095688967Scopusreponame:Repositório Institucional da UNESPinstname:Universidade Estadual Paulista (UNESP)instacron:UNESPengBMC Genomicsinfo:eu-repo/semantics/openAccess2021-10-23T15:25:04Zoai:repositorio.unesp.br:11449/206800Repositório InstitucionalPUBhttp://repositorio.unesp.br/oai/requestopendoar:29462021-10-23T15:25:04Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)false |
dc.title.none.fl_str_mv |
Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome |
title |
Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome |
spellingShingle |
Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome Hermisdorff, Isis da Costa |
title_short |
Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome |
title_full |
Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome |
title_fullStr |
Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome |
title_full_unstemmed |
Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome |
title_sort |
Investigating the accuracy of imputing autosomal variants in Nellore cattle using the ARS-UCD1.2 assembly of the bovine genome |
author |
Hermisdorff, Isis da Costa |
author_facet |
Hermisdorff, Isis da Costa Costa, Raphael Bermal de Albuquerque, Lucia Galvão [UNESP] Pausch, Hubert Kadri, Naveen Kumar |
author_role |
author |
author2 |
Costa, Raphael Bermal de Albuquerque, Lucia Galvão [UNESP] Pausch, Hubert Kadri, Naveen Kumar |
author2_role |
author author author author |
dc.contributor.none.fl_str_mv |
Universidade Federal da Bahia (UFBA) ETH Zurich Universidade Estadual Paulista (Unesp) |
dc.contributor.author.fl_str_mv |
Hermisdorff, Isis da Costa Costa, Raphael Bermal de Albuquerque, Lucia Galvão [UNESP] Pausch, Hubert Kadri, Naveen Kumar |
description |
Background: Imputation accuracy among other things depends on the size of the reference panel, the marker’s minor allele frequency (MAF), and the correct placement of single nucleotide polymorphism (SNP) on the reference genome assembly. Using high-density genotypes of 3938 Nellore cattle from Brazil, we investigated the accuracy of imputation from 50 K to 777 K SNP density using Minimac3, when map positions were determined according to the bovine genome assemblies UMD3.1 and ARS-UCD1.2. We assessed the effect of reference and target panel sizes on the pre-phasing based imputation quality using ten-fold cross-validation. Further, we compared the reliability of the model-based imputation quality score (Rsq) from Minimac3 to the empirical imputation accuracy. Results: The overall accuracy of imputation measured as the squared correlation between true and imputed allele dosages (R2dose) was almost identical using either the UMD3.1 or ARS-UCD1.2 genome assembly. When the size of the reference panel increased from 250 to 2000, R2dose increased from 0.845 to 0.917, and the number of polymorphic markers in the imputed data set increased from 586,701 to 618,660. Advantages in both accuracy and marker density were also observed when larger target panels were imputed, likely resulting from more accurate haplotype inference. Imputation accuracy increased from 0.903 to 0.913, and the marker density in the imputed data increased from 593,239 to 595,570 when haplotypes were inferred in 500 and 2900 target animals. The model-based imputation quality scores from Minimac3 (Rsq) were systematically higher than empirically estimated accuracies. However, both metrics were positively correlated and the correlation increased with the size of the reference panel and MAF of imputed variants. Conclusions: Accurate imputation of BovineHD BeadChip markers is possible in Nellore cattle using the new bovine reference genome assembly ARS-UCD1.2. The use of large reference and target panels improves the accuracy of the imputed genotypes and provides genotypes for more markers segregating at low frequency for downstream genomic analyses. The model-based imputation quality score from Minimac3 (Rsq) can be used to detect poorly imputed variants but its reliability depends on the size of the reference panel and MAF of the imputed variants. |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-12-01 2021-06-25T10:44:04Z 2021-06-25T10:44:04Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1186/s12864-020-07184-8 BMC Genomics, v. 21, n. 1, 2020. 1471-2164 http://hdl.handle.net/11449/206800 10.1186/s12864-020-07184-8 2-s2.0-85095688967 |
url |
http://dx.doi.org/10.1186/s12864-020-07184-8 http://hdl.handle.net/11449/206800 |
identifier_str_mv |
BMC Genomics, v. 21, n. 1, 2020. 1471-2164 10.1186/s12864-020-07184-8 2-s2.0-85095688967 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
BMC Genomics |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
Scopus reponame:Repositório Institucional da UNESP instname:Universidade Estadual Paulista (UNESP) instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv |
|
_version_ |
1799965460596260864 |