Single nucleotide polymorphism calling and imputation strategies for cost-effective genotyping in a tropical maize breeding program.
Main author: | |
---|---|
Publication date: | 2020 |
Other authors: | , , , , , |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
Full text: | http://www.alice.cnptia.embrapa.br/alice/handle/doc/1125019 |
Abstract: | Genotyping-by-sequencing (GBS) datasets typically feature high rates of missingness and heterozygote undercalling, prompting the use of data imputation. We compared the accuracy of four imputation methods, namely NPUTE, Beagle, k-nearest neighbors imputation (KNNI), and fast inbred line library imputation (FILLIN), using GBS data of maize (Zea mays L.) inbred lines genotyped at different multiplexing levels. Two strategies for SNP calling and two for genotype imputation were evaluated. In the first SNP-calling strategy, only lines genotyped through 96-plex were used for single nucleotide polymorphism (SNP) discovery, whereas in the second strategy both 96- and 384-plex lines were used simultaneously. In the first genotype imputation strategy, only the 96-plex lines were imputed, then the remaining lines were appended (96-plex-imputed plus 384-plex) and the combined set was imputed. In the second imputation strategy, we jointly imputed both datasets. We also investigated the impacts of including heterozygous genotypes and of distinct rates of missing genotypes per locus. The different SNP-calling strategies and percentages of missing data did not substantially affect the imputation accuracy. However, the different imputation strategies showed a substantial effect. Generally, imputations were less accurate for heterozygotes. The scenario 96-plex-imputed plus 384-plex showed accuracies similar to the 96-plex scenario. Beagle and NPUTE produced the highest accuracies. Our results indicate that combining SNP-calling and imputation strategies can enhance genotyping in a cost-effective manner, resulting in higher imputation accuracies. |
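
A common way to estimate imputation accuracy is to hide a share of the known genotype calls, impute them, and compare the imputed calls with the hidden truth. The sketch below illustrates that idea only; it is not the authors' pipeline and uses none of the four tools named in the abstract. The genotype coding (0 and 2 for the two homozygotes, 1 for heterozygotes, -1 for missing), the function names, and the simple majority-vote nearest-neighbor rule are all assumptions made for this example.

```python
import random
from collections import Counter

MISSING = -1  # assumed coding: 0/2 = homozygous, 1 = heterozygous, -1 = missing


def distance(a, b):
    """Share of mismatching calls over loci observed in both lines (1.0 if none)."""
    shared = [(x, y) for x, y in zip(a, b) if x != MISSING and y != MISSING]
    if not shared:
        return 1.0
    return sum(x != y for x, y in shared) / len(shared)


def knn_impute(genotypes, k=3):
    """Fill each missing call with the most common call among the k nearest lines."""
    imputed = [row[:] for row in genotypes]
    for i, row in enumerate(genotypes):
        # Rank all other lines by similarity to the current line.
        neighbors = sorted(
            (j for j in range(len(genotypes)) if j != i),
            key=lambda j: distance(row, genotypes[j]),
        )
        for locus, call in enumerate(row):
            if call != MISSING:
                continue
            # Keep the k closest lines that have an observed call at this locus.
            votes = [genotypes[j][locus] for j in neighbors
                     if genotypes[j][locus] != MISSING][:k]
            if votes:
                imputed[i][locus] = Counter(votes).most_common(1)[0][0]
    return imputed


def mask_and_score(truth, rate=0.2, k=3, seed=0):
    """Hide known calls at the given rate, impute them, and report accuracy."""
    rng = random.Random(seed)
    masked = [row[:] for row in truth]
    hidden = []
    for i, row in enumerate(truth):
        for locus, call in enumerate(row):
            if call != MISSING and rng.random() < rate:
                masked[i][locus] = MISSING
                hidden.append((i, locus))
    imputed = knn_impute(masked, k)

    def accuracy(cells):
        if not cells:
            return None
        return sum(imputed[i][l] == truth[i][l] for i, l in cells) / len(cells)

    het_cells = [(i, l) for i, l in hidden if truth[i][l] == 1]
    return {
        "overall": accuracy(hidden),
        "heterozygous": accuracy(het_cells),
        "n_masked": len(hidden),
    }
```

Reporting accuracy separately for masked heterozygous calls mirrors the abstract's observation that heterozygotes are imputed less accurately; a call that cannot be imputed at all is counted as wrong.
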
id |
EMBR_a25b5d7947a4d9e820a02cac99dfd1e7 |
---|---|
oai_identifier_str |
oai:www.alice.cnptia.embrapa.br:doc/1125019 |
network_acronym_str |
EMBR |
network_name_str |
Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
repository_id_str |
2154 |
dc.title.none.fl_str_mv |
Single nucleotide polymorphism calling and imputation strategies for cost-effective genotyping in a tropical maize breeding program. |
title |
Single nucleotide polymorphism calling and imputation strategies for cost-effective genotyping in a tropical maize breeding program. |
author |
OLIVEIRA, A. A. de |
author_facet |
OLIVEIRA, A. A. de; GUIMARAES, L. J. M.; GUIMARÃES, C. T.; GUIMARAES, P. E. de O.; PINTO, M. de O.; PASTINA, M. M.; MARGARIDO, G. R. A. |
author_role |
author |
author2 |
GUIMARAES, L. J. M.; GUIMARÃES, C. T.; GUIMARAES, P. E. de O.; PINTO, M. de O.; PASTINA, M. M.; MARGARIDO, G. R. A. |
author2_role |
author author author author author author |
dc.contributor.none.fl_str_mv |
Amanda Avelar de Oliveira, Escola Superior de Agricultura "Luiz de Queiroz"; LAURO JOSE MOREIRA GUIMARAES, CNPMS; CLAUDIA TEIXEIRA GUIMARAES, CNPMS; PAULO EVARISTO DE O GUIMARAES, CNPMS; MARCOS DE OLIVEIRA PINTO, CNPMS; MARIA MARTA PASTINA, CNPMS; Gabriel Rodrigues Alves Margarido, Escola Superior de Agricultura "Luiz de Queiroz". |
dc.contributor.author.fl_str_mv |
OLIVEIRA, A. A. de; GUIMARAES, L. J. M.; GUIMARÃES, C. T.; GUIMARAES, P. E. de O.; PINTO, M. de O.; PASTINA, M. M.; MARGARIDO, G. R. A. |
dc.subject.por.fl_str_mv |
Genotipagem; Imputação; Genética Vegetal; Seleção Genótipa; Genótipo; Melhoramento Genético Vegetal; Milho; Polimorfismo |
topic |
Genotyping; Imputation; Plant genetics; Genotypic selection; Genotype; Plant breeding; Maize; Polymorphism |
description |
Genotyping-by-sequencing (GBS) datasets typically feature high rates of missingness and heterozygote undercalling, prompting the use of data imputation. We compared the accuracy of four imputation methods, namely NPUTE, Beagle, k-nearest neighbors imputation (KNNI), and fast inbred line library imputation (FILLIN), using GBS data of maize (Zea mays L.) inbred lines genotyped at different multiplexing levels. Two strategies for SNP calling and two for genotype imputation were evaluated. In the first SNP-calling strategy, only lines genotyped through 96-plex were used for single nucleotide polymorphism (SNP) discovery, whereas in the second strategy both 96- and 384-plex lines were used simultaneously. In the first genotype imputation strategy, only the 96-plex lines were imputed, then the remaining lines were appended (96-plex-imputed plus 384-plex) and the combined set was imputed. In the second imputation strategy, we jointly imputed both datasets. We also investigated the impacts of including heterozygous genotypes and of distinct rates of missing genotypes per locus. The different SNP-calling strategies and percentages of missing data did not substantially affect the imputation accuracy. However, the different imputation strategies showed a substantial effect. Generally, imputations were less accurate for heterozygotes. The scenario 96-plex-imputed plus 384-plex showed accuracies similar to the 96-plex scenario. Beagle and NPUTE produced the highest accuracies. Our results indicate that combining SNP-calling and imputation strategies can enhance genotyping in a cost-effective manner, resulting in higher imputation accuracies. |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-11-12T14:20:26Z 2020-11-12T14:20:26Z 2020-09-18 2020 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
Crop Science, v. 60, n. 6, p. 3066-3082, 2020. http://www.alice.cnptia.embrapa.br/alice/handle/doc/1125019 10.1002/csc2.20255 |
identifier_str_mv |
Crop Science, v. 60, n. 6, p. 3066-3082, 2020. 10.1002/csc2.20255 |
url |
http://www.alice.cnptia.embrapa.br/alice/handle/doc/1125019 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) instname:Empresa Brasileira de Pesquisa Agropecuária (Embrapa) instacron:EMBRAPA |
instname_str |
Empresa Brasileira de Pesquisa Agropecuária (Embrapa) |
instacron_str |
EMBRAPA |
institution |
EMBRAPA |
reponame_str |
Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
collection |
Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) |
repository.name.fl_str_mv |
Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice) - Empresa Brasileira de Pesquisa Agropecuária (Embrapa) |
repository.mail.fl_str_mv |
cg-riaa@embrapa.br |
_version_ |
1794503497585000448 |