Single nucleotide polymorphism calling and imputation strategies for cost-effective genotyping in a tropical maize breeding program.

Bibliographic details
Main author: OLIVEIRA, A. A. de
Publication date: 2020
Other authors: GUIMARAES, L. J. M., GUIMARÃES, C. T., GUIMARAES, P. E. de O., PINTO, M. de O., PASTINA, M. M., MARGARIDO, G. R. A.
Document type: Article
Language: English
Source title: Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
Full text: http://www.alice.cnptia.embrapa.br/alice/handle/doc/1125019
Abstract: Genotyping-by-sequencing (GBS) datasets typically feature high rates of missingness and heterozygote undercalling, prompting the use of data imputation. We compared the accuracy of four imputation methods, namely NPUTE, Beagle, k-nearest neighbors imputation (KNNI), and fast inbred line library imputation (FILLIN), using GBS data of maize (Zea mays L.) inbred lines genotyped at different multiplexing levels. Two strategies for SNP calling and two for genotype imputation were evaluated. In the first SNP-calling strategy, only lines genotyped through 96-plex were used for single nucleotide polymorphism (SNP) discovery, whereas in the second strategy both 96- and 384-plex lines were used simultaneously. In the first genotype imputation strategy, only the 96-plex lines were imputed, then the remaining lines were appended (96-plex-imputed plus 384-plex) and the combined set was imputed. In the second imputation strategy, we jointly imputed both datasets. We also investigated the impacts of including heterozygous genotypes and of distinct rates of missing genotypes per locus. The different SNP-calling strategies and percentages of missing data did not substantially affect imputation accuracy. However, the different imputation strategies showed a substantial effect. Generally, imputations were less accurate for heterozygotes. The 96-plex-imputed plus 384-plex scenario showed accuracies similar to the 96-plex scenario. Beagle and NPUTE produced the highest accuracies. Our results indicate that combining SNP-calling and imputation strategies can enhance genotyping in a cost-effective manner, resulting in higher imputation accuracies.
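The abstract does not state how imputation accuracy was computed. One common way to estimate it is to hide genotype calls that are already known, impute them, and count how many are recovered. The sketch below illustrates that idea with a deliberately simple nearest-neighbor imputer on toy data. It is not the KNNI implementation or the evaluation pipeline used in the article. The 0/1/2 genotype coding (1 being a heterozygote), the function names `distance`, `knn_impute`, and `masked_accuracy`, and the toy matrix are all assumptions made for illustration.

```python
from collections import Counter


def distance(a, b):
    """Mean absolute genotype difference over loci observed in both lines."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")  # no overlapping calls: treat as maximally distant
    return sum(abs(x - y) for x, y in shared) / len(shared)


def knn_impute(genotypes, k=2):
    """Fill each missing call (None) with the most common call among the
    k nearest lines that have an observed call at that locus."""
    imputed = [row[:] for row in genotypes]
    for i, row in enumerate(genotypes):
        for j, call in enumerate(row):
            if call is not None:
                continue
            # Rank the other lines that carry an observed call at locus j.
            neighbors = sorted(
                (distance(row, other), n)
                for n, other in enumerate(genotypes)
                if n != i and other[j] is not None
            )[:k]
            if neighbors:
                votes = Counter(genotypes[n][j] for _, n in neighbors)
                imputed[i][j] = votes.most_common(1)[0][0]
    return imputed


def masked_accuracy(genotypes, masked_cells, k=2):
    """Hide known calls, impute them, and return the fraction recovered."""
    truth = {(i, j): genotypes[i][j] for i, j in masked_cells}
    masked = [row[:] for row in genotypes]
    for i, j in masked_cells:
        masked[i][j] = None
    imputed = knn_impute(masked, k=k)
    hits = sum(imputed[i][j] == truth[(i, j)] for i, j in masked_cells)
    return hits / len(masked_cells)


# Toy data (hypothetical): rows are inbred lines, columns are SNP loci,
# values are alternate-allele counts, None marks a missing call.
lines = [
    [0, 0, 2, 2, 0],
    [0, 0, 2, 2, 0],
    [0, 0, 2, None, 0],
    [2, 2, 0, 0, 2],
    [2, 2, 0, 0, 2],
    [2, 2, 0, 0, 2],
]

accuracy = masked_accuracy(lines, [(0, 2), (3, 0)], k=2)
print(accuracy)
```

Computing the same score separately for masked homozygous and heterozygous calls would be one way to examine the abstract's observation that heterozygotes are imputed less accurately.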
id EMBR_a25b5d7947a4d9e820a02cac99dfd1e7
oai_identifier_str oai:www.alice.cnptia.embrapa.br:doc/1125019
network_acronym_str EMBR
network_name_str Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
repository_id_str 2154
Contributors:
Amanda Avelar de Oliveira, Escola Superior de Agricultura "Luiz de Queiroz"
LAURO JOSE MOREIRA GUIMARAES, CNPMS
CLAUDIA TEIXEIRA GUIMARAES, CNPMS
PAULO EVARISTO DE O GUIMARAES, CNPMS
MARCOS DE OLIVEIRA PINTO, CNPMS
MARIA MARTA PASTINA, CNPMS
Gabriel Rodrigues Alves Margarido, Escola Superior de Agricultura "Luiz de Queiroz"
Subjects:
Genotyping
Imputation
Plant genetics
Genotypic selection
Genotype
Plant breeding
Maize
Polymorphism
publishDate 2020
dc.date.none.fl_str_mv 2020-11-12T14:20:26Z
2020-11-12T14:20:26Z
2020-09-18
2020
dc.type.driver.fl_str_mv info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv Crop Science, v. 60, n. 6, p. 3066-3082, 2020.
http://www.alice.cnptia.embrapa.br/alice/handle/doc/1125019
10.1002/csc2.20255
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Institucional da EMBRAPA (Repository Open Access to Scientific Information from EMBRAPA - Alice)
instname:Empresa Brasileira de Pesquisa Agropecuária (Embrapa)
instacron:EMBRAPA
repository.mail.fl_str_mv cg-riaa@embrapa.br
_version_ 1794503497585000448