TKGWV2: an ancient DNA relatedness pipeline for ultra-low coverage whole genome shotgun data

Bibliographic details
Main author: Fernandes, Daniel
Publication date: 2021
Other authors: Cheronet, Olivia; Gelabert, Pere; Pinhasi, Ron
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: http://hdl.handle.net/10316/105458
https://doi.org/10.1038/s41598-021-00581-3
Abstract: Estimation of genetically related individuals is playing an increasingly important role in the ancient DNA field. In recent years, the numbers of sequenced individuals from single sites have been increasing, reflecting a growing interest in understanding the familial and social organisation of ancient populations. Although a few different methods have been specifically developed for ancient DNA, namely to tackle issues such as low-coverage homozygous data, they require a 0.1-1× minimum average genomic coverage per analysed pair of individuals. Here we present an updated version of a method that enables estimates of 1st- and 2nd-degree relatedness with as little as 0.026× average coverage, or around 18,000 SNPs from 1.3 million aligned reads per sample with an average length of 62 bp, which is about four times less data than 0.1× coverage at similar read lengths. By using simulated data to estimate false positive error rates, we further show that a threshold even as low as 0.012×, or around 4,000 SNPs from 600,000 reads, will always show 1st-degree relationships as related. Lastly, by applying this method to published data, we are able to identify previously undocumented relationships using individuals that had been excluded from prior kinship analysis due to their very low coverage. This methodological improvement has the potential to enable relatedness estimation on ancient whole genome shotgun data during routine low-coverage screening, and therefore improve project management when decisions need to be made on which individuals are to be further sequenced.
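The coverage figures quoted in the abstract can be checked with simple arithmetic: mean coverage is approximately the number of aligned reads times the mean read length, divided by the genome length. The sketch below assumes a human genome length of about 3.1 Gbp, a value this record does not state; `GENOME_LENGTH_BP` and `mean_coverage` are illustrative names and are not part of TKGWV2.

```python
# Rough check of the coverage figures quoted in the abstract.
# ASSUMPTION: human genome length of ~3.1 Gbp (not given in the record).
GENOME_LENGTH_BP = 3.1e9


def mean_coverage(aligned_reads: int, mean_read_length_bp: float,
                  genome_length_bp: float = GENOME_LENGTH_BP) -> float:
    """Approximate mean depth of coverage: total aligned bases / genome length."""
    return aligned_reads * mean_read_length_bp / genome_length_bp


# 1.3 million reads of 62 bp: the minimum quoted for 1st/2nd-degree estimates.
cov_min = mean_coverage(1_300_000, 62)
# 600,000 reads of 62 bp: the lower threshold quoted for 1st-degree detection.
cov_low = mean_coverage(600_000, 62)

print(f"1.3M reads x 62 bp: {cov_min:.3f}x")
print(f"600k reads x 62 bp: {cov_low:.3f}x")
# Ratio behind the "four times less data than 0.1x" statement.
print(f"0.1x / minimum coverage: {0.1 / cov_min:.2f}")
```

Under that genome-length assumption, the two read counts land on the 0.026× and 0.012× values given in the abstract, and 0.1× is a little under four times the lower of the two recommended minimums.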
id RCAP_dc5496117ada88736e3c6eee4195cbb1
oai_identifier_str oai:estudogeral.uc.pt:10316/105458
dc.subject.por.fl_str_mv Algorithms
Computational Biology
Databases, Genetic
Genetic Variation
Humans
Models, Genetic
Polymorphism, Single Nucleotide
Alleles
DNA, Ancient
Genome, Human
Genomics
dc.date.none.fl_str_mv 2021-10-28
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.relation.none.fl_str_mv 2045-2322
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.publisher.none.fl_str_mv Springer Nature
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP