TKGWV2: an ancient DNA relatedness pipeline for ultra-low coverage whole genome shotgun data
Main author: | Fernandes, Daniel |
---|---|
Publication date: | 2021 |
Other authors: | Cheronet, Olivia; Gelabert, Pere; Pinhasi, Ron |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10316/105458 https://doi.org/10.1038/s41598-021-00581-3 |
Abstract: | Estimation of genetically related individuals is playing an increasingly important role in the ancient DNA field. In recent years, the numbers of sequenced individuals from single sites have been increasing, reflecting a growing interest in understanding the familial and social organisation of ancient populations. Although a few different methods have been specifically developed for ancient DNA, namely to tackle issues such as low-coverage homozygous data, they require a 0.1-1× minimum average genomic coverage per analysed pair of individuals. Here we present an updated version of a method that enables estimates of 1st and 2nd-degrees of relatedness with as little as 0.026× average coverage, or around 18,000 SNPs from 1.3 million aligned reads per sample with average length of 62 bp, four times less data than 0.1× coverage at similar read lengths. By using simulated data to estimate false positive error rates, we further show that a threshold even as low as 0.012×, or around 4000 SNPs from 600,000 reads, will always show 1st-degree relationships as related. Lastly, by applying this method to published data, we are able to identify previously undocumented relationships using individuals that had been excluded from prior kinship analysis due to their very low coverage. This methodological improvement has the potential to enable relatedness estimation on ancient whole genome shotgun data during routine low-coverage screening, and therefore improve project management when decisions need to be made on which individuals are to be further sequenced. |
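
The coverage figures quoted in the abstract can be checked with a short calculation. The sketch below is illustrative only and is not part of the TKGWV2 pipeline: the function names are hypothetical, and the genome length of about 3.1 Gbp is an assumption that the abstract does not state. Under that assumption, 1.3 million reads of 62 bp give roughly 0.026× and 600,000 reads give roughly 0.012×, while 0.1× would need about 5 million reads, close to four times the 1.3 million quoted.

```python
# Assumption (not stated in the abstract): a human genome length of ~3.1 Gbp.
GENOME_LENGTH_BP = 3.1e9


def average_coverage(aligned_reads, mean_read_length_bp,
                     genome_length_bp=GENOME_LENGTH_BP):
    """Average depth of coverage: total aligned bases divided by genome length."""
    return aligned_reads * mean_read_length_bp / genome_length_bp


def reads_for_coverage(target_coverage, mean_read_length_bp,
                       genome_length_bp=GENOME_LENGTH_BP):
    """Number of aligned reads needed to reach a target average coverage."""
    return target_coverage * genome_length_bp / mean_read_length_bp


if __name__ == "__main__":
    # Figures taken from the abstract: read counts and a 62 bp mean read length.
    for reads in (1_300_000, 600_000):
        print(f"{reads:>9,} reads of 62 bp: {average_coverage(reads, 62):.3f}x")

    # Reads needed for 0.1x at the same read length, relative to 1.3 million.
    needed = reads_for_coverage(0.1, 62)
    print(f"reads for 0.1x: {needed:,.0f} ({needed / 1_300_000:.2f} times 1.3 million)")
```

A different genome-length assumption shifts all three values proportionally, so the ratio between them is unaffected.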
id | RCAP_dc5496117ada88736e3c6eee4195cbb1 |
---|---|
oai_identifier_str | oai:estudogeral.uc.pt:10316/105458 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str | 7160 |
author | Fernandes, Daniel |
author_facet | Fernandes, Daniel; Cheronet, Olivia; Gelabert, Pere; Pinhasi, Ron |
author_role | author |
author2 | Cheronet, Olivia; Gelabert, Pere; Pinhasi, Ron |
author2_role | author; author; author |
dc.contributor.author.fl_str_mv | Fernandes, Daniel; Cheronet, Olivia; Gelabert, Pere; Pinhasi, Ron |
dc.subject.por.fl_str_mv | Algorithms; Computational Biology; Databases, Genetic; Genetic Variation; Humans; Models, Genetic; Polymorphism, Single Nucleotide; Alleles; DNA, Ancient; Genome, Human; Genomics |
publishDate | 2021 |
dc.date.none.fl_str_mv | 2021-10-28 |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://hdl.handle.net/10316/105458 http://hdl.handle.net/10316/105458 https://doi.org/10.1038/s41598-021-00581-3 |
url | http://hdl.handle.net/10316/105458 https://doi.org/10.1038/s41598-021-00581-3 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | 2045-2322 |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.publisher.none.fl_str_mv | Springer Nature |
publisher.none.fl_str_mv | Springer Nature |
dc.source.none.fl_str_mv | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
institution | RCAAP |
reponame_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv | |
_version_ | 1799134110250172416 |