TKGWV2: an ancient DNA relatedness pipeline for ultra-low coverage whole genome shotgun data

Fernandes, Daniel; Cheronet, Olivia; Gelabert, Pere; Pinhasi, Ron

Bibliographic details
Main author:	Fernandes, Daniel
Publication date:	2021
Other authors:	Cheronet, Olivia; Gelabert, Pere; Pinhasi, Ron
Document type:	Article
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10316/105458 https://doi.org/10.1038/s41598-021-00581-3
Abstract:	Estimation of genetically related individuals is playing an increasingly important role in the ancient DNA field. In recent years, the numbers of sequenced individuals from single sites have been increasing, reflecting a growing interest in understanding the familial and social organisation of ancient populations. Although a few different methods have been specifically developed for ancient DNA, namely to tackle issues such as low-coverage homozygous data, they require a 0.1-1× minimum average genomic coverage per analysed pair of individuals. Here we present an updated version of a method that enables estimates of 1st and 2nd degrees of relatedness with as little as 0.026× average coverage, or around 18,000 SNPs from 1.3 million aligned reads per sample with an average length of 62 bp, which is four times less data than 0.1× coverage at similar read lengths. By using simulated data to estimate false positive error rates, we further show that a threshold even as low as 0.012×, or around 4000 SNPs from 600,000 reads, will always show 1st-degree relationships as related. Lastly, by applying this method to published data, we are able to identify previously undocumented relationships using individuals that had been excluded from prior kinship analysis due to their very low coverage. This methodological improvement has the potential to enable relatedness estimation on ancient whole genome shotgun data during routine low-coverage screening, and therefore improve project management when decisions need to be made on which individuals are to be further sequenced.
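
The coverage figures quoted in the abstract can be checked with simple arithmetic: average coverage is roughly the number of aligned reads times the mean read length, divided by the genome length. The sketch below is only an illustration and is not part of TKGWV2; the function name `average_coverage` and the genome length of about 3.1 Gbp are assumptions, since the abstract does not state the reference length used.

```python
# Assumed haploid human genome length in base pairs (about 3.1 Gbp).
# This value is an assumption; the abstract does not state it.
GENOME_LENGTH_BP = 3.1e9


def average_coverage(aligned_reads, mean_read_length_bp,
                     genome_length_bp=GENOME_LENGTH_BP):
    """Approximate average genomic coverage from read count and read length."""
    return aligned_reads * mean_read_length_bp / genome_length_bp


# Figures taken from the abstract: reads per sample and mean read length.
cov_recommended = average_coverage(1_300_000, 62)
cov_minimum = average_coverage(600_000, 62)

print(f"{cov_recommended:.3f}x")  # 0.026x
print(f"{cov_minimum:.3f}x")      # 0.012x

# Ratio between the 0.1x requirement of earlier methods and 0.026x.
print(f"{0.1 / cov_recommended:.1f}")  # 3.8, i.e. roughly four times less data
```

Under this assumed genome length, the read counts reproduce the 0.026× and 0.012× values reported in the abstract.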

Item metadata

id	RCAP_dc5496117ada88736e3c6eee4195cbb1
oai_identifier_str	oai:estudogeral.uc.pt:10316/105458
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	TKGWV2: an ancient DNA relatedness pipeline for ultra-low coverage whole genome shotgun data
title	TKGWV2: an ancient DNA relatedness pipeline for ultra-low coverage whole genome shotgun data
author	Fernandes, Daniel
author_facet	Fernandes, Daniel; Cheronet, Olivia; Gelabert, Pere; Pinhasi, Ron
author_role	author
author2	Cheronet, Olivia; Gelabert, Pere; Pinhasi, Ron
author2_role	author; author; author
dc.contributor.author.fl_str_mv	Fernandes, Daniel; Cheronet, Olivia; Gelabert, Pere; Pinhasi, Ron
dc.subject.por.fl_str_mv	Algorithms; Computational Biology; Databases, Genetic; Genetic Variation; Humans; Models, Genetic; Polymorphism, Single Nucleotide; Alleles; DNA, Ancient; Genome, Human; Genomics
topic	Algorithms; Computational Biology; Databases, Genetic; Genetic Variation; Humans; Models, Genetic; Polymorphism, Single Nucleotide; Alleles; DNA, Ancient; Genome, Human; Genomics
publishDate	2021
dc.date.none.fl_str_mv	2021-10-28
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10316/105458 https://doi.org/10.1038/s41598-021-00581-3
url	http://hdl.handle.net/10316/105458 https://doi.org/10.1038/s41598-021-00581-3
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	2045-2322
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	Springer Nature
publisher.none.fl_str_mv	Springer Nature
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799134110250172416