Excessive and asymmetrical removal of heterozygous sites by maxSH biases downstream population genetic inference: Implications for hybridization between two primroses

Bibliographic details
Main author: Zhang, Jie
Publication date: 2023
Other authors: Pina Martins, Francisco; Jin, Zu‐Shi; Cha, Yong‐Peng; Liu, Zu‐Yao; Peng, Jun‐Chu; Zhao, Jian‐Li; Li, Qing‐Jun
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10451/58353
Abstract: Techniques of reduced-representation sequencing (RRS) have revolutionized ecological and evolutionary genomics studies. Precise establishment of orthologs is a critical challenge for RRS, especially when a reference genome is absent. The proportion of shared heterozygous sites across samples is an alternative criterion for filtering paralogs. In the prevailing pipeline for variant calling of RRS data – PYRAD/IPYRAD, maxSH is an often overlooked parameter with implications to detecting and filtering paralogs according to shared heterozygosity. Using empirical genotyping by sequencing data of two primroses (Primula alpicola Stapf and Primula florindae Ward) and their putative hybrids, and extra data sets of Californian golden cup oaks, we explore the impact of maxSH on filtering paralogs and further downstream analyses. Our study sheds light on the simultaneous validity and risk of filtering paralogs using maxSH, and its significant effects on downstream analyses of outlier detection, population assignment, and demographic modeling, emphasizing the importance of attention to detail during bioinformatic processes. The mutual confirmation between results of population assignment and demographic modeling in this study suggested maxSH = 0.10 has a potentially excessive and asymmetrical effect on the removal of truly shared heterozygous sites as paralogs. These results indicate that hybridization origin hypotheses of putative hybrids represented by results with maxSH = 0.25 and 0.50 are more credible. In conclusion, we revealed the critical hazard of paralogs filtration according to sharing heterozygosity at first, so that we propose to use specific protocols, rather than maxSH, to filter potential paralogs for closely related lineages.
id RCAP_b462a22bd898ba27d00546d34c7622d2
oai_identifier_str oai:repositorio.ul.pt:10451/58353
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.contributor.none.fl_str_mv Repositório da Universidade de Lisboa
dc.contributor.author.fl_str_mv Zhang, Jie
Pina Martins, Francisco
Jin, Zu‐Shi
Cha, Yong‐Peng
Liu, Zu‐Yao
Peng, Jun‐Chu
Zhao, Jian‐Li
Li, Qing‐Jun
publishDate 2023
dc.date.none.fl_str_mv 2023-11
2023-11-01T00:00:00Z
2024-12-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10451/58353
url http://hdl.handle.net/10451/58353
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Zhang, J., Pina-Martins, F., Jin, Z.-S., Cha, Y.-P., Liu, Z.-Y., Peng, J.-C., Zhao, J.-L. and Li, Q.-J. (2023), Excessive and asymmetrical removal of heterozygous sites by maxSH biases downstream population genetic inference: Implications for hybridization between two primroses. J. Syst. Evol. https://doi.org/10.1111/jse.12928
10.1111/jse.12928
dc.rights.driver.fl_str_mv info:eu-repo/semantics/embargoedAccess
eu_rights_str_mv embargoedAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Wiley
publisher.none.fl_str_mv Wiley
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799134638482915328