Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies

Bibliographic details
Main author: Machado, Moara
Publication date: 2011
Other authors: Magalhães, Wagner Carlos Santos; Sene, Allan; Araújo, Bruno; Campos, Alessandra Conceicao Faria Aguiar; Chanock, Stephen J; Scott, Leandro; Oliveira, Guilherme Corrêa; Santos, Eduardo Tarazona; Rodrigues, Maíra Ribeiro
Document type: Article
Language: eng
Source title: Repositório Institucional da FIOCRUZ (ARCA)
Full text: https://www.arca.fiocruz.br/handle/icict/8836
Affiliation: Universidade Federal de Minas Gerais. Instituto de Ciências Biológicas. Departamento de Biologia Geral. Belo Horizonte, MG, Brazil.
id CRUZ_e94ba4c5899f140568b6a48c747b832d
oai_identifier_str oai:www.arca.fiocruz.br:icict/8836
network_acronym_str CRUZ
network_name_str Repositório Institucional da FIOCRUZ (ARCA)
repository_id_str 2135
Citation: MACHADO, Moara et al. Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies. Investigative Genetics, volume 2, issue 1, p. 3, 2011.
ISSN: 2041-2223
DOI: 10.1186/2041-2223-2-3
Publisher: BioMed Central Ltd.
Date accessioned: 2014-11-13T17:11:44Z
Rights: info:eu-repo/semantics/openAccess
Subjects: polymorphisms; population genetics
Author affiliations:
Universidade Federal de Minas Gerais. Instituto de Ciências Biológicas. Departamento de Biologia Geral. Belo Horizonte, MG, Brazil.
Universidade Federal de Minas Gerais. Instituto de Ciências Exatas. Departamento de Ciência da Computação. Belo Horizonte, MG, Brazil.
National Institutes of Health. National Cancer Institute. Division of Cancer Epidemiology and Genetics. Laboratory of Translational Genomics. Gaithersburg, MD, USA/Grovemont Circle Advanced Technology Center. Gaithersburg, MD, USA.
Fundação Oswaldo Cruz. Centro de Pesquisas René Rachou. Grupo de Genômica e Biologia Computacional e Centro de Excelência em Bioinformática. Belo Horizonte, MG, Brazil.
Abstract:
BACKGROUND: Targeted re-sequencing is one of the most powerful and widely used strategies for population genetics studies because it allows an unbiased screening for variation that is suitable for a wide variety of organisms. Examples of studies that require re-sequencing data are evolutionary inferences, epidemiological studies designed to capture rare polymorphisms responsible for complex traits, and screenings for mutations in families and small populations with high incidences of specific genetic diseases. Despite the advent of next-generation sequencing technologies, Sanger sequencing is still the most popular approach in population genetics studies because of the widespread availability of automatic sequencers based on capillary electrophoresis and because it is still less prone to sequencing errors, which is critical in population genetics studies. Two popular software applications for re-sequencing studies are Phred-Phrap-Consed-Polyphred, which performs base calling, alignment, graphical editing and genotype calling, and DnaSP, which performs a set of population genetics analyses. These independent tools are the start and end points of basic analyses. Between them lies a set of basic but error-prone tasks that must be performed on re-sequencing data.
RESULTS: To assist with these intermediate tasks, we developed a pipeline that facilitates the data handling typical of re-sequencing studies. Our pipeline: (1) consolidates different outputs produced by distinct Phred-Phrap-Consed contigs sharing a reference sequence; (2) checks for genotyping inconsistencies; (3) reformats genotyping data produced by Polyphred into a matrix of genotypes with individuals as rows and segregating sites as columns; (4) prepares input files for haplotype inferences using the popular software PHASE; and (5) handles PHASE output files that contain only polymorphic sites to reconstruct the inferred haplotypes including polymorphic and monomorphic sites, as required by population genetics software for re-sequencing data such as DnaSP.
CONCLUSION: We tested the pipeline in re-sequencing studies of haploid and diploid data in humans, plants, animals and microorganisms and observed that it substantially decreased the time required for sequencing analyses, while providing a more controlled process that eliminates several classes of error that may occur when handling datasets. The pipeline is also useful for investigators using other tools for sequencing and population genetics analyses.
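Step (5) of the abstract, restoring monomorphic sites around the polymorphic sites reported by PHASE, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy reference sequence and the haplotype labels are hypothetical, the input is assumed to be already parsed (no real PHASE output format is read here), and the sketch assumes substitutions only, so that every monomorphic site equals the reference base.

```python
from typing import Dict, Sequence


def expand_haplotypes(
    reference: str,
    positions: Sequence[int],
    haplotypes: Dict[str, str],
) -> Dict[str, str]:
    """Place inferred alleles back into a reference to get full-length haplotypes.

    reference  -- full reference sequence; monomorphic sites are copied from it
    positions  -- 0-based reference coordinates of the segregating sites, in order
    haplotypes -- haplotype label -> alleles at the segregating sites only
    """
    full = {}
    for label, alleles in haplotypes.items():
        if len(alleles) != len(positions):
            raise ValueError(
                f"{label}: expected {len(positions)} alleles, got {len(alleles)}"
            )
        seq = list(reference)
        # Overwrite only the segregating sites; everything else stays as reference.
        for pos, allele in zip(positions, alleles):
            seq[pos] = allele
        full[label] = "".join(seq)
    return full


def to_fasta(sequences: Dict[str, str]) -> str:
    """Serialise the haplotypes as a simple FASTA-style alignment."""
    return "".join(f">{name}\n{seq}\n" for name, seq in sequences.items())


# Hypothetical toy data: a 10 bp reference with two segregating sites.
reference = "ACGTACGTAC"
positions = [2, 7]
phased = {"ind1_a": "GT", "ind1_b": "AT", "ind2_a": "GC", "ind2_b": "GC"}

full = expand_haplotypes(reference, positions, phased)
print(to_fasta(full))
```

Keeping the site coordinates separate from the alleles makes the length check explicit, which is the kind of inconsistency that is easy to miss when such files are edited by hand.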
dc.title.pt_BR.fl_str_mv Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies
dc.contributor.author.fl_str_mv Machado, Moara
Magalhães, Wagner Carlos Santos
Sene, Allan
Araújo, Bruno
Campos, Alessandra Conceicao Faria Aguiar
Chanock, Stephen J
Scott, Leandro
Oliveira, Guilherme Corrêa
Santos, Eduardo Tarazona
Rodrigues, Maíra Ribeiro
dc.subject.en.pt_BR.fl_str_mv polymorphisms
population genetics
dc.date.issued.fl_str_mv 2011
dc.date.accessioned.fl_str_mv 2014-11-13T17:11:44Z
dc.date.available.fl_str_mv 2014-11-13T17:11:44Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.identifier.citation.fl_str_mv MACHADO, Moara et al. Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies. Investigative Genetics, volume 2, issue 1, p. 3, 2011.
dc.identifier.uri.fl_str_mv https://www.arca.fiocruz.br/handle/icict/8836
dc.identifier.issn.none.fl_str_mv 2041-2223
dc.identifier.doi.none.fl_str_mv 10.1186/2041-2223-2-3
dc.language.iso.fl_str_mv eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.publisher.none.fl_str_mv BioMed Central Ltd.
dc.source.none.fl_str_mv reponame:Repositório Institucional da FIOCRUZ (ARCA)
instname:Fundação Oswaldo Cruz (FIOCRUZ)
instacron:FIOCRUZ
bitstream.url.fl_str_mv https://www.arca.fiocruz.br/bitstream/icict/8836/1/license.txt
https://www.arca.fiocruz.br/bitstream/icict/8836/2/Phred-Phrap%20package%20to%20analyses%20tools.pdf
https://www.arca.fiocruz.br/bitstream/icict/8836/3/Phred-Phrap%20package%20to%20analyses%20tools.pdf.txt
bitstream.checksum.fl_str_mv 7d48279ffeed55da8dfe2f8e81f3b81f
ffb181fe105ba86cd6d27c806243dcda
381d57769789d4f290b7e7b3297ae083
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da FIOCRUZ (ARCA) - Fundação Oswaldo Cruz (FIOCRUZ)
repository.mail.fl_str_mv repositorio.arca@fiocruz.br