Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies
Main author: | Machado, Moara |
---|---|
Publication date: | 2011 |
Other authors: | Magalhães, Wagner Carlos Santos; Sene, Allan; Araújo, Bruno; Campos, Alessandra Conceicao Faria Aguiar; Chanock, Stephen J; Scott, Leandro; Oliveira, Guilherme Corrêa; Santos, Eduardo Tarazona; Rodrigues, Maíra Ribeiro |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da FIOCRUZ (ARCA) |
Full text: | https://www.arca.fiocruz.br/handle/icict/8836 |
Abstract: | Universidade Federal de Minas Gerais. Instituto de Ciências Biológicas. Departamento de Biologia Geral. Belo Horizonte, MG, Brazil. |
id |
CRUZ_e94ba4c5899f140568b6a48c747b832d |
---|---|
oai_identifier_str |
oai:www.arca.fiocruz.br:icict/8836 |
network_acronym_str |
CRUZ |
network_name_str |
Repositório Institucional da FIOCRUZ (ARCA) |
repository_id_str |
2135 |
spelling |
Authors: Machado, Moara; Magalhães, Wagner Carlos Santos; Sene, Allan; Araújo, Bruno; Campos, Alessandra Conceicao Faria Aguiar; Chanock, Stephen J; Scott, Leandro; Oliveira, Guilherme Corrêa; Santos, Eduardo Tarazona; Rodrigues, Maíra Ribeiro
Dates: 2014-11-13T17:11:44Z; 2014-11-13T17:11:44Z; 2011
Citation: MACHADO, Moara et al. Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies. Investigative Genetics, volume 2, issue 1, p. 3, 2011.
ISSN: 2041-2223
URI: https://www.arca.fiocruz.br/handle/icict/8836
DOI: 10.1186/2041-2223-2-3
Language: eng
Publisher: BioMed Central Ltd.
Title: Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies
Type: info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article
Affiliations (in record order): entries 1 to 4: Universidade Federal de Minas Gerais. Instituto de Ciências Biológicas. Departamento de Biologia Geral. Belo Horizonte, MG, Brazil; entry 5: Universidade Federal de Minas Gerais. Instituto de Ciências Exatas. Departamento de Ciência da Computação. Belo Horizonte, MG, Brazil; entry 6: National Institutes of Health. National Cancer Institute. Division of Cancer Epidemiology and Genetics. Laboratory of Translational Genomics. Gaithersburg, MD, USA/Grovemont Circle Advanced Technology Center. Gaithersburg, MD, USA; entries 7 and 8: Fundação Oswaldo Cruz. Centro de Pesquisas René Rachou. Grupo de Genômica e Biologia Computacional e Centro de Excelência em Bioinformática. Belo Horizonte, MG, Brazil; entries 9 and 10: Universidade Federal de Minas Gerais. Instituto de Ciências Biológicas. Departamento de Biologia Geral. Belo Horizonte, MG, Brazil.
BACKGROUND: Targeted re-sequencing is one of the most powerful and widely used strategies for population genetics studies because it allows an unbiased screening for variation that is suitable for a wide variety of organisms. Examples of studies that require re-sequencing data are evolutionary inferences, epidemiological studies designed to capture rare polymorphisms responsible for complex traits, and screenings for mutations in families and small populations with high incidences of specific genetic diseases. Despite the advent of next-generation sequencing technologies, Sanger sequencing is still the most popular approach in population genetics studies because of the widespread availability of automatic sequencers based on capillary electrophoresis and because it is still less prone to sequencing errors, which is critical in population genetics studies. Two popular software applications for re-sequencing studies are Phred-Phrap-Consed-Polyphred, which performs base calling, alignment, graphical editing and genotype calling, and DNAsp, which performs a set of population genetics analyses. These independent tools are the start and end points of basic analyses. Between them lies a set of basic but error-prone tasks that must be performed on re-sequencing data.
RESULTS: To assist with these intermediate tasks, we developed a pipeline that facilitates the data handling typical of re-sequencing studies. Our pipeline: (1) consolidates different outputs produced by distinct Phred-Phrap-Consed contigs sharing a reference sequence; (2) checks for genotyping inconsistencies; (3) reformats genotyping data produced by Polyphred into a matrix of genotypes with individuals as rows and segregating sites as columns; (4) prepares input files for haplotype inferences using the popular software PHASE; and (5) handles PHASE output files that contain only polymorphic sites to reconstruct the inferred haplotypes including polymorphic and monomorphic sites, as required by population genetics software for re-sequencing data such as DNAsp.
CONCLUSION: We tested the pipeline in re-sequencing studies of haploid and diploid data in humans, plants, animals and microorganisms and observed that it substantially decreased the time required for sequencing analyses, while also providing a more controlled process that eliminates several classes of error that may occur when handling datasets. The pipeline is also useful for investigators using other tools for sequencing and population genetics analyses.
Keywords: polymorphisms; population genetics
Rights: info:eu-repo/semantics/openAccess
Source: reponame:Repositório Institucional da FIOCRUZ (ARCA); instname:Fundação Oswaldo Cruz (FIOCRUZ); instacron:FIOCRUZ
Bitstream LICENSE: license.txt; text/plain; charset=utf-8; 1914; https://www.arca.fiocruz.br/bitstream/icict/8836/1/license.txt; 7d48279ffeed55da8dfe2f8e81f3b81f (MD5); 1
Bitstream ORIGINAL: Phred-Phrap package to analyses tools.pdf; application/pdf; 975636; https://www.arca.fiocruz.br/bitstream/icict/8836/2/Phred-Phrap%20package%20to%20analyses%20tools.pdf; ffb181fe105ba86cd6d27c806243dcda (MD5); 2
Bitstream TEXT: Phred-Phrap package to analyses tools.pdf.txt; Extracted text; text/plain; 32772; https://www.arca.fiocruz.br/bitstream/icict/8836/3/Phred-Phrap%20package%20to%20analyses%20tools.pdf.txt; 381d57769789d4f290b7e7b3297ae083 (MD5); 3
Record: icict/8836; 2019-06-19 10:11:34.556; oai:www.arca.fiocruz.br:icict/8836
Repository: Repositório Institucional; PUB; https://www.arca.fiocruz.br/oai/request; repositorio.arca@fiocruz.br; opendoar:2135; 2019-06-19T13:11:34; Repositório Institucional da FIOCRUZ (ARCA) - Fundação Oswaldo Cruz (FIOCRUZ); false |
dc.title.pt_BR.fl_str_mv |
Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies |
title |
Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies |
spellingShingle |
Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies; Machado, Moara; polymorphisms; population genetics |
title_short |
Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies |
title_full |
Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies |
title_fullStr |
Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies |
title_full_unstemmed |
Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies |
title_sort |
Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies |
author |
Machado, Moara |
author_facet |
Machado, Moara; Magalhães, Wagner Carlos Santos; Sene, Allan; Araújo, Bruno; Campos, Alessandra Conceicao Faria Aguiar; Chanock, Stephen J; Scott, Leandro; Oliveira, Guilherme Corrêa; Santos, Eduardo Tarazona; Rodrigues, Maíra Ribeiro |
author_role |
author |
author2 |
Magalhães, Wagner Carlos Santos; Sene, Allan; Araújo, Bruno; Campos, Alessandra Conceicao Faria Aguiar; Chanock, Stephen J; Scott, Leandro; Oliveira, Guilherme Corrêa; Santos, Eduardo Tarazona; Rodrigues, Maíra Ribeiro |
author2_role |
author author author author author author author author author |
dc.contributor.author.fl_str_mv |
Machado, Moara; Magalhães, Wagner Carlos Santos; Sene, Allan; Araújo, Bruno; Campos, Alessandra Conceicao Faria Aguiar; Chanock, Stephen J; Scott, Leandro; Oliveira, Guilherme Corrêa; Santos, Eduardo Tarazona; Rodrigues, Maíra Ribeiro |
dc.subject.en.pt_BR.fl_str_mv |
polymorphisms; population genetics |
topic |
polymorphisms; population genetics |
description |
Universidade Federal de Minas Gerais. Instituto de Ciências Biológicas. Departamento de Biologia Geral. Belo Horizonte, MG, Brazil. |
publishDate |
2011 |
dc.date.issued.fl_str_mv |
2011 |
dc.date.accessioned.fl_str_mv |
2014-11-13T17:11:44Z |
dc.date.available.fl_str_mv |
2014-11-13T17:11:44Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
MACHADO, Moara et al. Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies. Investigative Genetics, volume 2, issue 1, p. 3, 2011. |
dc.identifier.uri.fl_str_mv |
https://www.arca.fiocruz.br/handle/icict/8836 |
dc.identifier.issn.none.fl_str_mv |
2041-2223 |
dc.identifier.doi.none.fl_str_mv |
10.1186/2041-2223-2-3 |
identifier_str_mv |
MACHADO, Moara et al. Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies. Investigative Genetics, volume 2, issue 1, p. 3, 2011.; 2041-2223; 10.1186/2041-2223-2-3 |
url |
https://www.arca.fiocruz.br/handle/icict/8836 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
BioMed Central Ltd. |
publisher.none.fl_str_mv |
BioMed Central Ltd. |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da FIOCRUZ (ARCA) instname:Fundação Oswaldo Cruz (FIOCRUZ) instacron:FIOCRUZ |
instname_str |
Fundação Oswaldo Cruz (FIOCRUZ) |
instacron_str |
FIOCRUZ |
institution |
FIOCRUZ |
reponame_str |
Repositório Institucional da FIOCRUZ (ARCA) |
collection |
Repositório Institucional da FIOCRUZ (ARCA) |
bitstream.url.fl_str_mv |
https://www.arca.fiocruz.br/bitstream/icict/8836/1/license.txt https://www.arca.fiocruz.br/bitstream/icict/8836/2/Phred-Phrap%20package%20to%20analyses%20tools.pdf https://www.arca.fiocruz.br/bitstream/icict/8836/3/Phred-Phrap%20package%20to%20analyses%20tools.pdf.txt |
bitstream.checksum.fl_str_mv |
7d48279ffeed55da8dfe2f8e81f3b81f ffb181fe105ba86cd6d27c806243dcda 381d57769789d4f290b7e7b3297ae083 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da FIOCRUZ (ARCA) - Fundação Oswaldo Cruz (FIOCRUZ) |
repository.mail.fl_str_mv |
repositorio.arca@fiocruz.br |
_version_ |
1813008880932749312 |
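
Steps (3) and (5) of the pipeline described in the abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation: the function names, the `(individual, position, genotype)` input tuples and the `A/G` genotype notation are assumptions made for this example, and they do not reproduce the Polyphred or PHASE file formats.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

# Hypothetical in-memory genotype call: (individual, 1-based position, "A/G").
Call = Tuple[str, int, str]


def genotype_matrix(
    calls: Iterable[Call], missing: str = "?/?"
) -> Tuple[List[int], Dict[str, List[str]]]:
    """Build a matrix with individuals as rows and segregating sites as columns.

    A site counts as segregating when more than one distinct allele is
    observed across all individuals. Missing calls are filled with `missing`.
    """
    by_individual: Dict[str, Dict[int, str]] = {}
    alleles_at: Dict[int, set] = {}
    for individual, position, genotype in calls:
        by_individual.setdefault(individual, {})[position] = genotype
        alleles_at.setdefault(position, set()).update(genotype.split("/"))

    # Keep only the columns (sites) that show variation.
    sites = sorted(p for p, alleles in alleles_at.items() if len(alleles) > 1)
    matrix = {
        individual: [genotypes.get(p, missing) for p in sites]
        for individual, genotypes in by_individual.items()
    }
    return sites, matrix


def rebuild_haplotype(
    reference: str, sites: Sequence[int], alleles: Sequence[str]
) -> str:
    """Expand a haplotype given only at polymorphic sites to full length.

    Monomorphic positions are copied from `reference`; each polymorphic
    position (1-based) is overwritten with the inferred allele.
    """
    if len(sites) != len(alleles):
        raise ValueError("sites and alleles must have the same length")
    sequence = list(reference)
    for position, allele in zip(sites, alleles):
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position {position} is outside the reference")
        sequence[position - 1] = allele
    return "".join(sequence)


if __name__ == "__main__":
    example_calls = [
        ("ind1", 3, "A/G"),
        ("ind1", 5, "C/C"),
        ("ind2", 3, "A/A"),
        ("ind2", 5, "C/C"),
    ]
    sites, matrix = genotype_matrix(example_calls)
    print(sites, matrix)
    print(rebuild_haplotype("TTATCGT", sites, ["G"]))
```

In this sketch, position 5 is dropped from the matrix because every individual carries the same allele there, and the rebuilt haplotype restores that monomorphic position from the reference sequence, which is the kind of full-length input the abstract says software such as DNAsp requires.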