SNP calling from RNA-seq data without a reference genome: Identification, quantification, differential analysis and impact on the protein sequence
Main author: | Lopez-Maestre, Hélène |
---|---|
Publication date: | 2016 |
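Other authors: | Brinza, Lilia; Marchet, Camille; Kielbassa, Janice; Bastien, Sylvère; Boutigny, Mathilde; Monnin, David; Filali, Adil El; Carareto, Claudia Marcia [UNESP]; Vieira, Cristina; Picard, Franck; Kremer, Natacha; Vavre, Fabrice; Sagot, Marie-France; Lacroix, Vincent |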
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1093/nar/gkw655 http://hdl.handle.net/11449/173790 |
Abstract: | SNPs (Single Nucleotide Polymorphisms) are genetic markers whose precise identification is a prerequisite for association studies. Methods to identify them are currently well developed for model species, but rely on the availability of a (good) reference genome, and therefore cannot be applied to non-model species. They are also mostly tailored for whole genome (re-)sequencing experiments, whereas in many cases transcriptome sequencing can be used as a cheaper alternative that already enables the identification of SNPs located in transcribed regions. In this paper, we propose a method that identifies, quantifies and annotates SNPs without any reference genome, using RNA-seq data only. Individuals can be pooled prior to sequencing if not enough material is available from a single individual. Using pooled human RNA-seq data, we clarify the precision and recall of our method and discuss them with respect to other methods which use a reference genome or an assembled transcriptome. We then experimentally validate the predictions of our method using RNA-seq data from two non-model species. The method can be used for any species to annotate SNPs and predict their impact on the protein sequence. We further make it possible to test for the association of the identified SNPs with a phenotype of interest. |
id |
UNSP_6381dce8375f410bddf8bb85f4d8547a |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/173790 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
European Research Council; Agence Nationale de la Recherche; Université de Lyon; Université Lyon 1 CNRS UMR5558 Laboratoire de Biométrie et Biologie Evolutive; EPI ERABLE - Inria Grenoble; PT Génomique et Transcriptomique BIOASTER; Université de Rennes; Équipe GenScale IRISA; Synergie-Lyon-Cancer Universite Lyon 1 Centre Leon Berard; Department of Biology UNESP São Paulo State University, São José do Rio Preto; Department of Biology UNESP São Paulo State University, São José do Rio Preto; European Research Council: 247073]10; Agence Nationale de la Recherche: ANR-11-BINF-0001-06; Agence Nationale de la Recherche: ANR-12-BS02-0008; Agence Nationale de la Recherche: ANR-2010-BLAN-170101; European Research Council: FP7 /2007-2013; Université de Lyon; Laboratoire de Biométrie et Biologie Evolutive; EPI ERABLE - Inria Grenoble; BIOASTER; Université de Rennes; IRISA; Centre Leon Berard; Universidade Estadual Paulista (Unesp); Lopez-Maestre, Hélène; Brinza, Lilia; Marchet, Camille; Kielbassa, Janice; Bastien, Sylvère; Boutigny, Mathilde; Monnin, David; Filali, Adil El; Carareto, Claudia Marcia [UNESP]; Vieira, Cristina; Picard, Franck; Kremer, Natacha; Vavre, Fabrice; Sagot, Marie-France; Lacroix, Vincent; 2018-12-11T17:07:46Z; 2018-12-11T17:07:46Z; 2016-11-02; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; application/pdf; http://dx.doi.org/10.1093/nar/gkw655; Nucleic Acids Research, v. 44, n. 19, 2016.; 1362-4962; 0305-1048; http://hdl.handle.net/11449/173790; 10.1093/nar/gkw655; 2-s2.0-84994908315; 2-s2.0-84994908315.pdf; 3425772998319216; 0000-0002-0298-1354; Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; Nucleic Acids Research; 9,025; 9,025; info:eu-repo/semantics/openAccess; 2023-11-09T06:09:48Z; oai:repositorio.unesp.br:11449/173790; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-08-05T17:13:04.549445; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
dc.title.none.fl_str_mv |
SNP calling from RNA-seq data without a reference genome: Identification, quantification, differential analysis and impact on the protein sequence |
author |
Lopez-Maestre, Hélène |
author_facet |
Lopez-Maestre, Hélène Brinza, Lilia Marchet, Camille Kielbassa, Janice Bastien, Sylvère Boutigny, Mathilde Monnin, David Filali, Adil El Carareto, Claudia Marcia [UNESP] Vieira, Cristina Picard, Franck Kremer, Natacha Vavre, Fabrice Sagot, Marie-France Lacroix, Vincent |
author_role |
author |
author2 |
Brinza, Lilia Marchet, Camille Kielbassa, Janice Bastien, Sylvère Boutigny, Mathilde Monnin, David Filali, Adil El Carareto, Claudia Marcia [UNESP] Vieira, Cristina Picard, Franck Kremer, Natacha Vavre, Fabrice Sagot, Marie-France Lacroix, Vincent |
author2_role |
author author author author author author author author author author author author author author |
dc.contributor.none.fl_str_mv |
Université de Lyon Laboratoire de Biométrie et Biologie Evolutive EPI ERABLE - Inria Grenoble BIOASTER Université de Rennes IRISA Centre Leon Berard Universidade Estadual Paulista (Unesp) |
publishDate |
2016 |
dc.date.none.fl_str_mv |
2016-11-02 2018-12-11T17:07:46Z 2018-12-11T17:07:46Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1093/nar/gkw655 Nucleic Acids Research, v. 44, n. 19, 2016. 1362-4962 0305-1048 http://hdl.handle.net/11449/173790 10.1093/nar/gkw655 2-s2.0-84994908315 2-s2.0-84994908315.pdf 3425772998319216 0000-0002-0298-1354 |
url |
http://dx.doi.org/10.1093/nar/gkw655 http://hdl.handle.net/11449/173790 |
identifier_str_mv |
Nucleic Acids Research, v. 44, n. 19, 2016. 1362-4962 0305-1048 10.1093/nar/gkw655 2-s2.0-84994908315 2-s2.0-84994908315.pdf 3425772998319216 0000-0002-0298-1354 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Nucleic Acids Research 9,025 9,025 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
Scopus reponame:Repositório Institucional da UNESP instname:Universidade Estadual Paulista (UNESP) instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv |
|
_version_ |
1808128774433144832 |