FC-R2: A comprehensive atlas of human long non-coding RNAs expression using a standardized pipeline

Bibliographic details
Main author: Eddie Luidy Imada
Publication date: 2019
Document type: Thesis
Language: eng
Source title: Repositório Institucional da UFMG
Full text: http://hdl.handle.net/1843/30015
Abstract: In recent years, in-depth exploration of genome structure and function has revealed a central role for non-coding RNAs (ncRNAs) in orchestrating key biological and cellular processes through the fine-tuning of gene expression regulation. Importantly, the role of ncRNAs in human disease pathogenesis has also started to emerge, which underscores the need for an in-depth characterization of ncRNA involvement in diseases, including cancer. In this work, we built a comprehensive atlas of gene expression across the human transcriptome, named FC-R2, containing over 100,000 genes, by leveraging two publicly available resources: the FANTOM CAGE Associated Transcriptome (FANTOM-CAT) and recount2. FANTOM-CAT is a comprehensive meta-assembly of the human transcriptome encompassing coding and non-coding genes, including promoters, enhancers, and lncRNAs. recount2 is the largest available collection of human RNA-seq data processed and quantified using a unified pipeline, containing over 4.4 trillion reads from over 70,000 human samples from the SRA, GTEx, and TCGA projects. Using FC-R2 gene expression summaries across human tissue samples from the GTEx project, we validated our approach by reproducing key findings recently described by the FANTOM consortium and the TCGA Pan-Cancer Atlas. We also demonstrated the power and usability of FC-R2 through two case studies in prostate cancer, highlighting potential novel lncRNA players involved in clinically relevant prostate cancer phenotypes. Finally, we make the FC-R2 atlas available as a public tool to empower other researchers to study important biological and clinical phenotypes and identify new candidate ncRNAs for further investigation.
id UFMG_9af04a959cb7c577123055891fdbf623
oai_identifier_str oai:repositorio.ufmg.br:1843/30015
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
spelling CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico
FAPEMIG - Fundação de Amparo à Pesquisa do Estado de Minas Gerais
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Final.pdf (Thesis, application/pdf, 9401729)
license.txt (text/plain; charset=utf-8, 2119)
Final.pdf.txt (Extracted text, text/plain, 155246)
2020-01-24 16:29:48.713
Repositório de Publicações, https://repositorio.ufmg.br/oai, opendoar:2020-01-24T19:29:48
dc.title.pt_BR.fl_str_mv FC-R2: A comprehensive atlas of human long non-coding RNAs expression using a standardized pipeline
dc.title.alternative.pt_BR.fl_str_mv FC-R2: Um atlas completo de expressão de ARNs não codificadores utilizando um pipeline padronizado
title FC-R2: A comprehensive atlas of human long non-coding RNAs expression using a standardized pipeline
author Eddie Luidy Imada
author_facet Eddie Luidy Imada
author_role author
dc.contributor.advisor1.fl_str_mv GLÓRIA REGINA FRANCO
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/7543542253155919
dc.contributor.advisor-co1.fl_str_mv LUIGI MARCHIONNI
dc.contributor.referee1.fl_str_mv EMMANUAL DIAS-NETO
dc.contributor.referee2.fl_str_mv WAGNER CARLOS SANTOS MAGALHÃES
dc.contributor.referee3.fl_str_mv JOÃO TRINDADE MARQUES
dc.contributor.referee4.fl_str_mv RENATO SANTANA DE AGUIAR
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/9002266697627178
dc.contributor.author.fl_str_mv Eddie Luidy Imada
contributor_str_mv GLÓRIA REGINA FRANCO
LUIGI MARCHIONNI
EMMANUAL DIAS-NETO
WAGNER CARLOS SANTOS MAGALHÃES
JOÃO TRINDADE MARQUES
RENATO SANTANA DE AGUIAR
dc.subject.por.fl_str_mv LncRNA
FANTOM
Recount
Database
Expression
topic LncRNA
FANTOM
Recount
Database
Expression
publishDate 2019
dc.date.accessioned.fl_str_mv 2019-09-11T18:20:02Z
dc.date.available.fl_str_mv 2019-09-11T18:20:02Z
dc.date.issued.fl_str_mv 2019-05-23
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1843/30015
url http://hdl.handle.net/1843/30015
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Bioinformatica
dc.publisher.initials.fl_str_mv UFMG
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv ICB - DEPARTAMENTO DE BIOQUÍMICA E IMUNOLOGIA
publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
instname_str Universidade Federal de Minas Gerais (UFMG)
instacron_str UFMG
institution UFMG
reponame_str Repositório Institucional da UFMG
collection Repositório Institucional da UFMG
bitstream.url.fl_str_mv https://repositorio.ufmg.br/bitstream/1843/30015/1/Final.pdf
https://repositorio.ufmg.br/bitstream/1843/30015/2/license.txt
https://repositorio.ufmg.br/bitstream/1843/30015/3/Final.pdf.txt
bitstream.checksum.fl_str_mv 09211c448aa44290f69b06e98335b3dc
34badce4be7e31e3adb4575ae96af679
29fb61608c67763662a85722f7e2fa34
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
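The three MD5 values above can be used to check that a downloaded copy of each bitstream is intact. The following is a minimal sketch, assuming the files have already been saved locally under the names shown in the bitstream URLs; the helper names md5_of_file and verify are hypothetical and are not part of the repository software.

```python
import hashlib

# Checksums copied from the bitstream.checksum field of this record (algorithm: MD5),
# paired with the file names taken from the bitstream URLs, in the same order.
EXPECTED_MD5 = {
    "Final.pdf": "09211c448aa44290f69b06e98335b3dc",
    "license.txt": "34badce4be7e31e3adb4575ae96af679",
    "Final.pdf.txt": "29fb61608c67763662a85722f7e2fa34",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True when the file's MD5 digest matches the expected value (case-insensitive)."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, verify("Final.pdf", EXPECTED_MD5["Final.pdf"]) should return True for an unmodified download. MD5 is suitable here only for detecting accidental corruption, not deliberate tampering.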
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
_version_ 1803589166026457088