Deep Learning for genomic data analysis

Bibliographic details
Main author: Vítor Filipe Oliveira Teixeira
Publication date: 2017
Document type: Dissertation
Language: Portuguese (por)
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: https://repositorio-aberto.up.pt/handle/10216/106492
Abstract: Since the Human Genome Project, the availability of genomic data has increased greatly. In recent years, genome sequencing technologies and techniques have improved rapidly, making genome sequencing cheaper and faster. This volume of data enables both more complex analyses and advances in research. However, a sequencing process often produces a huge amount of highly complex data. Considerable computational power and efficient algorithms are required to extract useful information in reasonable time, which can constrain the extraction and comprehension of that information. In this work, we focus on the biological aspects of RNA-Seq and its analysis using traditional Machine Learning and Deep Learning methods. We divided our study into two branches. First, we built classifiers able to distinguish the RNA-Seq samples of thyroid cancer patients from samples of healthy persons, and compared their accuracy. Second, we investigated the possibility of building comprehensible descriptions of the differences in the RNA-Seq data by using Denoising Autoencoders and Stacked Denoising Autoencoders as base classifiers and then devising post-processing techniques to extract comprehensible and biologically meaningful descriptions from the constructed models.
id RCAP_f06a6d1e125b5f9434777ef464985aaf
oai_identifier_str oai:repositorio-aberto.up.pt:10216/106492
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Deep Learning for genomic data analysis
author Vítor Filipe Oliveira Teixeira
author_facet Vítor Filipe Oliveira Teixeira
author_role author
dc.contributor.author.fl_str_mv Vítor Filipe Oliveira Teixeira
dc.subject.por.fl_str_mv Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
topic Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
publishDate 2017
dc.date.none.fl_str_mv 2017-07-14
2017-07-14T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://repositorio-aberto.up.pt/handle/10216/106492
TID:201804042
url https://repositorio-aberto.up.pt/handle/10216/106492
identifier_str_mv TID:201804042
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799135653449957376