Análise de sequências de DNA através de códigos corretores de erros (Analysis of DNA sequences through error-correcting codes)

Bassi, Mariana Venezian Musto [UNESP]

Bibliographic details
Main author:	Bassi, Mariana Venezian Musto [UNESP]
Publication date:	2019
Document type:	Undergraduate thesis (Trabalho de conclusão de curso)
Language:	por (Portuguese)
Source title:	Repositório Institucional da UNESP
Full text:	http://hdl.handle.net/11449/203138 http://www.athena.biblioteca.unesp.br/exlibris/bd/capelo/2019-03-19/000913029.pdf
Abstract:	Information and coding theory, like genetics, is concerned with the transfer and storage of information. For decades, scientists have studied how to integrate these fields, but it remains very difficult to determine a mathematical structure related to the structure of DNA (deoxyribonucleic acid). In the present work, based on a genetic import system model proposed in [ROCHA 2010] using BCH (Bose-Chaudhuri-Hocquenghem) codes over Galois ring extensions, we implemented an algorithm capable of identifying and reproducing two DNA sequences with different biological functions, each 63 nucleotides long, using the same six primitive and generator polynomials of degree 6 for both. To do this, we associate the nitrogenous bases of DNA (adenine, thymine, guanine and cytosine) with the elements of the alphabet of the finite ring Z_4 = {0, 1, 2, 3}. This process is called labeling and, in the two results obtained, applying the same generator polynomial, we find 8 codewords with the same labeling. These codewords differ from the original sequence by one nucleotide; the base substitutions occur at different positions, yielding different bases, codons and amino acids. The algorithm is also capable of analyzing mutations in DNA sequences. To illustrate this application, we used the sequence corresponding to exon 14 of the BRCA1 (Breast Cancer 1) gene, of length 127, and analyzed nonsense and missense point mutations using a generator polynomial of degree 7. Using the identification and reproduction functions, we find codewords to serve as references in these analyses. Then, applying each point mutation individually, we observe that the code is able to recover the original sequence. By pointing to a mathematical structure associated with error-correcting codes for a single strand of DNA, this algorithm can contribute to the development of a methodology that can reduce laboratory time and costs...
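
The labeling step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: BCH encoding and decoding over a Galois ring extension are omitted entirely, and the function names and the toy sequence are hypothetical, chosen only for illustration. The sketch enumerates the 24 possible bijections between the four bases and Z_4 = {0, 1, 2, 3}, applies one point mutation, and checks that the labeled sequences differ in exactly one position.

```python
from itertools import permutations

BASES = "ATGC"  # adenine, thymine, guanine, cytosine


def all_labelings():
    """Return every bijection from {A, T, G, C} onto Z_4 = {0, 1, 2, 3}."""
    return [dict(zip(BASES, perm)) for perm in permutations(range(4))]


def label(seq, labeling):
    """Map a DNA string to a list of Z_4 symbols under the given labeling."""
    return [labeling[base] for base in seq]


def hamming_distance(u, v):
    """Count the positions at which two equal-length sequences differ."""
    if len(u) != len(v):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(u, v))


def point_mutation(seq, pos, base):
    """Return a copy of seq with the nucleotide at index pos replaced by base."""
    return seq[:pos] + base + seq[pos + 1:]


# Hypothetical toy sequence (not taken from the thesis).
original = "ATGGCC"
mutated = point_mutation(original, 2, "A")  # G -> A at index 2

labelings = all_labelings()
distances = {
    hamming_distance(label(original, lab), label(mutated, lab))
    for lab in labelings
}
```

Because every labeling is a bijection, a single nucleotide substitution always changes exactly one Z_4 symbol, so `distances` contains only the value 1 regardless of which of the 24 labelings is chosen. Which labelings yield actual codewords depends on the code itself, which this sketch does not model.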

Item metadata

id	UNSP_9786d4488193ee691d52732fdf77a9cd
oai_identifier_str	oai:repositorio.unesp.br:11449/203138
network_acronym_str	UNSP
network_name_str	Repositório Institucional da UNESP
repository_id_str	2946
dc.title.none.fl_str_mv	Análise de sequências de DNA através de códigos corretores de erros; Analysis of DNA sequences through error-correcting codes
title	Análise de sequências de DNA através de códigos corretores de erros
author	Bassi, Mariana Venezian Musto [UNESP]
author_role	author
dc.contributor.none.fl_str_mv	Benedito, Cintya Wink de Oliveira [UNESP]; Universidade Estadual Paulista (Unesp)
dc.contributor.author.fl_str_mv	Bassi, Mariana Venezian Musto [UNESP]
dc.subject.por.fl_str_mv	Códigos corretores de erros (Teoria da informação); Código genético; DNA - Análise; Telecomunicações; DNA - Analysis; Error-correcting codes (Information theory); Genetic code; Telecommunication
publishDate	2019
dc.date.none.fl_str_mv	2019 2021-03-10T12:55:42Z 2021-03-10T12:55:42Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/bachelorThesis
format	bachelorThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	BASSI, Mariana Venezian Musto. Análise de sequências de DNA através de códigos corretores de erros. 2019. 84 f. Undergraduate thesis (bachelor's degree in Telecommunications Engineering) - Universidade Estadual Paulista Julio de Mesquita Filho, Câmpus Experimental de São João da Boa Vista, 2019.; http://hdl.handle.net/11449/203138; 990009130290206341; http://www.athena.biblioteca.unesp.br/exlibris/bd/capelo/2019-03-19/000913029.pdf; 7916375574050821; 0000-0002-4806-3399
url	http://hdl.handle.net/11449/203138 http://www.athena.biblioteca.unesp.br/exlibris/bd/capelo/2019-03-19/000913029.pdf
dc.language.iso.fl_str_mv	por
language	por
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	84 f. application/pdf
dc.publisher.none.fl_str_mv	Universidade Estadual Paulista (Unesp)
publisher.none.fl_str_mv	Universidade Estadual Paulista (Unesp)
dc.source.none.fl_str_mv	Alma reponame:Repositório Institucional da UNESP instname:Universidade Estadual Paulista (UNESP) instacron:UNESP
instname_str	Universidade Estadual Paulista (UNESP)
instacron_str	UNESP
institution	UNESP
reponame_str	Repositório Institucional da UNESP
collection	Repositório Institucional da UNESP
repository.name.fl_str_mv	Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
repository.mail.fl_str_mv
_version_	1808128205521944576
