Medianas em genômica comparativa
Main author: | HELMUTH OSSINAGA MARTINES DA SILVA |
---|---|
Publication date: | 2022 |
Document type: | Master's thesis |
Language: | por |
Source title: | Repositório Institucional da UFMS |
Full text: | https://repositorio.ufms.br/handle/123456789/4856 |
Abstract: | Ancestral genome inference is a classic task in comparative genomics. Here, we study the genome median problem, a related computational problem which, given a set of three or more genomes, asks for a new genome that minimizes the sum of pairwise distances between it and the given genomes. The distance stands for the amount of evolution observed at the genome level, measured as the minimum number of rearrangement operations necessary to transform one genome into the other. For almost all rearrangement operations the median problem is NP-hard, with the exception of the SCJ median, which can be constructed efficiently for multichromosomal circular and mixed genomes. In this work we study the median problem under a restricted rearrangement measure called c4-distance, which is closely related to the breakpoint and the DCJ distance. We identify tight bounds and decomposers of the c4-median and develop algorithms for its construction: two exact ILP-based algorithms and three combinatorial heuristics. Subsequently, we perform experiments on simulated data sets. Our results suggest that the c4-distance is useful for the study of the genome median problem, from both theoretical and practical perspectives. |
id |
UFMS_ed16e1d14dac57d71d0be5a2d61289b4 |
---|---|
oai_identifier_str |
oai:repositorio.ufms.br:123456789/4856 |
network_acronym_str |
UFMS |
network_name_str |
Repositório Institucional da UFMS |
repository_id_str |
2124 |
spelling |
2022-06-24T17:13:35Z; 2022-06-24T17:13:35Z; 2022; https://repositorio.ufms.br/handle/123456789/4856; Ancestral genome inference is a classic task in comparative genomics. Here, we study the genome median problem in which, given a set of three or more genomes, we want to find a new genome that minimizes the sum of the pairwise distances between it and the given genomes. The distance represents the amount of evolution observed at the genome level, for which we determine the minimum number of rearrangement operations needed to transform one genome into another. For almost all known rearrangement operations, the median problem is NP-hard, with the exception of the single-cut-or-join (SCJ) operation, for which it can be solved efficiently for multichromosomal circular and mixed genomes. In this project, we study the median problem under a restricted rearrangement measure called c4-distance, which is closely related to the SCJ distance and the DCJ (double-cut-and-join) distance. We identify tight bounds and decomposers of the c4-median and implement algorithms for its construction: two exact algorithms based on ILP (Integer Linear Programming) and three combinatorial heuristics. Subsequently, we perform experiments on simulated data sets. Our results suggest that the c4-distance is useful for studying the genome median problem, from both theoretical and practical perspectives.; Fundação Universidade Federal de Mato Grosso do Sul; UFMS; Brasil; Algorithms, Computational Biology, Genome Rearrangement; Medianas em genômica comparativa; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; Fabio Henrique Viduani Martinez; HELMUTH OSSINAGA MARTINES DA SILVA; info:eu-repo/semantics/openAccess; por; reponame:Repositório Institucional da UFMS; instname:Universidade Federal de Mato Grosso do Sul (UFMS); instacron:UFMS; ORIGINAL; Dissertacao_FACOM_UFMS.pdf; application/pdf; 1402302; https://repositorio.ufms.br/bitstream/123456789/4856/-1/Dissertacao_FACOM_UFMS.pdf; 66f2e4f5eaf4a989a9835bd4f4a579dd; MD5; -1; 123456789/4856; 2022-06-24 13:13:35.989; oai:repositorio.ufms.br:123456789/4856; Repositório Institucional; PUB; https://repositorio.ufms.br/oai/request; ri.prograd@ufms.br; opendoar:2124; 2022-06-24T17:13:35; Repositório Institucional da UFMS - Universidade Federal de Mato Grosso do Sul (UFMS); false |
dc.title.pt_BR.fl_str_mv |
Medianas em genômica comparativa |
title |
Medianas em genômica comparativa |
spellingShingle |
Medianas em genômica comparativa; HELMUTH OSSINAGA MARTINES DA SILVA; Algorithms, Computational Biology, Genome Rearrangement |
title_short |
Medianas em genômica comparativa |
title_full |
Medianas em genômica comparativa |
title_fullStr |
Medianas em genômica comparativa |
title_full_unstemmed |
Medianas em genômica comparativa |
title_sort |
Medianas em genômica comparativa |
author |
HELMUTH OSSINAGA MARTINES DA SILVA |
author_facet |
HELMUTH OSSINAGA MARTINES DA SILVA |
author_role |
author |
dc.contributor.advisor1.fl_str_mv |
Fabio Henrique Viduani Martinez |
dc.contributor.author.fl_str_mv |
HELMUTH OSSINAGA MARTINES DA SILVA |
contributor_str_mv |
Fabio Henrique Viduani Martinez |
dc.subject.por.fl_str_mv |
Algorithms, Computational Biology, Genome Rearrangement |
topic |
Algorithms, Computational Biology, Genome Rearrangement |
description |
Ancestral genome inference is a classic task in comparative genomics. Here, we study the genome median problem, a related computational problem which, given a set of three or more genomes, asks for a new genome that minimizes the sum of pairwise distances between it and the given genomes. The distance stands for the amount of evolution observed at the genome level, measured as the minimum number of rearrangement operations necessary to transform one genome into the other. For almost all rearrangement operations the median problem is NP-hard, with the exception of the SCJ median, which can be constructed efficiently for multichromosomal circular and mixed genomes. In this work we study the median problem under a restricted rearrangement measure called c4-distance, which is closely related to the breakpoint and the DCJ distance. We identify tight bounds and decomposers of the c4-median and develop algorithms for its construction: two exact ILP-based algorithms and three combinatorial heuristics. Subsequently, we perform experiments on simulated data sets. Our results suggest that the c4-distance is useful for the study of the genome median problem, from both theoretical and practical perspectives. |
publishDate |
2022 |
dc.date.accessioned.fl_str_mv |
2022-06-24T17:13:35Z |
dc.date.available.fl_str_mv |
2022-06-24T17:13:35Z |
dc.date.issued.fl_str_mv |
2022 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://repositorio.ufms.br/handle/123456789/4856 |
url |
https://repositorio.ufms.br/handle/123456789/4856 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Fundação Universidade Federal de Mato Grosso do Sul |
dc.publisher.initials.fl_str_mv |
UFMS |
dc.publisher.country.fl_str_mv |
Brazil |
publisher.none.fl_str_mv |
Fundação Universidade Federal de Mato Grosso do Sul |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFMS instname:Universidade Federal de Mato Grosso do Sul (UFMS) instacron:UFMS |
instname_str |
Universidade Federal de Mato Grosso do Sul (UFMS) |
instacron_str |
UFMS |
institution |
UFMS |
reponame_str |
Repositório Institucional da UFMS |
collection |
Repositório Institucional da UFMS |
bitstream.url.fl_str_mv |
https://repositorio.ufms.br/bitstream/123456789/4856/-1/Dissertacao_FACOM_UFMS.pdf |
bitstream.checksum.fl_str_mv |
66f2e4f5eaf4a989a9835bd4f4a579dd |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFMS - Universidade Federal de Mato Grosso do Sul (UFMS) |
repository.mail.fl_str_mv |
ri.prograd@ufms.br |
_version_ |
1815447974492241920 |