Medianas em genômica comparativa

Bibliographic details
Main author: HELMUTH OSSINAGA MARTINES DA SILVA
Publication date: 2022
Document type: Master's thesis
Language: por
Source title: Repositório Institucional da UFMS
Full text: https://repositorio.ufms.br/handle/123456789/4856
Abstract: Ancestral genome inference is a classic task in comparative genomics. Here, we study the genome median problem, a related computational problem which, given a set of three or more genomes, asks to find a new genome that minimizes the sum of pairwise distances between it and the given genomes. The distance stands for the amount of evolution observed at the genome level, for which we determine the minimum number of rearrangement operations necessary to transform one genome into the other. For almost all rearrangement operations the median problem is NP-hard, with the exception of the SCJ median, which can be constructed efficiently for multichromosomal circular and mixed genomes. In this work we study the median problem under a restricted rearrangement measure called the c4-distance, which is closely related to the breakpoint and the DCJ distance. We identify tight bounds and decomposers of the c4-median and develop algorithms for its construction: two exact ILP-based algorithms and three combinatorial heuristics. Subsequently, we perform experiments on simulated data sets. Our results suggest that the c4-distance is useful for the study of the genome median problem, from both theoretical and practical perspectives.
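As an illustration of the median objective described in the abstract, here is a minimal Python sketch for the SCJ case, the one case the abstract names as efficiently solvable. It is not taken from the dissertation. It assumes the standard adjacency-set model from the SCJ literature: a genome is a set of adjacencies between gene extremities, the SCJ distance is the size of the symmetric difference of two adjacency sets, and a median for mixed genomes keeps every adjacency present in more than half of the inputs. The helper names (`adjacency`, `scj_distance`, `scj_median`, `median_score`) and the toy genomes are hypothetical.

```python
from collections import Counter


def adjacency(x, y):
    """Unordered pair of gene extremities, e.g. ("1h", "2t")."""
    return frozenset((x, y))


def scj_distance(a, b):
    """SCJ distance: number of adjacencies present in exactly one genome."""
    return len(a ^ b)


def scj_median(genomes):
    """Keep the adjacencies that occur in a strict majority of the genomes.

    Assumption: mixed genomes (linear and circular chromosomes allowed).
    Two adjacencies sharing an extremity cannot both be in the majority,
    so the result is a consistent adjacency set.
    """
    counts = Counter(adj for genome in genomes for adj in genome)
    return {adj for adj, c in counts.items() if 2 * c > len(genomes)}


def median_score(candidate, genomes):
    """Objective of the median problem: sum of distances to the inputs."""
    return sum(scj_distance(candidate, g) for g in genomes)


# Toy input: three genomes over genes 1, 2, 3 ("t" = tail, "h" = head).
g1 = {adjacency("1h", "2t"), adjacency("2h", "3t")}  # 1 2 3
g2 = {adjacency("1h", "2t"), adjacency("2h", "3h")}  # 1 2 -3
g3 = {adjacency("1h", "3t"), adjacency("2h", "3h")}  # 1 3 -2

median = scj_median([g1, g2, g3])
score = median_score(median, [g1, g2, g3])
```

In this toy input the majority rule returns the adjacency set of `g2`, with a total score of 4. The c4-median studied in the thesis uses a different distance and is not covered by this sketch.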
id UFMS_ed16e1d14dac57d71d0be5a2d61289b4
oai_identifier_str oai:repositorio.ufms.br:123456789/4856
network_acronym_str UFMS
network_name_str Repositório Institucional da UFMS
repository_id_str 2124
spelling Ancestral genome inference is a classic task in comparative genomics. Here, we study the genome median problem: given a set of three or more genomes, find a new genome that minimizes the sum of the pairwise distances between it and the given genomes. The distance represents the amount of evolution observed at the genome level, measured as the minimum number of rearrangement operations needed to transform one genome into the other. For almost all known rearrangement operations the median problem is NP-hard, with the exception of the single-cut-or-join (SCJ) operation, for which it can be solved efficiently for multichromosomal circular and mixed genomes. In this project, we study the median problem under a restricted rearrangement measure called the c4-distance, which is closely related to the SCJ distance and the DCJ (double-cut-and-join) distance. We identify tight bounds and decomposers of the c4-median and implement algorithms for its construction: two exact algorithms based on ILP (Integer Linear Programming) and three combinatorial heuristics. We then run experiments on simulated data sets. Our results suggest that the c4-distance is useful for studying the genome median problem, from both theoretical and practical perspectives.
dc.title.pt_BR.fl_str_mv Medianas em genômica comparativa
title Medianas em genômica comparativa
spellingShingle Medianas em genômica comparativa
HELMUTH OSSINAGA MARTINES DA SILVA
Algoritmos, Biologia Computacional, Rearranjo de Genomas
title_short Medianas em genômica comparativa
title_full Medianas em genômica comparativa
title_fullStr Medianas em genômica comparativa
title_full_unstemmed Medianas em genômica comparativa
title_sort Medianas em genômica comparativa
author HELMUTH OSSINAGA MARTINES DA SILVA
author_facet HELMUTH OSSINAGA MARTINES DA SILVA
author_role author
dc.contributor.advisor1.fl_str_mv Fabio Henrique Viduani Martinez
dc.contributor.author.fl_str_mv HELMUTH OSSINAGA MARTINES DA SILVA
contributor_str_mv Fabio Henrique Viduani Martinez
dc.subject.por.fl_str_mv Algoritmos, Biologia Computacional, Rearranjo de Genomas
topic Algoritmos, Biologia Computacional, Rearranjo de Genomas
description Ancestral genome inference is a classic task in comparative genomics. Here, we study the genome median problem, a related computational problem which, given a set of three or more genomes, asks to find a new genome that minimizes the sum of pairwise distances between it and the given genomes. The distance stands for the amount of evolution observed at the genome level, for which we determine the minimum number of rearrangement operations necessary to transform one genome into the other. For almost all rearrangement operations the median problem is NP-hard, with the exception of the SCJ median, which can be constructed efficiently for multichromosomal circular and mixed genomes. In this work we study the median problem under a restricted rearrangement measure called the c4-distance, which is closely related to the breakpoint and the DCJ distance. We identify tight bounds and decomposers of the c4-median and develop algorithms for its construction: two exact ILP-based algorithms and three combinatorial heuristics. Subsequently, we perform experiments on simulated data sets. Our results suggest that the c4-distance is useful for the study of the genome median problem, from both theoretical and practical perspectives.
publishDate 2022
dc.date.accessioned.fl_str_mv 2022-06-24T17:13:35Z
dc.date.available.fl_str_mv 2022-06-24T17:13:35Z
dc.date.issued.fl_str_mv 2022
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://repositorio.ufms.br/handle/123456789/4856
url https://repositorio.ufms.br/handle/123456789/4856
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Fundação Universidade Federal de Mato Grosso do Sul
dc.publisher.initials.fl_str_mv UFMS
dc.publisher.country.fl_str_mv Brasil
publisher.none.fl_str_mv Fundação Universidade Federal de Mato Grosso do Sul
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMS
instname:Universidade Federal de Mato Grosso do Sul (UFMS)
instacron:UFMS
instname_str Universidade Federal de Mato Grosso do Sul (UFMS)
instacron_str UFMS
institution UFMS
reponame_str Repositório Institucional da UFMS
collection Repositório Institucional da UFMS
bitstream.url.fl_str_mv https://repositorio.ufms.br/bitstream/123456789/4856/-1/Dissertacao_FACOM_UFMS.pdf
bitstream.checksum.fl_str_mv 66f2e4f5eaf4a989a9835bd4f4a579dd
bitstream.checksumAlgorithm.fl_str_mv MD5
repository.name.fl_str_mv Repositório Institucional da UFMS - Universidade Federal de Mato Grosso do Sul (UFMS)
repository.mail.fl_str_mv ri.prograd@ufms.br
_version_ 1815447974492241920