Recovery of high-quality MAGs in time-serial metagenomic data and FixAME development: an assisting curation tool of genomic assembly error

Bibliographic details
Main author: Livia Maria Silva Moura
Publication date: 2022
Document type: Thesis
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da USP
Full text: https://doi.org/10.11606/T.95.2022.tde-19072022-153242
Abstract: Every day, a huge amount of biological data is generated, bringing deeper insight into our world. This exponential growth of information has brought many challenges, especially when dealing with metagenomic data. Thousands of metagenome-assembled genomes (MAGs) are recovered annually, and many methods try to achieve the best possible output through assembly and binning approaches. But an increasingly frequent question is: are these MAGs reliable? The importance of good-quality data is undeniable. Retrieving reliable and complete, or nearly complete, MAGs provides important data for further analysis and research. Within this scenario, we would like to know whether it is possible to improve MAG quality metrics relative to the state of the art in time-series data. In addition, we would like to develop a tool capable of locating and fixing assembly errors, thereby assisting in genomic curation. Here, we describe an approach to MAG recovery from time-serial data called the "screening method", which drastically reduces the recovery of potentially unreliable MAGs. In addition, we present FixAME, a tool that helps curate assembled sequences from a metagenome, a single genome, a set of bins, phages, archaea, or any organism that contains single double-stranded DNA.
id USP_0b2f7fa71f29a2b261a52a69bbb759f2
oai_identifier_str oai:teses.usp.br:tde-19072022-153242
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str 2721
spelling Bioinformatics; Curation; Genome; Metagenome; Metagenomics; Microbiology; Tool
2023-12-21T19:25:20Z http://www.teses.usp.br/ http://www.teses.usp.br/cgi-bin/mtd2br.pl opendoar:2721 2023-12-22T12:55:19.212693
dc.title.en.fl_str_mv Recovery of high-quality MAGs in time-serial metagenomic data and FixAME development: an assisting curation tool of genomic assembly error
dc.title.alternative.pt.fl_str_mv Recuperação de MAGs de alta qualidade em dados metagenômicos tempo seriado e desenvolvimento de FixAME: uma ferramenta auxiliar de curadoria de erros de montagem genômica
title Recovery of high-quality MAGs in time-serial metagenomic data and FixAME development: an assisting curation tool of genomic assembly error
author Livia Maria Silva Moura
author_facet Livia Maria Silva Moura
author_role author
dc.contributor.advisor1.fl_str_mv João Carlos Setubal
dc.contributor.referee1.fl_str_mv Nalvo Franco de Almeida Junior
dc.contributor.referee2.fl_str_mv Christian Hoffmann
dc.contributor.referee3.fl_str_mv Alessandro de Mello Varani
dc.contributor.author.fl_str_mv Livia Maria Silva Moura
publishDate 2022
dc.date.issued.fl_str_mv 2022-04-14
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://doi.org/10.11606/T.95.2022.tde-19072022-153242
url https://doi.org/10.11606/T.95.2022.tde-19072022-153242
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade de São Paulo
dc.publisher.program.fl_str_mv Bioinformática
dc.publisher.initials.fl_str_mv USP
dc.publisher.country.fl_str_mv BR
publisher.none.fl_str_mv Universidade de São Paulo
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1794502865507581952