Recovery of high-quality MAGs in time-serial metagenomic data and FixAME development: an assisting curation tool of genomic assembly error
Main author: | Livia Maria Silva Moura |
---|---|
Publication date: | 2022 |
Document type: | Thesis |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da USP |
Full text: | https://doi.org/10.11606/T.95.2022.tde-19072022-153242 |
Abstract: | On a daily basis, a huge amount of biological data is generated, bringing deeper insight into our world. This exponential growth of information has brought many challenges, especially when dealing with metagenomic data. Thousands of metagenome-assembled genomes (MAGs) are recovered annually, and many methods try to achieve the best possible output through assembly and binning approaches. But an increasingly frequent question is: are these MAGs reliable? The importance of having good-quality data is undeniable. Retrieving reliable and complete, or nearly complete, MAGs provides important data for further analysis and research. Within this scenario, we would like to know whether it is possible to improve MAG quality metrics compared with the state of the art in time-series data. In addition, we would like to develop a tool capable of locating and fixing assembly errors, thereby assisting in genomic curation. Here, we describe an approach to MAG recovery from time-serial data called the "screening method", which drastically reduces the recovery of potentially unreliable MAGs. In addition, we present FixAME, a tool capable of helping curate assembled sequences from a metagenome, a single genome, a set of bins, phages, archaea, or any organism that contains single double-stranded DNA. |
id |
USP_0b2f7fa71f29a2b261a52a69bbb759f2 |
---|---|
oai_identifier_str |
oai:teses.usp.br:tde-19072022-153242 |
network_acronym_str |
USP |
network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
repository_id_str |
2721 |
spelling |
info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/doctoralThesis; Recovery of high-quality MAGs in time-serial metagenomic data and FixAME development: an assisting curation tool of genomic assembly error; Recuperação de MAGs de alta qualidade em dados metagenômicos tempo seriado e desenvolvimento de FixAME: uma ferramenta auxiliar de curadoria de erros de montagem genômica; 2022-04-14; João Carlos Setubal; Nalvo Franco de Almeida Junior; Christian Hoffmann; Alessandro de Mello Varani; Livia Maria Silva Moura; Universidade de São Paulo; Bioinformática; USP; BR; Bioinformatics; Curation; Genome; Metagenome; Metagenomics; Microbiology; Tool; https://doi.org/10.11606/T.95.2022.tde-19072022-153242; info:eu-repo/semantics/openAccess; eng; reponame:Biblioteca Digital de Teses e Dissertações da USP; instname:Universidade de São Paulo (USP); instacron:USP; 2023-12-21T19:25:20Z; oai:teses.usp.br:tde-19072022-153242; Biblioteca Digital de Teses e Dissertações; http://www.teses.usp.br/; PUB; http://www.teses.usp.br/cgi-bin/mtd2br.pl; virginia@if.usp.br; atendimento@aguia.usp.br; opendoar:2721; 2023-12-22T12:55:19.212693; Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP); false |
dc.title.en.fl_str_mv |
Recovery of high-quality MAGs in time-serial metagenomic data and FixAME development: an assisting curation tool of genomic assembly error |
dc.title.alternative.pt.fl_str_mv |
Recuperação de MAGs de alta qualidade em dados metagenômicos tempo seriado e desenvolvimento de FixAME: uma ferramenta auxiliar de curadoria de erros de montagem genômica |
author |
Livia Maria Silva Moura |
author_facet |
Livia Maria Silva Moura |
author_role |
author |
dc.contributor.advisor1.fl_str_mv |
João Carlos Setubal |
dc.contributor.referee1.fl_str_mv |
Nalvo Franco de Almeida Junior |
dc.contributor.referee2.fl_str_mv |
Christian Hoffmann |
dc.contributor.referee3.fl_str_mv |
Alessandro de Mello Varani |
dc.contributor.author.fl_str_mv |
Livia Maria Silva Moura |
contributor_str_mv |
João Carlos Setubal Nalvo Franco de Almeida Junior Christian Hoffmann Alessandro de Mello Varani |
publishDate |
2022 |
dc.date.issued.fl_str_mv |
2022-04-14 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://doi.org/10.11606/T.95.2022.tde-19072022-153242 |
url |
https://doi.org/10.11606/T.95.2022.tde-19072022-153242 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Universidade de São Paulo |
dc.publisher.program.fl_str_mv |
Bioinformática |
dc.publisher.initials.fl_str_mv |
USP |
dc.publisher.country.fl_str_mv |
BR |
publisher.none.fl_str_mv |
Universidade de São Paulo |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
instname_str |
Universidade de São Paulo (USP) |
instacron_str |
USP |
institution |
USP |
reponame_str |
Biblioteca Digital de Teses e Dissertações da USP |
collection |
Biblioteca Digital de Teses e Dissertações da USP |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
repository.mail.fl_str_mv |
virginia@if.usp.br; atendimento@aguia.usp.br |
_version_ |
1794502865507581952 |