Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides

Bibliographic details
Main author: Igor Andrade Figueiredo de Souza
Publication date: 2020
Document type: Master's thesis
Language: eng
Source title: Repositório Institucional da UFMG
Full text: http://hdl.handle.net/1843/41503
Abstract: The emergence and development of antimicrobial resistance (AMR) against conventional antimicrobials are among the main concerns of modern medicine and food security, leading both the pharmaceutical and food industries to renew their interest in discovering microbial natural products with antimicrobial properties. With the advent of massive amounts of freely available metagenomic data, containing sequences from cultured and uncultured species, the field of natural product discovery is being transformed, and new strategies such as peptide mining are being developed. Here, we present the first pattern-matching tool created to prospect lasso peptides directly from short reads of metagenomic data, which are a constraint for the classical genome mining approach based on sequence similarity. The tool takes FASTQ or FASTA files and query patterns as input. A control test was performed on a mock community containing 27 genomes from known lasso peptide producers and 9 negative (non-producer) genomes. The query patterns were designed from 35 lasso peptide sequences associated with the 27 genomes. The sequences were randomly divided into a training group of 21 sequences and a testing group of 14 sequences. For the training group, a distance matrix was obtained and, using the multidimensional scaling technique, the sequences were separated into three groups. For each group, a consensus sequence was obtained, and a query pattern was used to screen for potential lasso peptides in rumen metagenomic data, yielding 3 potential new peptides. To validate the potential new lasso peptides, a user-friendly E. coli expression vector containing optimized genes for high-yield lasso peptide production was designed and synthesized. The vector was able to produce the lasso peptide microcin J25, which was able to inhibit both gram-negative and gram-positive bacteria. The study demonstrates the potential of accessing data from whole communities, a potentially unique resource for novel antimicrobial peptides.
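Illustrative note: the abstract describes searching short reads (FASTA or FASTQ) with query patterns. The record does not say how the thesis tool is implemented, so the following is only a minimal sketch of the general idea, assuming that reads are translated in six frames and scanned with a regular expression. The function names, the toy read, and the query pattern `G.{6}[DE]` are all hypothetical and are not taken from the thesis.

```python
import re
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def read_sequences(lines):
    """Yield (name, sequence) pairs from FASTA or four-line FASTQ text lines."""
    lines = [ln.strip() for ln in lines if ln.strip()]
    if not lines:
        return
    if lines[0].startswith("@"):
        # FASTQ: header, sequence, separator, qualities (assumed 4 lines each).
        for i in range(0, len(lines) - 3, 4):
            yield lines[i][1:], lines[i + 1].upper()
    else:
        # FASTA: a sequence may span several lines.
        name, chunks = None, []
        for ln in lines:
            if ln.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks).upper()
                name, chunks = ln[1:], []
            else:
                chunks.append(ln)
        if name is not None:
            yield name, "".join(chunks).upper()


def six_frames(dna):
    """Yield (strand, offset, protein) for the six reading frames of a read."""
    reverse = dna.translate(COMPLEMENT)[::-1]
    for strand, seq in (("+", dna), ("-", reverse)):
        for offset in range(3):
            codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
            # Codons with ambiguous bases (e.g. N) become "X".
            yield strand, offset, "".join(CODON_TABLE.get(c, "X") for c in codons)


def scan_reads(lines, pattern):
    """Return (read name, strand, frame offset, matched peptide) for each hit."""
    regex = re.compile(pattern)
    hits = []
    for name, dna in read_sequences(lines):
        for strand, offset, protein in six_frames(dna):
            for match in regex.finditer(protein):
                hits.append((name, strand, offset, match.group()))
    return hits


if __name__ == "__main__":
    # Toy read and toy pattern, both hypothetical.
    toy_read = [">read1", "ATGGGTGGTGCTGGTCATGTTCCTGAATATTTT"]
    for hit in scan_reads(toy_read, r"G.{6}[DE]"):
        print(hit)
```

Translating every read in all six frames is the simplest option because a short read carries no strand or frame information; a real tool would also need to handle patterns that are cut off at read boundaries.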
id UFMG_e706b7a562343a9082450edc6058cad9
oai_identifier_str oai:repositorio.ufmg.br:1843/41503
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
repository_id_str
spelling Funding: CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico; FAPEMIG - Fundação de Amparo à Pesquisa do Estado de Minas Gerais; CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior; other agency. OAI endpoint: https://repositorio.ufmg.br/oai
dc.title.pt_BR.fl_str_mv Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides
title Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides
spellingShingle Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides
Igor Andrade Figueiredo de Souza
antimicrobials
genomics
bioprospecting
computational screening
lasso peptides
Computational Biology
Anti-Infective Agents
Genomics
Bioprospecting
Peptides
title_short Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides
title_full Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides
title_fullStr Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides
title_full_unstemmed Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides
title_sort Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides
author Igor Andrade Figueiredo de Souza
author_facet Igor Andrade Figueiredo de Souza
author_role author
dc.contributor.advisor1.fl_str_mv Tiago Antônio de Oliveira Mendes
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/6003001459902104
dc.contributor.advisor-co1.fl_str_mv Hilário Cuquetto Mantovani
dc.contributor.referee1.fl_str_mv Raquel Cardoso de Melo Minardi
dc.contributor.referee2.fl_str_mv Sabrina de Azevedo Silveira
dc.contributor.referee3.fl_str_mv Danielle Biscaro Pedrolli
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/3463385819845841
dc.contributor.author.fl_str_mv Igor Andrade Figueiredo de Souza
contributor_str_mv Tiago Antônio de Oliveira Mendes
Hilário Cuquetto Mantovani
Raquel Cardoso de Melo Minardi
Sabrina de Azevedo Silveira
Danielle Biscaro Pedrolli
dc.subject.por.fl_str_mv antimicrobials
genomics
bioprospecting
computational screening
lasso peptides
topic antimicrobials
genomics
bioprospecting
computational screening
lasso peptides
Computational Biology
Anti-Infective Agents
Genomics
Bioprospecting
Peptides
dc.subject.other.pt_BR.fl_str_mv Computational Biology
Anti-Infective Agents
Genomics
Bioprospecting
Peptides
publishDate 2020
dc.date.issued.fl_str_mv 2020-10-30
dc.date.accessioned.fl_str_mv 2022-05-10T14:59:04Z
dc.date.available.fl_str_mv 2022-05-10T14:59:04Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1843/41503
url http://hdl.handle.net/1843/41503
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Bioinformática
dc.publisher.initials.fl_str_mv UFMG
dc.publisher.country.fl_str_mv Brazil
dc.publisher.department.fl_str_mv ICB - INSTITUTO DE CIÊNCIAS BIOLÓGICAS
publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
instname_str Universidade Federal de Minas Gerais (UFMG)
instacron_str UFMG
institution UFMG
reponame_str Repositório Institucional da UFMG
collection Repositório Institucional da UFMG
bitstream.url.fl_str_mv https://repositorio.ufmg.br/bitstream/1843/41503/1/versao_final_repositorio.pdf
https://repositorio.ufmg.br/bitstream/1843/41503/2/license.txt
bitstream.checksum.fl_str_mv 0e1ffcb6cc15aac55ccc727680a32729
cda590c95a0b51b4d15f60c9642ca272
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
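Illustrative note: the MD5 checksums listed above can be used to check that a downloaded copy of a bitstream matches the file held by the repository. The sketch below shows one way to do this with Python's standard `hashlib`; the function names and the local file name in the usage comment are assumptions, not part of the record.

```python
import hashlib


def md5_of_chunks(chunks):
    """Return the hexadecimal MD5 digest of an iterable of byte chunks."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def file_chunks(path, chunk_size=1 << 20):
    """Yield the contents of a file in fixed-size chunks to keep memory use low."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def matches_checksum(path, expected_md5):
    """Compare a local file's MD5 digest with the value from the record."""
    return md5_of_chunks(file_chunks(path)) == expected_md5.lower()


# Hypothetical usage, assuming the PDF was saved under this local name:
# matches_checksum("versao_final_repositorio.pdf", "0e1ffcb6cc15aac55ccc727680a32729")
```

MD5 is adequate here for detecting a corrupted or truncated download, but it is not a safeguard against deliberate tampering.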
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
repository.mail.fl_str_mv
_version_ 1803589402163675136