Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides
Main author: | Igor Andrade Figueiredo de Souza |
---|---|
Publication date: | 2020 |
Document type: | Master's thesis |
Language: | eng |
Source title: | Repositório Institucional da UFMG |
Full text: | http://hdl.handle.net/1843/41503 |
Abstract: | The emergence and development of antimicrobial resistance (AMR) to conventional antimicrobials are major concerns for modern medicine and food security, leading both the pharmaceutical and food industries to renew their interest in discovering microbial natural products with antimicrobial properties. With the advent of massive amounts of freely available metagenomic data, containing sequences from cultured and uncultured species, the field of natural product discovery is being transformed, and new strategies such as peptide mining are being developed. Here, we present the first pattern-matching tool created to prospect lasso peptides directly from short reads of metagenomic data, a type of input that constrains the classical genome mining approach based on sequence similarity. The tool takes FASTQ or FASTA files and query patterns as input. A control test was performed on a mock community containing 27 genomes from known lasso peptide producers and 9 negative genomes. The query patterns were designed from 35 lasso peptide sequences associated with the 27 genomes. The sequences were randomly divided into a training group of 21 sequences and a testing group of 14 sequences. For the training group, a distance matrix was analyzed by multidimensional scaling, and the resulting plot separated the sequences into three groups. For each group, a consensus sequence was obtained, and a query pattern was used to screen for potential lasso peptides in rumen metagenomic data, yielding 3 new candidate peptides. To validate the potential new lasso peptides, a user-friendly E. coli expression vector containing optimized genes for high-yield production of lasso peptides was designed and synthesized. The vector produced the lasso peptide microcin J25, which inhibited both Gram-negative and Gram-positive bacteria. The study demonstrates the potential of accessing data from whole communities, which are a potentially unique resource for novel antimicrobial peptides. |
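
The abstract does not say how the tool matches patterns against reads, so the following is only a minimal sketch of one way such a screen could work. It assumes each read is translated in all six reading frames and scanned with a regular expression. The function names and the query pattern `G.{6}[DE]` are hypothetical placeholders chosen for illustration; they are not taken from the thesis.

```python
import re
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate DNA codon by codon; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(read):
    """Yield (strand, offset, protein) for all six reading frames of a read."""
    read = read.upper()
    for strand, seq in (("+", read), ("-", reverse_complement(read))):
        for offset in range(3):
            yield strand, offset, translate(seq[offset:])


def parse_reads(lines):
    """Yield (name, sequence) from FASTA lines or unwrapped 4-line FASTQ lines."""
    lines = [ln.strip() for ln in lines if ln.strip()]
    if not lines:
        return
    if lines[0].startswith("@"):  # FASTQ: header, sequence, '+', quality
        for i in range(0, len(lines) - 3, 4):
            yield lines[i][1:], lines[i + 1]
    else:  # FASTA: '>' header followed by one or more sequence lines
        name, chunks = None, []
        for ln in lines:
            if ln.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = ln[1:], []
            else:
                chunks.append(ln)
        if name is not None:
            yield name, "".join(chunks)


def screen_reads(lines, pattern):
    """Return (read name, strand, frame offset, matched peptide) for each hit."""
    regex = re.compile(pattern)
    hits = []
    for name, seq in parse_reads(lines):
        for strand, offset, protein in six_frames(seq):
            for match in regex.finditer(protein):
                hits.append((name, strand, offset, match.group()))
    return hits


if __name__ == "__main__":
    # Toy FASTQ record; the pattern is a hypothetical placeholder.
    demo = ["@read1", "GGTGCTGCTGCTGCTGCTGCTGAT", "+", "I" * 24]
    print(screen_reads(demo, "G.{6}[DE]"))
```

Matching on translated reads rather than assembled contigs is what lets a screen of this kind run directly on short reads. A real tool would also need to handle wrapped FASTQ, ambiguous bases, and peptides that span read boundaries.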
id |
UFMG_e706b7a562343a9082450edc6058cad9 |
---|---|
oai_identifier_str |
oai:repositorio.ufmg.br:1843/41503 |
network_acronym_str |
UFMG |
network_name_str |
Repositório Institucional da UFMG |
repository_id_str |
|
spelling |
Tiago Antônio de Oliveira Mendes; http://lattes.cnpq.br/6003001459902104; Hilário Cuquetto Mantovani; Raquel Cardoso de Melo Minardi; Sabrina de Azevedo Silveira; Danielle Biscaro Pedrolli; http://lattes.cnpq.br/3463385819845841; Igor Andrade Figueiredo de Souza; 2022-05-10T14:59:04Z; 2022-05-10T14:59:04Z; 2020-10-30; http://hdl.handle.net/1843/41503; CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico; FAPEMIG - Fundação de Amparo à Pesquisa do Estado de Minas Gerais; CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior; Other agency; eng; Universidade Federal de Minas Gerais; Programa de Pós-Graduação em Bioinformática; UFMG; Brazil; ICB - INSTITUTO DE CIÊNCIAS BIOLÓGICAS; Computational Biology; Anti-Infective Agents; Genomics; Bioprospecting; Peptides; antimicrobials; genomics; bioprospecting; computational screening; lasso peptides; Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; info:eu-repo/semantics/openAccess; reponame:Repositório Institucional da UFMG; instname:Universidade Federal de Minas Gerais (UFMG); instacron:UFMG; ORIGINAL; versao_final_repositorio.pdf; versao_final_repositorio.pdf; application/pdf; 1541824; https://repositorio.ufmg.br/bitstream/1843/41503/1/versao_final_repositorio.pdf; 0e1ffcb6cc15aac55ccc727680a32729; MD5; 1; LICENSE; license.txt; license.txt; text/plain; charset=utf-8; 2118; https://repositorio.ufmg.br/bitstream/1843/41503/2/license.txt; cda590c95a0b51b4d15f60c9642ca272; MD5; 2; 1843/41503; 2022-05-10 11:59:04.963; oai:repositorio.ufmg.br:1843/41503; Repositório de Publicações; PUB; https://repositorio.ufmg.br/oai; opendoar:; 2022-05-10T14:59:04; Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG); false |
dc.title.pt_BR.fl_str_mv |
Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides |
title |
Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides |
spellingShingle |
Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides; Igor Andrade Figueiredo de Souza; antimicrobials; genomics; bioprospecting; computational screening; lasso peptides; Computational Biology; Anti-Infective Agents; Genomics; Bioprospecting; Peptides |
title_short |
Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides |
title_full |
Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides |
title_fullStr |
Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides |
title_full_unstemmed |
Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides |
title_sort |
Development of a computational tool to screen antimicrobial peptides in metagenomics data associated to a new vector for high-scale production of lasso peptides |
author |
Igor Andrade Figueiredo de Souza |
author_facet |
Igor Andrade Figueiredo de Souza |
author_role |
author |
dc.contributor.advisor1.fl_str_mv |
Tiago Antônio de Oliveira Mendes |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/6003001459902104 |
dc.contributor.advisor-co1.fl_str_mv |
Hilário Cuquetto Mantovani |
dc.contributor.referee1.fl_str_mv |
Raquel Cardoso de Melo Minardi |
dc.contributor.referee2.fl_str_mv |
Sabrina de Azevedo Silveira |
dc.contributor.referee3.fl_str_mv |
Danielle Biscaro Pedrolli |
dc.contributor.authorLattes.fl_str_mv |
http://lattes.cnpq.br/3463385819845841 |
dc.contributor.author.fl_str_mv |
Igor Andrade Figueiredo de Souza |
contributor_str_mv |
Tiago Antônio de Oliveira Mendes; Hilário Cuquetto Mantovani; Raquel Cardoso de Melo Minardi; Sabrina de Azevedo Silveira; Danielle Biscaro Pedrolli |
dc.subject.por.fl_str_mv |
antimicrobials; genomics; bioprospecting; computational screening; lasso peptides |
topic |
antimicrobials; genomics; bioprospecting; computational screening; lasso peptides; Computational Biology; Anti-Infective Agents; Genomics; Bioprospecting; Peptides |
dc.subject.other.pt_BR.fl_str_mv |
Computational Biology; Anti-Infective Agents; Genomics; Bioprospecting; Peptides |
description |
The emergence and development of antimicrobial resistance (AMR) to conventional antimicrobials are major concerns for modern medicine and food security, leading both the pharmaceutical and food industries to renew their interest in discovering microbial natural products with antimicrobial properties. With the advent of massive amounts of freely available metagenomic data, containing sequences from cultured and uncultured species, the field of natural product discovery is being transformed, and new strategies such as peptide mining are being developed. Here, we present the first pattern-matching tool created to prospect lasso peptides directly from short reads of metagenomic data, a type of input that constrains the classical genome mining approach based on sequence similarity. The tool takes FASTQ or FASTA files and query patterns as input. A control test was performed on a mock community containing 27 genomes from known lasso peptide producers and 9 negative genomes. The query patterns were designed from 35 lasso peptide sequences associated with the 27 genomes. The sequences were randomly divided into a training group of 21 sequences and a testing group of 14 sequences. For the training group, a distance matrix was analyzed by multidimensional scaling, and the resulting plot separated the sequences into three groups. For each group, a consensus sequence was obtained, and a query pattern was used to screen for potential lasso peptides in rumen metagenomic data, yielding 3 new candidate peptides. To validate the potential new lasso peptides, a user-friendly E. coli expression vector containing optimized genes for high-yield production of lasso peptides was designed and synthesized. The vector produced the lasso peptide microcin J25, which inhibited both Gram-negative and Gram-positive bacteria. The study demonstrates the potential of accessing data from whole communities, which are a potentially unique resource for novel antimicrobial peptides. |
publishDate |
2020 |
dc.date.issued.fl_str_mv |
2020-10-30 |
dc.date.accessioned.fl_str_mv |
2022-05-10T14:59:04Z |
dc.date.available.fl_str_mv |
2022-05-10T14:59:04Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/1843/41503 |
url |
http://hdl.handle.net/1843/41503 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Bioinformática |
dc.publisher.initials.fl_str_mv |
UFMG |
dc.publisher.country.fl_str_mv |
Brazil |
dc.publisher.department.fl_str_mv |
ICB - INSTITUTO DE CIÊNCIAS BIOLÓGICAS |
publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFMG instname:Universidade Federal de Minas Gerais (UFMG) instacron:UFMG |
instname_str |
Universidade Federal de Minas Gerais (UFMG) |
instacron_str |
UFMG |
institution |
UFMG |
reponame_str |
Repositório Institucional da UFMG |
collection |
Repositório Institucional da UFMG |
bitstream.url.fl_str_mv |
https://repositorio.ufmg.br/bitstream/1843/41503/1/versao_final_repositorio.pdf https://repositorio.ufmg.br/bitstream/1843/41503/2/license.txt |
bitstream.checksum.fl_str_mv |
0e1ffcb6cc15aac55ccc727680a32729 cda590c95a0b51b4d15f60c9642ca272 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG) |
repository.mail.fl_str_mv |
|
_version_ |
1803589402163675136 |