Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes
Main author: | Rocha, Gregório Kappaun |
---|---|
Publication date: | 2015 |
Document type: | Thesis |
Language: | por |
Source title: | Biblioteca Digital de Teses e Dissertações do LNCC |
Full text: | https://tede.lncc.br/handle/tede/221 |
Abstract: | The protein structure prediction (PSP) problem consists of discovering the native three-dimensional arrangement of a protein molecule from the information stored in its amino acid sequence. Knowing the 3D structure of a protein yields crucial information about its function, since the function of a protein is intrinsically related to its native three-dimensional structure. Experimental determination of protein structure presents technical difficulties and is costly in workload and time, so investment in computational methods for PSP becomes pressing. The main objective of this thesis is to increase the predictive ability of the GAPF protein structure prediction program and to contribute to the advancement of theories and methodologies in free-modeling (template-independent) prediction. Efforts are directed on two fronts: (i) improving the modeling of the energy function through the development and implementation of new potentials; and (ii) improving the conformational search through the development and implementation of a multi-objective genetic algorithm. For the modeling of the problem, new ad hoc potentials that deal with hydrophobic compaction and hydrogen bonds, key components of protein folding, were added to the cost function. For the conformational search, a multi-objective steady-state genetic algorithm with phenotypic crowding was proposed. The new methodology was evaluated on a test set of 46 proteins of all classes and compared with established methods in the literature, such as QUARK. The contributions of this thesis provided a major advance in GAPF's predictive power, increasing the quality of the models and making it feasible to tackle longer sequences. The advances were most notable in the prediction of beta-sheets, mainly owing to the inclusion of the hydrogen-bonding potentials. The work also made available useful tools for the future development of the program and positioned GAPF as a good candidate for free-modeling predictions compared with prominent methodologies in the area. |
id | LNCC_512778961cc42eab8cabf943ff7426b7 |
---|---|
oai_identifier_str | oai:tede-server.lncc.br:tede/221 |
network_acronym_str | LNCC |
network_name_str | Biblioteca Digital de Teses e Dissertações do LNCC |
repository_id_str | |
spelling | Dardenne, Laurent Emmanuel; http://lattes.cnpq.br/8344194525615133; Custódio, Fábio Lima; Nicolas, Marisa Fabiana; http://lattes.cnpq.br/0717161560405537; Bisch, Paulo Mascarello; Araújo, Antonio Francisco Pereira; http://lattes.cnpq.br/7690535205003366; Rocha, Gregório Kappaun; 2015-10-13T18:53:59Z; 2015-09-02; ROCHA, G. K. Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes, 2015. xxx,170 f. Tese (Programa de Pós-Graduação em Modelagem Computacional) - Laboratório Nacional de Computação Científica, Petrópolis, 2015.; https://tede.lncc.br/handle/tede/221; Submitted by Maria Cristina (library@lncc.br) on 2015-10-13T18:53:31Z. No. of bitstreams: 1. Tese_Gregorio_LNCC_Set_2015_FINAL.pdf: 24967973 bytes, checksum: 0efd2d2481063521b74d53264c4be5bb (MD5); Approved for entry into archive by Maria Cristina (library@lncc.br) on 2015-10-13T18:53:44Z (GMT); Made available in DSpace on 2015-10-13T18:53:59Z (GMT); Previous issue date: 2015-09-17; Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro; application/pdf; http://tede-server.lncc.br:8080/retrieve/502/Tese_Gregorio_LNCC_Set_2015_FINAL.pdf.jpg; por; Laboratório Nacional de Computação Científica; Programa de Pós-Graduação em Modelagem Computacional; LNCC; Brasil; Coordenação de Pós-Graduação e Aperfeiçoamento (COPGA); Biologia computacional; Predição de estruturas de proteinas; Modelagem molecular; Computational biology; Prediction of protein structure; Molecular modeling; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO::COMPUTABILIDADE E MODELOS DE COMPUTACAO; Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes; Development of free-modeling methodologies for protein structure prediction; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/doctoralThesis; info:eu-repo/semantics/openAccess; reponame:Biblioteca Digital de Teses e Dissertações do LNCC; instname:Laboratório Nacional de Computação Científica (LNCC); instacron:LNCC; LICENSE: license.txt, text/plain; charset=utf-8, 2165 bytes, http://tede-server.lncc.br:8080/tede/bitstream/tede/221/1/license.txt, bd3efa91386c1718a7f26a329fdcb468 (MD5), sequence 1; ORIGINAL: Tese_Gregorio_LNCC_Set_2015_FINAL.pdf, application/pdf, 24967973 bytes, http://tede-server.lncc.br:8080/tede/bitstream/tede/221/2/Tese_Gregorio_LNCC_Set_2015_FINAL.pdf, 0efd2d2481063521b74d53264c4be5bb (MD5), sequence 2; THUMBNAIL: Tese_Gregorio_LNCC_Set_2015_FINAL.pdf.jpg, image/jpeg, 3383 bytes, http://tede-server.lncc.br:8080/tede/bitstream/tede/221/3/Tese_Gregorio_LNCC_Set_2015_FINAL.pdf.jpg, 58eb246610ffae8f89886eec54cec8ec (MD5), sequence 3; tede/221; 2023-06-02 11:40:47.468; oai:tede-server.lncc.br:tede/221; Biblioteca Digital de Teses e Dissertações; https://tede.lncc.br/; PUB; https://tede.lncc.br/oai/request; library@lncc.br; opendoar:2023-06-02T14:40:47; Biblioteca Digital de Teses e Dissertações do LNCC - Laboratório Nacional de Computação Científica (LNCC); false |
dc.title.por.fl_str_mv | Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes |
dc.title.alternative.eng.fl_str_mv | Development of free-modeling methodologies for protein structure prediction |
title | Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes |
spellingShingle | Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes; Rocha, Gregório Kappaun; Biologia computacional; Predição de estruturas de proteinas; Modelagem molecular; Computational biology; Prediction of protein structure; Molecular modeling; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO::COMPUTABILIDADE E MODELOS DE COMPUTACAO |
title_short | Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes |
title_full | Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes |
title_fullStr | Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes |
title_full_unstemmed | Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes |
title_sort | Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes |
author | Rocha, Gregório Kappaun |
author_facet | Rocha, Gregório Kappaun |
author_role | author |
dc.contributor.advisor1.fl_str_mv | Dardenne, Laurent Emmanuel |
dc.contributor.advisor1Lattes.fl_str_mv | http://lattes.cnpq.br/8344194525615133 |
dc.contributor.advisor2.fl_str_mv | Custódio, Fábio Lima |
dc.contributor.referee1.fl_str_mv | Nicolas, Marisa Fabiana |
dc.contributor.referee1Lattes.fl_str_mv | http://lattes.cnpq.br/0717161560405537 |
dc.contributor.referee2.fl_str_mv | Bisch, Paulo Mascarello |
dc.contributor.referee3.fl_str_mv | Araújo, Antonio Francisco Pereira |
dc.contributor.authorLattes.fl_str_mv | http://lattes.cnpq.br/7690535205003366 |
dc.contributor.author.fl_str_mv | Rocha, Gregório Kappaun |
contributor_str_mv | Dardenne, Laurent Emmanuel; Custódio, Fábio Lima; Nicolas, Marisa Fabiana; Bisch, Paulo Mascarello; Araújo, Antonio Francisco Pereira |
dc.subject.por.fl_str_mv | Biologia computacional; Predição de estruturas de proteinas; Modelagem molecular; Computational biology; Prediction of protein structure; Molecular modeling |
topic | Biologia computacional; Predição de estruturas de proteinas; Modelagem molecular; Computational biology; Prediction of protein structure; Molecular modeling; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO::COMPUTABILIDADE E MODELOS DE COMPUTACAO |
dc.subject.cnpq.fl_str_mv | CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO::COMPUTABILIDADE E MODELOS DE COMPUTACAO |
description | The protein structure prediction (PSP) problem consists of discovering the native three-dimensional arrangement of a protein molecule from the information stored in its amino acid sequence. Knowing the 3D structure of a protein yields crucial information about its function, since the function of a protein is intrinsically related to its native three-dimensional structure. Experimental determination of protein structure presents technical difficulties and is costly in workload and time, so investment in computational methods for PSP becomes pressing. The main objective of this thesis is to increase the predictive ability of the GAPF protein structure prediction program and to contribute to the advancement of theories and methodologies in free-modeling (template-independent) prediction. Efforts are directed on two fronts: (i) improving the modeling of the energy function through the development and implementation of new potentials; and (ii) improving the conformational search through the development and implementation of a multi-objective genetic algorithm. For the modeling of the problem, new ad hoc potentials that deal with hydrophobic compaction and hydrogen bonds, key components of protein folding, were added to the cost function. For the conformational search, a multi-objective steady-state genetic algorithm with phenotypic crowding was proposed. The new methodology was evaluated on a test set of 46 proteins of all classes and compared with established methods in the literature, such as QUARK. The contributions of this thesis provided a major advance in GAPF's predictive power, increasing the quality of the models and making it feasible to tackle longer sequences. The advances were most notable in the prediction of beta-sheets, mainly owing to the inclusion of the hydrogen-bonding potentials. The work also made available useful tools for the future development of the program and positioned GAPF as a good candidate for free-modeling predictions compared with prominent methodologies in the area. |
publishDate | 2015 |
dc.date.accessioned.fl_str_mv | 2015-10-13T18:53:59Z |
dc.date.issued.fl_str_mv | 2015-09-02 |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/doctoralThesis |
format | doctoralThesis |
status_str | publishedVersion |
dc.identifier.citation.fl_str_mv | ROCHA, G. K. Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes, 2015. xxx,170 f. Tese (Programa de Pós-Graduação em Modelagem Computacional) - Laboratório Nacional de Computação Científica, Petrópolis, 2015. |
dc.identifier.uri.fl_str_mv | https://tede.lncc.br/handle/tede/221 |
identifier_str_mv | ROCHA, G. K. Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes, 2015. xxx,170 f. Tese (Programa de Pós-Graduação em Modelagem Computacional) - Laboratório Nacional de Computação Científica, Petrópolis, 2015. |
url | https://tede.lncc.br/handle/tede/221 |
dc.language.iso.fl_str_mv | por |
language | por |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.publisher.none.fl_str_mv | Laboratório Nacional de Computação Científica |
dc.publisher.program.fl_str_mv | Programa de Pós-Graduação em Modelagem Computacional |
dc.publisher.initials.fl_str_mv | LNCC |
dc.publisher.country.fl_str_mv | Brasil |
dc.publisher.department.fl_str_mv | Coordenação de Pós-Graduação e Aperfeiçoamento (COPGA) |
publisher.none.fl_str_mv | Laboratório Nacional de Computação Científica |
dc.source.none.fl_str_mv | reponame:Biblioteca Digital de Teses e Dissertações do LNCC instname:Laboratório Nacional de Computação Científica (LNCC) instacron:LNCC |
instname_str | Laboratório Nacional de Computação Científica (LNCC) |
instacron_str | LNCC |
institution | LNCC |
reponame_str | Biblioteca Digital de Teses e Dissertações do LNCC |
collection | Biblioteca Digital de Teses e Dissertações do LNCC |
bitstream.url.fl_str_mv | http://tede-server.lncc.br:8080/tede/bitstream/tede/221/1/license.txt http://tede-server.lncc.br:8080/tede/bitstream/tede/221/2/Tese_Gregorio_LNCC_Set_2015_FINAL.pdf http://tede-server.lncc.br:8080/tede/bitstream/tede/221/3/Tese_Gregorio_LNCC_Set_2015_FINAL.pdf.jpg |
bitstream.checksum.fl_str_mv | bd3efa91386c1718a7f26a329fdcb468 0efd2d2481063521b74d53264c4be5bb 58eb246610ffae8f89886eec54cec8ec |
bitstream.checksumAlgorithm.fl_str_mv | MD5 MD5 MD5 |
repository.name.fl_str_mv | Biblioteca Digital de Teses e Dissertações do LNCC - Laboratório Nacional de Computação Científica (LNCC) |
repository.mail.fl_str_mv | library@lncc.br||library@lncc.br |
_version_ | 1797683218664652800 |