Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes

Bibliographic details
Main author: Rocha, Gregório Kappaun
Publication date: 2015
Document type: Thesis
Language: por
Source title: Biblioteca Digital de Teses e Dissertações do LNCC
Full text: https://tede.lncc.br/handle/tede/221
Abstract: The protein structure prediction (PSP) problem consists of discovering the native three-dimensional arrangement of a protein molecule from the information stored in its amino acid sequence. Knowing the 3D structure of a protein yields crucial information about its function, since a protein's function is intrinsically related to its native three-dimensional structure. Experimental structure determination presents technical difficulties and is costly in both workload and time, so investment in computational methods for PSP is pressing. The main objective of this thesis is to increase the predictive ability of the GAPF protein structure prediction program and to contribute to the advancement of theories and methodologies in template-free (free-modeling) prediction. Efforts are directed on two fronts: (i) improving the modeling of the energy function through the development and implementation of new potentials; and (ii) enhancing the conformational search through the development and implementation of a multiobjective genetic algorithm. For the modeling of the problem, new ad hoc potentials were added to the cost function to handle hydrophobic compaction and hydrogen bonds, key components of protein folding. For the conformational search, a multiobjective steady-state genetic algorithm with phenotypic crowding was proposed. The new methodology was evaluated on a test set of 46 proteins from all classes and compared with established methods in the literature, such as QUARK. The contributions of this thesis provided a major advance in GAPF's predictive power, increasing the quality of the models and making it feasible to tackle longer sequences. The advances were most notable in beta-sheet prediction, mainly owing to the inclusion of the hydrogen-bonding potentials. Useful tools were also made available for the future development of the program, and GAPF was positioned as a good candidate for free-modeling predictions alongside prominent methodologies in the area.
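The abstract names a multiobjective steady-state genetic algorithm with phenotypic crowding but does not describe its details. The sketch below only illustrates the general technique on a toy problem and is not GAPF's implementation. Everything in it is an assumption made for illustration: the real-valued genome, the two toy objectives standing in for energy terms, the Euclidean phenotypic distance, and the function names. Each step creates one offspring, finds the most similar individual in the population, and replaces it only if the offspring Pareto-dominates it.

```python
import random


def objectives(x):
    # Two toy objectives to minimize (hypothetical stand-ins for energy terms).
    f1 = sum(v * v for v in x)
    f2 = sum((v - 2.0) ** 2 for v in x)
    return (f1, f2)


def dominates(a, b):
    # Pareto dominance for minimization: no worse everywhere, better somewhere.
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


def distance(x, y):
    # Assumed phenotypic distance: Euclidean distance between decoded vectors.
    return sum((p - q) ** 2 for p, q in zip(x, y)) ** 0.5


def steady_state_step(pop, rng, sigma=0.3):
    # Steady-state: one offspring per step, no generational replacement.
    # Genomes need at least two genes for the one-point crossover below.
    p1, p2 = rng.sample(pop, 2)
    cut = rng.randrange(1, len(p1))
    child = [v + rng.gauss(0.0, sigma) for v in p1[:cut] + p2[cut:]]
    # Crowding: the offspring competes with its most similar individual.
    nearest = min(range(len(pop)), key=lambda i: distance(pop[i], child))
    if dominates(objectives(child), objectives(pop[nearest])):
        pop[nearest] = child
        return True
    return False


def evolve(pop, steps, rng):
    # Runs the given number of steady-state steps in place.
    for _ in range(steps):
        steady_state_step(pop, rng)
    return pop
```

Because an individual is only ever replaced by an offspring that dominates it, each population slot improves monotonically in both objectives, while competing against the nearest neighbor helps keep distinct regions of the search space populated.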
id LNCC_512778961cc42eab8cabf943ff7426b7
oai_identifier_str oai:tede-server.lncc.br:tede/221
network_acronym_str LNCC
network_name_str Biblioteca Digital de Teses e Dissertações do LNCC
repository_id_str
spelling Submitted by Maria Cristina (library@lncc.br) on 2015-10-13T18:53:31Z. No. of bitstreams: 1. Tese_Gregorio_LNCC_Set_2015_FINAL.pdf: 24967973 bytes, checksum: 0efd2d2481063521b74d53264c4be5bb (MD5)
Approved for entry into archive by Maria Cristina (library@lncc.br) on 2015-10-13T18:53:44Z (GMT). No. of bitstreams: 1. Tese_Gregorio_LNCC_Set_2015_FINAL.pdf: 24967973 bytes, checksum: 0efd2d2481063521b74d53264c4be5bb (MD5)
Made available in DSpace on 2015-10-13T18:53:59Z (GMT). No. of bitstreams: 1. Tese_Gregorio_LNCC_Set_2015_FINAL.pdf: 24967973 bytes, checksum: 0efd2d2481063521b74d53264c4be5bb (MD5). Previous issue date: 2015-09-17
Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro
http://tede-server.lncc.br:8080/retrieve/502/Tese_Gregorio_LNCC_Set_2015_FINAL.pdf.jpg
LICENSE: license.txt, text/plain; charset=utf-8, 2165 bytes
ORIGINAL: Tese_Gregorio_LNCC_Set_2015_FINAL.pdf, application/pdf, 24967973 bytes
THUMBNAIL: Tese_Gregorio_LNCC_Set_2015_FINAL.pdf.jpg, image/jpeg, 3383 bytes
tede/221
2023-06-02 11:40:47.468
https://tede.lncc.br/
PUB
https://tede.lncc.br/oai/request
opendoar:2023-06-02T14:40:47
dc.title.por.fl_str_mv Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes
dc.title.alternative.eng.fl_str_mv Development of free-modeling methodologies for protein structure prediction
title Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes
spellingShingle Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes
Rocha, Gregório Kappaun
Biologia computacional
Predição de estruturas de proteinas
Modelagem molecular
Computational biology
Prediction of protein structure
Molecular modeling
CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO::COMPUTABILIDADE E MODELOS DE COMPUTACAO
title_short Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes
title_full Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes
title_fullStr Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes
title_full_unstemmed Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes
title_sort Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes
author Rocha, Gregório Kappaun
author_facet Rocha, Gregório Kappaun
author_role author
dc.contributor.advisor1.fl_str_mv Dardenne, Laurent Emmanuel
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/8344194525615133
dc.contributor.advisor2.fl_str_mv Custódio, Fábio Lima
dc.contributor.referee1.fl_str_mv Nicolas, Marisa Fabiana
dc.contributor.referee1Lattes.fl_str_mv http://lattes.cnpq.br/0717161560405537
dc.contributor.referee2.fl_str_mv Bisch, Paulo Mascarello
dc.contributor.referee3.fl_str_mv Araújo, Antonio Francisco Pereira
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/7690535205003366
dc.contributor.author.fl_str_mv Rocha, Gregório Kappaun
contributor_str_mv Dardenne, Laurent Emmanuel
Custódio, Fábio Lima
Nicolas, Marisa Fabiana
Bisch, Paulo Mascarello
Araújo, Antonio Francisco Pereira
dc.subject.por.fl_str_mv Biologia computacional
Predição de estruturas de proteinas
Modelagem molecular
Computational biology
Prediction of protein structure
Molecular modeling
topic Biologia computacional
Predição de estruturas de proteinas
Modelagem molecular
Computational biology
Prediction of protein structure
Molecular modeling
CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO::COMPUTABILIDADE E MODELOS DE COMPUTACAO
dc.subject.cnpq.fl_str_mv CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO::COMPUTABILIDADE E MODELOS DE COMPUTACAO
description The protein structure prediction (PSP) problem consists of discovering the native three-dimensional arrangement of a protein molecule from the information stored in its amino acid sequence. Knowing the 3D structure of a protein yields crucial information about its function, since a protein's function is intrinsically related to its native three-dimensional structure. Experimental structure determination presents technical difficulties and is costly in both workload and time, so investment in computational methods for PSP is pressing. The main objective of this thesis is to increase the predictive ability of the GAPF protein structure prediction program and to contribute to the advancement of theories and methodologies in template-free (free-modeling) prediction. Efforts are directed on two fronts: (i) improving the modeling of the energy function through the development and implementation of new potentials; and (ii) enhancing the conformational search through the development and implementation of a multiobjective genetic algorithm. For the modeling of the problem, new ad hoc potentials were added to the cost function to handle hydrophobic compaction and hydrogen bonds, key components of protein folding. For the conformational search, a multiobjective steady-state genetic algorithm with phenotypic crowding was proposed. The new methodology was evaluated on a test set of 46 proteins from all classes and compared with established methods in the literature, such as QUARK. The contributions of this thesis provided a major advance in GAPF's predictive power, increasing the quality of the models and making it feasible to tackle longer sequences. The advances were most notable in beta-sheet prediction, mainly owing to the inclusion of the hydrogen-bonding potentials. Useful tools were also made available for the future development of the program, and GAPF was positioned as a good candidate for free-modeling predictions alongside prominent methodologies in the area.
publishDate 2015
dc.date.accessioned.fl_str_mv 2015-10-13T18:53:59Z
dc.date.issued.fl_str_mv 2015-09-02
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv ROCHA, G. K. Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes, 2015. xxx,170 f. Tese (Programa de Pós-Graduação em Modelagem Computacional) - Laboratório Nacional de Computação Científica, Petrópolis, 2015.
dc.identifier.uri.fl_str_mv https://tede.lncc.br/handle/tede/221
identifier_str_mv ROCHA, G. K. Desenvolvimento de metodologias para predição de estruturas de proteínas independente de moldes, 2015. xxx,170 f. Tese (Programa de Pós-Graduação em Modelagem Computacional) - Laboratório Nacional de Computação Científica, Petrópolis, 2015.
url https://tede.lncc.br/handle/tede/221
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Laboratório Nacional de Computação Científica
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Modelagem Computacional
dc.publisher.initials.fl_str_mv LNCC
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv Coordenação de Pós-Graduação e Aperfeiçoamento (COPGA)
publisher.none.fl_str_mv Laboratório Nacional de Computação Científica
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações do LNCC
instname:Laboratório Nacional de Computação Científica (LNCC)
instacron:LNCC
instname_str Laboratório Nacional de Computação Científica (LNCC)
instacron_str LNCC
institution LNCC
reponame_str Biblioteca Digital de Teses e Dissertações do LNCC
collection Biblioteca Digital de Teses e Dissertações do LNCC
bitstream.url.fl_str_mv http://tede-server.lncc.br:8080/tede/bitstream/tede/221/1/license.txt
http://tede-server.lncc.br:8080/tede/bitstream/tede/221/2/Tese_Gregorio_LNCC_Set_2015_FINAL.pdf
http://tede-server.lncc.br:8080/tede/bitstream/tede/221/3/Tese_Gregorio_LNCC_Set_2015_FINAL.pdf.jpg
bitstream.checksum.fl_str_mv bd3efa91386c1718a7f26a329fdcb468
0efd2d2481063521b74d53264c4be5bb
58eb246610ffae8f89886eec54cec8ec
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
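The MD5 values listed above, in the same order as the bitstream URLs, can be used to check a downloaded copy of a bitstream. The following is a minimal sketch, assuming the file has already been saved locally. The function names and the local path in the usage note are hypothetical and not part of the record.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    # Hash the file in chunks so a large PDF is never loaded whole into memory.
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_hex):
    # Compare against the checksum string taken from the record.
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_record("Tese_Gregorio_LNCC_Set_2015_FINAL.pdf", "0efd2d2481063521b74d53264c4be5bb")` should return True for an intact copy of the thesis PDF saved under that (assumed) local name.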
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações do LNCC - Laboratório Nacional de Computação Científica (LNCC)
repository.mail.fl_str_mv library@lncc.br||library@lncc.br
_version_ 1797683218664652800