spelling |
Marcos Augusto dos Santos
http://lattes.cnpq.br/7251716819215153
http://lattes.cnpq.br/9779299719144051
Carmelina Figueiredo Vieira Leite
2020-08-01T20:55:59Z
2020-08-01T20:55:59Z
2016-08-29
http://hdl.handle.net/1843/33892

Advances in protein secondary structure prediction have a direct impact on health and on our knowledge of biological processes. Despite these achievements, predicting protein structure remains a challenge. We therefore propose a de novo method for the prediction of alpha helices. First, we built a list of proteins with low identity between them from the Protein Data Bank repository, using PISCES. Each protein was split into fragments of size 9 using the sliding window technique. The resulting fragments were classified into those that were 100% standard-type alpha helix and those that were not 100% of that type of secondary structure. Each fragment was then characterized with a sliding window of size 3, and each window had a value associated with the occurrence of the alpha helix structure. This made it possible to predict the alpha helix secondary structure group of an unknown (query) protein. To accomplish this, we used modified logistic regression and constructed two prediction methods for these structures. Accuracy and specificity tests applied to the methods gave results greater than 70%. Unfortunately, the sensitivity results were not good. One of the methods proved very promising for the secondary structure prediction problem, and possibly for use in other applications. All methods were implemented in MATLAB R2015b (2015).

Advances in the prediction of protein secondary structure directly affect health and the knowledge of biological processes. Despite the achievements and advances, predicting protein structure remains a challenge. In this work, we propose a de novo method for predicting alpha helices. First, we created a list of proteins with low identity among them from the Protein Data Bank database, using the PISCES tool. Each protein was split into fragments of size 9 using the sliding window technique. The fragments obtained were classified into those that are 100% standard-type alpha helix and those that are not 100% of this type of secondary structure. For each fragment, we used a sliding window of size 3 to characterize it. These triplets have a value associated with the occurrence of the α-helix structure. With this, it is possible to predict the secondary structure of an unknown protein. To do so, we used modified logistic regression and built two prediction methods. Tests of accuracy and specificity yielded results above 70%. Unfortunately, the sensitivity did not give a good result. One of the methods created proved promising, both for this problem and for other problems. All methods were implemented in MATLAB R2015b (2015).

CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico
FAPEMIG - Fundação de Amparo à Pesquisa do Estado de Minas Gerais
eng
Universidade Federal de Minas Gerais
Programa de Pós-Graduação em Bioinformatica
UFMG
Brasil
ICB - INSTITUTO DE CIÊNCIAS BIOLOGICAS
http://creativecommons.org/licenses/by-nc-nd/3.0/pt/
info:eu-repo/semantics/openAccess
Bioinformatics; Logistic models; Forecasts; Proteins
Logistic regression; Prediction; Protein; Structure
Prediction of alpha helices in proteins using Modified Logistic Regression Model
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/masterThesis
reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais
(UFMG)
instacron:UFMG
ORIGINAL: PPGBioinformatica_CarmelinaFigueiredoVieiraLeite_DissertacaoMESTRADO.pdf, application/pdf, 2691070, https://repositorio.ufmg.br/bitstream/1843/33892/1/PPGBioinformatica_CarmelinaFigueiredoVieiraLeite_DissertacaoMESTRADO.pdf, 3a0874965274058015a7dc51059016d0, MD5, 1
CC-LICENSE: license_rdf, application/rdf+xml; charset=utf-8, 811, https://repositorio.ufmg.br/bitstream/1843/33892/2/license_rdf, cfd6801dba008cb6adbd9838b81582ab, MD5, 2
LICENSE: license.txt, text/plain; charset=utf-8, 2119, https://repositorio.ufmg.br/bitstream/1843/33892/3/license.txt, 34badce4be7e31e3adb4575ae96af679, MD5, 3
1843/33892
2020-08-01 17:55:59.698
oai:repositorio.ufmg.br:1843/33892
Repositório Institucional
PUB
https://repositorio.ufmg.br/oai
opendoar:
2020-08-01T20:55:59
Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
false
|