Optimizing Data Selection for Contact Prediction in Proteins
Main author: | Fial, Guilherme José Gago |
---|---|
Publication date: | 2019 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10362/91154 |
Abstract: | Proteins are essential to life across all organisms. They act as enzymes, antibodies, molecular transporters, and structural elements, among other important roles. Their ability to interact selectively with specific molecules is what makes them important. Understanding these interactions can provide many advantages in fields such as drug design and metabolic engineering. Current methods of predicting protein interaction attempt to geometrically fit the structures of two proteins together by generating a large number of potential configurations and then discriminating the correct pose from the remaining ones. Given the large search space, approaches to reduce the complexity are often employed. Identifying a contact point between the pairing proteins is a good constraining factor. If at least one contact can be predicted among a small set of possibilities (e.g. 100), the search space will be significantly reduced. Using structural and evolutionary information of the interacting proteins, a machine learning predictor can be developed for this task. Such evolutionary measures are computed over a substantial number of homologous sequences, which can be filtered and ordered in many different ways. This work therefore developed a machine learning solution focused on measuring the effects that different homolog arrangements can have on the final prediction. |
id |
RCAP_7cec65f0339dfc12ab2c4a96f9a70a07 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/91154 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
author |
Fial, Guilherme José Gago |
dc.contributor.none.fl_str_mv |
Krippahl, Ludwig RUN |
dc.subject.por.fl_str_mv |
Contact prediction; Machine learning; Bioinformatics; Protein-Protein Interactions; Domínio/Área Científica::Engenharia e Tecnologia::Engenharia dos Materiais |
publishDate |
2019 |
dc.date.none.fl_str_mv |
2019 2019 2019-01-01T00:00:00Z 2020-01-14T10:48:56Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/91154 |
dc.language.iso.fl_str_mv |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799137989612273665 |