3D Convolutional Neural Networks for Identifying Protein Interfaces
Main author: | Pascoal, Cláudio |
---|---|
Publication date: | 2021 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10362/123467 |
Abstract: | Protein interaction is a fundamental part of nearly all biochemical processes, and proteins have evolved specific surface regions for molecular recognition and interaction. These regions differ from the remaining surface in amino acid composition, geometry and chemical properties. Detecting protein interfaces can lead to a better understanding of protein interactions, benefiting fields such as drug design and metabolic engineering. Most existing interface predictors use structured data, that is, clearly defined data types usually obtained from data sets. However, proteins are very complex molecules, and no single property can distinguish the interface from the rest of the protein surface for all types of proteins. Deep learning is therefore a suitable approach, as it can capture features from unstructured data such as images, text, sensor data and volumes. The aim of this work was to identify interface regions in known protein spatial structures, together with their biochemical properties, by exploring new applications of 3D convolutional neural networks. To this end, several state-of-the-art convolutional neural network architectures were explored to find one that suits the problem and performs well. Other state-of-the-art machine learning predictors were also considered in order to identify the best biochemical properties to add as new channels. Finally, the interface predictions were compared with the ground truth, obtained by calculating the distances between atoms in different chains of the protein complexes. |
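
The abstract says the ground truth comes from distances between atoms in different chains of a complex. A minimal sketch of that kind of labelling follows. It is an illustration, not the thesis's implementation: the function name `interface_atoms`, the 5.0 Å cutoff and the toy coordinates are all assumptions, since the record states no threshold or data format.

```python
import math

def interface_atoms(chain_a, chain_b, cutoff=5.0):
    """Flag each atom of chain_a that lies within `cutoff` of any atom of chain_b.

    chain_a, chain_b: lists of (x, y, z) coordinates in angstroms.
    cutoff: assumed distance threshold; the record does not give one.
    """
    return [
        any(math.dist(a, b) <= cutoff for b in chain_b)
        for a in chain_a
    ]

# Toy coordinates (hypothetical, not taken from any real structure).
chain_a = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
chain_b = [(0.0, 3.0, 0.0), (10.0, 8.0, 0.0)]

flags = interface_atoms(chain_a, chain_b)
print(flags)
```

Only the first atom of `chain_a` is flagged here, because it is the only one within 5 Å of an atom in `chain_b`. The brute-force comparison of every atom pair is fine for a toy case; a real complex would likely call for a spatial index.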
id |
RCAP_c53f9fb801e2341be6c03c9bb410e4f6 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/123467 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
3D Convolutional Neural Networks for Identifying Protein Interfaces Machine learning Deep learning Neural networks Bioinformatics Protein Interface prediction Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática Protein interaction is a fundamental part of nearly all biochemical processes, and proteins have evolved specific surface regions for molecular recognition and interaction. These regions differ from the remaining surface in amino acid composition, geometry and chemical properties. Detecting protein interfaces can lead to a better understanding of protein interactions, benefiting fields such as drug design and metabolic engineering. Most existing interface predictors use structured data, that is, clearly defined data types usually obtained from data sets. However, proteins are very complex molecules, and no single property can distinguish the interface from the rest of the protein surface for all types of proteins. Deep learning is therefore a suitable approach, as it can capture features from unstructured data such as images, text, sensor data and volumes. The aim of this work was to identify interface regions in known protein spatial structures, together with their biochemical properties, by exploring new applications of 3D convolutional neural networks. To this end, several state-of-the-art convolutional neural network architectures were explored to find one that suits the problem and performs well. Other state-of-the-art machine learning predictors were also considered in order to identify the best biochemical properties to add as new channels. Finally, the interface predictions were compared with the ground truth, obtained by calculating the distances between atoms in different chains of the protein complexes.
Interaction between proteins is fundamental to all biological and biochemical processes. Proteins contain specific regions that enable molecular recognition and, consequently, interactions with other molecules. These regions are usually structurally different from the rest of the molecule, being characterized by different amino acids, chemical properties and geometry. Detecting protein interfaces can help in understanding how proteins interact and, in turn, benefit the design of new drugs (drug design) and metabolic engineering. Interface predictors mostly use structured data, that is, well-defined data usually obtained from databases. However, proteins are complex molecules, which makes it impossible to distinguish their interface by a single criterion, since no one specific property applies to all of them. Deep learning is therefore a fundamental tool, because it uses features from unstructured data, such as the spatial information of the protein, images, text, sensor data or volumes. The main goal of this project is to identify interface regions from the three-dimensional structures of known proteins, together with the spatial distribution of their properties, using convolutional neural networks. In this work, deep learning algorithms were studied to find the neural network best suited to the problem and with the best performance. Other prediction algorithms were considered to identify the best biochemical properties to use as new input channels. The model's predictions were then compared with the real interfaces, obtained by computing the distances between atoms in different chains of the same complex.
Krippahl, Ludwig RUN Pascoal, Cláudio 2021-08-31T14:30:18Z 2021-02 2021-02-01T00:00:00Z info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/masterThesis application/pdf http://hdl.handle.net/10362/123467 eng info:eu-repo/semantics/openAccess reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP 2024-03-11T05:04:48Z oai:run.unl.pt:10362/123467 Portal Agregador ONG https://www.rcaap.pt/oai/openaire opendoar:7160 2024-03-20T03:45:03.521198 Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação false |
dc.title.none.fl_str_mv |
3D Convolutional Neural Networks for Identifying Protein Interfaces |
title |
3D Convolutional Neural Networks for Identifying Protein Interfaces |
spellingShingle |
3D Convolutional Neural Networks for Identifying Protein Interfaces Pascoal, Cláudio Machine learning Deep learning Neural networks Bioinformatics Protein Interface prediction Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática |
title_short |
3D Convolutional Neural Networks for Identifying Protein Interfaces |
title_full |
3D Convolutional Neural Networks for Identifying Protein Interfaces |
title_fullStr |
3D Convolutional Neural Networks for Identifying Protein Interfaces |
title_full_unstemmed |
3D Convolutional Neural Networks for Identifying Protein Interfaces |
title_sort |
3D Convolutional Neural Networks for Identifying Protein Interfaces |
author |
Pascoal, Cláudio |
author_facet |
Pascoal, Cláudio |
author_role |
author |
dc.contributor.none.fl_str_mv |
Krippahl, Ludwig RUN |
dc.contributor.author.fl_str_mv |
Pascoal, Cláudio |
dc.subject.por.fl_str_mv |
Machine learning Deep learning Neural networks Bioinformatics Protein Interface prediction Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática |
topic |
Machine learning Deep learning Neural networks Bioinformatics Protein Interface prediction Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática |
description |
Protein interaction is a fundamental part of nearly all biochemical processes, and proteins have evolved specific surface regions for molecular recognition and interaction. These regions differ from the remaining surface in amino acid composition, geometry and chemical properties. Detecting protein interfaces can lead to a better understanding of protein interactions, benefiting fields such as drug design and metabolic engineering. Most existing interface predictors use structured data, that is, clearly defined data types usually obtained from data sets. However, proteins are very complex molecules, and no single property can distinguish the interface from the rest of the protein surface for all types of proteins. Deep learning is therefore a suitable approach, as it can capture features from unstructured data such as images, text, sensor data and volumes. The aim of this work was to identify interface regions in known protein spatial structures, together with their biochemical properties, by exploring new applications of 3D convolutional neural networks. To this end, several state-of-the-art convolutional neural network architectures were explored to find one that suits the problem and performs well. Other state-of-the-art machine learning predictors were also considered in order to identify the best biochemical properties to add as new channels. Finally, the interface predictions were compared with the ground truth, obtained by calculating the distances between atoms in different chains of the protein complexes. |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-08-31T14:30:18Z 2021-02 2021-02-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/123467 |
url |
http://hdl.handle.net/10362/123467 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799138056909881344 |