Towards Data Warehousing and Mining of Protein Unfolding Simulation Data

Bibliographic Details
Main Author: Berrar, Daniel
Publication Date: 2005
Other Authors: Stahl, Frederic; Silva, Candida; Rodrigues, J.; Brito, Rui; Dubitzky, Werner
Format: Article
Language: eng
Source: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Download full: http://hdl.handle.net/10316/7790
https://doi.org/10.1007/s10877-005-0676-z
Summary: Objectives. The prediction of protein structure and the precise understanding of protein folding and unfolding processes remain among the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data, and are computationally expensive. More and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. Methods. To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis. Results. To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse. Conclusions. Web and grid services, especially pre-defined data mining services that can run on or ‘near’ the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
Published in: Journal of Clinical Monitoring and Computing. 19:4 (2005) 307-317
Rights: Open access