ProtASR: An Evolutionary Framework for Ancestral Protein Reconstruction with Selection on Folding Stability
Lead author: | Arenas, M |
---|---|
Publication date: | 2017 |
Other authors: | Weber, CC; Liberles, DA; Bastolla, U |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | https://hdl.handle.net/10216/156999 |
Abstract: | The computational reconstruction of ancestral proteins provides information on past biological events and has practical implications for biomedicine and biotechnology. Currently available tools for ancestral sequence reconstruction (ASR) are often based on empirical amino acid substitution models that assume that all sites evolve at the same rate and under the same process. However, this assumption is frequently violated because protein evolution is highly heterogeneous due to different selective constraints among sites. Here, we present ProtASR, a new evolutionary framework to infer ancestral protein sequences accounting for selection on protein stability. First, ProtASR generates site-specific substitution matrices through the structurally constrained mean-field (MF) substitution model, which considers both unfolding and misfolding stability. We previously showed that MF models outperform empirical amino acid substitution models, as well as other structurally constrained substitution models, both in terms of likelihood and in correctly inferring amino acid distributions across sites. In the second step, ProtASR adapts a well-established maximum-likelihood (ML) ASR procedure to infer ancestral proteins under MF models. A known bias of ML ASR methods is that they tend to overestimate the stability of ancestral proteins by underestimating the frequency of deleterious mutations. We compared ProtASR under MF to two empirical substitution models (JTT and CAT), reconstructing the ancestral sequences of simulated proteins. ProtASR yields reconstructed proteins with less biased stabilities, which are significantly closer to those of the simulated proteins. Analysis of extant protein families suggests that folding stability evolves through time across protein families, potentially reflecting neutral fluctuation. Some families exhibit a more constant protein folding stability, while others are more variable. ProtASR is freely available from https://github.com/miguelarenas/protasr and includes detailed documentation and ready-to-use examples. It runs in seconds to minutes depending on protein length and alignment size. [Ancestral sequence reconstruction; folding stability; molecular adaptation; phylogenetics; protein evolution; protein structure] |
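The two steps named in the abstract (site-specific substitution processes, then ML reconstruction of ancestral states) can be illustrated with a minimal sketch. This is not ProtASR's code and not the MF model: it assumes a much simpler F81-style process in which each site has its own equilibrium frequencies, a three-letter toy alphabet, and a tree with only two leaves. The function names, the frequencies in `SITE_FREQS`, and the branch lengths are all hypothetical values chosen for illustration.

```python
import math


def site_transition_prob(pi, i, j, t):
    """P(i -> j | t) under an F81-style model with site-specific frequencies pi.

    Closed form: P_ij(t) = exp(-t) * [i == j] + (1 - exp(-t)) * pi[j].
    Assumption: the rate is not normalised, so t is in arbitrary units.
    """
    decay = math.exp(-t)
    return decay * (1.0 if i == j else 0.0) + (1.0 - decay) * pi[j]


def marginal_ancestor(pi, leaf_a, leaf_b, t_a, t_b):
    """Posterior over the state of the common ancestor of two leaves at one site."""
    weights = {
        s: pi[s]
        * site_transition_prob(pi, s, leaf_a, t_a)
        * site_transition_prob(pi, s, leaf_b, t_b)
        for s in pi
    }
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def reconstruct(site_freqs, seq_a, seq_b, t_a, t_b):
    """Pick the most probable ancestral state independently at each site."""
    ancestor = []
    for pi, a, b in zip(site_freqs, seq_a, seq_b):
        posterior = marginal_ancestor(pi, a, b, t_a, t_b)
        ancestor.append(max(posterior, key=posterior.get))
    return "".join(ancestor)


# Hypothetical site-specific equilibrium frequencies (toy alphabet A, L, K).
SITE_FREQS = [
    {"A": 0.8, "L": 0.1, "K": 0.1},  # site 1 strongly favours A
    {"A": 0.1, "L": 0.8, "K": 0.1},  # site 2 strongly favours L
]

# Two aligned toy sequences, each 0.1 time units from their ancestor.
print(reconstruct(SITE_FREQS, "AK", "KK", 0.1, 0.1))
```

In this toy setting, when the two leaves disagree at a site, the state with the higher site-specific frequency receives the larger posterior, which is how site-specific constraints can change a reconstruction relative to a single shared model. When both leaves agree and branches are short, the observed state dominates even if the site disfavours it.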
id |
RCAP_5d7e760ba650ffb54c0b5493ffe788b7 |
---|---|
oai_identifier_str |
oai:repositorio-aberto.up.pt:10216/156999 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
ProtASR: An Evolutionary Framework for Ancestral Protein Reconstruction with Selection on Folding Stability |
author |
Arenas, M |
author_facet |
Arenas, M; Weber, CC; Liberles, DA; Bastolla, U |
author_role |
author |
author2 |
Weber, CC; Liberles, DA; Bastolla, U |
author2_role |
author author author |
dc.subject.por.fl_str_mv |
Ancestral sequence reconstruction; Protein evolution; Molecular adaptation; Phylogenetics; Folding stability; Protein structure |
publishDate |
2017 |
dc.date.none.fl_str_mv |
2017 2017-01-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10216/156999 |
url |
https://hdl.handle.net/10216/156999 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
ISSN 1063-5157; DOI 10.1093/sysbio/syw121 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Society of Systematic Biologists |
dc.source.none.fl_str_mv |
reponame: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname: Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron: RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799137078361980928 |