Statistical, computational and visualization methodologies to unveil gene primary structure features
Autor(a) principal: | |
---|---|
Data de Publicação: | 2006 |
Outros Autores: | , , , , |
Tipo de documento: | Artigo |
Idioma: | eng |
Título da fonte: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Texto Completo: | http://hdl.handle.net/10773/27687 |
Resumo: | Gene sequence features such as codon bias, codon context, and codon expansion (e.g. trinucleotide repeats) can be better understood at the genomic scale level by combining statistical methodologies with advanced computer algorithms and data visualization through sophisticated graphical interfaces. This paper presents the ANACONDA system, a bioinformatics application for gene primary structure analysis. Codon usage tables using absolute metrics and software for multivariate analysis of codon and amino acid usage are available in public databases. However, they do not provide easy computational and statistical tools to carry out detailed gene primary structure analysis on a genomic scale. We propose the usage of several statistical methods--contingency table analysis, residual analysis, multivariate analysis (cluster analysis)--to analyze the codon bias under various aspects (degree of association, contexts and clustering). The developed solution is a software application that provides a user-guided analysis of codon sequences considering several contexts and codon usage on a genomic scale. The utilization of this tool in our molecular biology laboratory is focused on particular genomes, especially those from Saccharomyces cerevisiae, Candida albicans and Escherichia coli. In order to illustrate the applicability and output layouts of the software these species are herein used as examples. The statistical tools incorporated in the system are allowing to obtain global views of important sequence features. It is expected that the results obtained will permit identification of general rules that govern codon context and codon usage in any genome. Additionally, identification of genes containing expanded codons that arise as a consequence of erroneous DNA replication events will permit uncovering new genes associated with human disease. |
id |
RCAP_86ab3275815499d0e5db58c65b6eaec9 |
---|---|
oai_identifier_str |
oai:ria.ua.pt:10773/27687 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Statistical, computational and visualization methodologies to unveil gene primary structure featuresBioinformatics softwareCodon contextCodon biasContingency tablesResidual analysisCluster analysisGene sequence features such as codon bias, codon context, and codon expansion (e.g. trinucleotide repeats) can be better understood at the genomic scale level by combining statistical methodologies with advanced computer algorithms and data visualization through sophisticated graphical interfaces. This paper presents the ANACONDA system, a bioinformatics application for gene primary structure analysis. Codon usage tables using absolute metrics and software for multivariate analysis of codon and amino acid usage are available in public databases. However, they do not provide easy computational and statistical tools to carry out detailed gene primary structure analysis on a genomic scale. We propose the usage of several statistical methods--contingency table analysis, residual analysis, multivariate analysis (cluster analysis)--to analyze the codon bias under various aspects (degree of association, contexts and clustering). The developed solution is a software application that provides a user-guided analysis of codon sequences considering several contexts and codon usage on a genomic scale. The utilization of this tool in our molecular biology laboratory is focused on particular genomes, especially those from Saccharomyces cerevisiae, Candida albicans and Escherichia coli. In order to illustrate the applicability and output layouts of the software these species are herein used as examples. The statistical tools incorporated in the system are allowing to obtain global views of important sequence features. It is expected that the results obtained will permit identification of general rules that govern codon context and codon usage in any genome. Additionally, identification of genes containing expanded codons that arise as a consequence of erroneous DNA replication events will permit uncovering new genes associated with human disease.Schattauer2020-02-27T10:33:35Z2006-01-01T00:00:00Z2006info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articleapplication/pdfhttp://hdl.handle.net/10773/27687eng0026-127010.1267/METH06020163Pinheiro, M.Afreixo, V.Moura, G.Freitas, A.Santos, M. A. S.Oliveira, J. L.info:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2024-02-22T11:53:36Zoai:ria.ua.pt:10773/27687Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-20T03:00:23.342173Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse |
dc.title.none.fl_str_mv |
Statistical, computational and visualization methodologies to unveil gene primary structure features |
title |
Statistical, computational and visualization methodologies to unveil gene primary structure features |
spellingShingle |
Statistical, computational and visualization methodologies to unveil gene primary structure features Pinheiro, M. Bioinformatics software Codon context Codon bias Contingency tables Residual analysis Cluster analysis |
title_short |
Statistical, computational and visualization methodologies to unveil gene primary structure features |
title_full |
Statistical, computational and visualization methodologies to unveil gene primary structure features |
title_fullStr |
Statistical, computational and visualization methodologies to unveil gene primary structure features |
title_full_unstemmed |
Statistical, computational and visualization methodologies to unveil gene primary structure features |
title_sort |
Statistical, computational and visualization methodologies to unveil gene primary structure features |
author |
Pinheiro, M. |
author_facet |
Pinheiro, M. Afreixo, V. Moura, G. Freitas, A. Santos, M. A. S. Oliveira, J. L. |
author_role |
author |
author2 |
Afreixo, V. Moura, G. Freitas, A. Santos, M. A. S. Oliveira, J. L. |
author2_role |
author author author author author |
dc.contributor.author.fl_str_mv |
Pinheiro, M. Afreixo, V. Moura, G. Freitas, A. Santos, M. A. S. Oliveira, J. L. |
dc.subject.por.fl_str_mv |
Bioinformatics software Codon context Codon bias Contingency tables Residual analysis Cluster analysis |
topic |
Bioinformatics software Codon context Codon bias Contingency tables Residual analysis Cluster analysis |
description |
Gene sequence features such as codon bias, codon context, and codon expansion (e.g. trinucleotide repeats) can be better understood at the genomic scale level by combining statistical methodologies with advanced computer algorithms and data visualization through sophisticated graphical interfaces. This paper presents the ANACONDA system, a bioinformatics application for gene primary structure analysis. Codon usage tables using absolute metrics and software for multivariate analysis of codon and amino acid usage are available in public databases. However, they do not provide easy computational and statistical tools to carry out detailed gene primary structure analysis on a genomic scale. We propose the usage of several statistical methods--contingency table analysis, residual analysis, multivariate analysis (cluster analysis)--to analyze the codon bias under various aspects (degree of association, contexts and clustering). The developed solution is a software application that provides a user-guided analysis of codon sequences considering several contexts and codon usage on a genomic scale. The utilization of this tool in our molecular biology laboratory is focused on particular genomes, especially those from Saccharomyces cerevisiae, Candida albicans and Escherichia coli. In order to illustrate the applicability and output layouts of the software these species are herein used as examples. The statistical tools incorporated in the system are allowing to obtain global views of important sequence features. It is expected that the results obtained will permit identification of general rules that govern codon context and codon usage in any genome. Additionally, identification of genes containing expanded codons that arise as a consequence of erroneous DNA replication events will permit uncovering new genes associated with human disease. |
publishDate |
2006 |
dc.date.none.fl_str_mv |
2006-01-01T00:00:00Z 2006 2020-02-27T10:33:35Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10773/27687 |
url |
http://hdl.handle.net/10773/27687 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
0026-1270 10.1267/METH06020163 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Schattauer |
publisher.none.fl_str_mv |
Schattauer |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799137659476508672 |