Can Power Laws Help Us Understand Gene and Proteome Information?
Main author: | Tenreiro-Machado, J. |
---|---|
Publication date: | 2013 |
Other authors: | Costa, A.; Quelhas, M. |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10400.16/1642 |
Abstract: | Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After linking the amino acids during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other previous assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed-length amino acid words (tuples). After creating the initial relative frequency histograms, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis. |
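The histogram step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of overlapping words, the toy sequence, and the log-log least-squares fit of a rank-frequency curve are all assumptions made for illustration, since the abstract does not say how the histograms are transformed.

```python
from collections import Counter
import math


def tuple_frequencies(sequence, k=2):
    """Relative frequency of each k-letter word (tuple) in a sequence.

    Assumption: words are taken with a sliding window of step 1
    (overlapping); the abstract does not specify this.
    """
    words = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(words)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def rank_frequency_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    A roughly straight line in log-log space is what a power law
    would look like; this fit is an illustrative choice only.
    """
    ranked = sorted(freqs.values(), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(f) for f in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical toy sequence in one-letter amino acid codes.
    toy = "MKVLAAGIVALLLAAGCSSMKVLA"
    freqs = tuple_frequencies(toy, k=2)
    print(sorted(freqs.items(), key=lambda kv: -kv[1])[:5])
    print(rank_frequency_slope(freqs))
```

On real data the input would be protein sequences from a reference dataset rather than a toy string, and the word length `k` would be varied to see how the histogram changes.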
id |
RCAP_ee43a57b6b444f4c6dce68e6c7816868 |
---|---|
oai_identifier_str |
oai:repositorio.chporto.pt:10400.16/1642 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
This work was supported by FEDER Funds through the “Programa Operacional Factores de Competitividade - COMPETE” program and by National Funds through FCT, “Fundação para a Ciência e a Tecnologia”, under Project FCOMP-01-0124-FEDER-PEst-OE/EEI/UI0760/2011; Hindawi Publishing Corporation; Repositório Científico do Centro Hospitalar Universitário de Santo António; Tenreiro-Machado, J.; Costa, A.; Quelhas, M.; 2014-07-31T17:42:49Z; 2013; 2013-01-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; application/pdf; http://hdl.handle.net/10400.16/1642; eng; J. A. Tenreiro Machado, António C. Costa, and Maria Dulce Quelhas, “Can Power Laws Help Us Understand Gene and Proteome Information?,” Advances in Mathematical Physics, vol. 2013, Article ID 917153, 10 pages, 2013. doi:10.1155/2013/917153; 10.1155/2013/917153; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2023-10-20T10:56:50Z; oai:repositorio.chporto.pt:10400.16/1642; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T20:38:01.369092; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Can Power Laws Help Us Understand Gene and Proteome Information? |
author |
Tenreiro-Machado, J. |
author_facet |
Tenreiro-Machado, J. Costa, A. Quelhas, M. |
author_role |
author |
author2 |
Costa, A. Quelhas, M. |
author2_role |
author author |
dc.contributor.none.fl_str_mv |
Repositório Científico do Centro Hospitalar Universitário de Santo António |
dc.contributor.author.fl_str_mv |
Tenreiro-Machado, J. Costa, A. Quelhas, M. |
publishDate |
2013 |
dc.date.none.fl_str_mv |
2013 2013-01-01T00:00:00Z 2014-07-31T17:42:49Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.16/1642 |
url |
http://hdl.handle.net/10400.16/1642 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
J. A. Tenreiro Machado, António C. Costa, and Maria Dulce Quelhas, “Can Power Laws Help Us Understand Gene and Proteome Information?,” Advances in Mathematical Physics, vol. 2013, Article ID 917153, 10 pages, 2013. doi:10.1155/2013/917153 10.1155/2013/917153 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Hindawi Publishing Corporation |
publisher.none.fl_str_mv |
Hindawi Publishing Corporation |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799133642028482560 |