Can power laws help us understand gene and proteome information?

Costa, António Cardoso; Machado, J.A.Tenreiro; Quelhas, Dulce

Can power laws help us understand gene and proteome information?

Detalhes bibliográficos
Autor(a) principal:	Costa, António Cardoso
Data de Publicação:	2013
Outros Autores:	Machado, J.A.Tenreiro, Quelhas, Dulce
Tipo de documento:	Artigo
Idioma:	eng
Título da fonte:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Texto Completo:	http://hdl.handle.net/10400.22/1437
Resumo:	Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After linking the amino acids during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other previous assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed length amino acid words (tuples). After creating the initial relative frequency histograms, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis.

Metadados do item

id	RCAP_880290f13d4f2776b4f4c5bdb326f277
oai_identifier_str	oai:recipp.ipp.pt:10400.22/1437
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
spelling	Can power laws help us understand gene and proteome information?ProteomePower lawCorrelationVisualizationProteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After linking the amino acids during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other previous assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed length amino acid words (tuples). After creating the initial relative frequency histograms, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis.Hindawi Publishing CorporationRepositório Científico do Instituto Politécnico do PortoCosta, António CardosoMachado, J.A.TenreiroQuelhas, Dulce2013-04-19T15:50:19Z20132013-01-01T00:00:00Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articleapplication/pdfhttp://hdl.handle.net/10400.22/1437eng1687-912010.1155/2013/917153info:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2023-03-13T12:40:56Zoai:recipp.ipp.pt:10400.22/1437Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-19T17:22:43.093960Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse
dc.title.none.fl_str_mv	Can power laws help us understand gene and proteome information?
title	Can power laws help us understand gene and proteome information?
spellingShingle	Can power laws help us understand gene and proteome information? Costa, António Cardoso Proteome Power law Correlation Visualization
title_short	Can power laws help us understand gene and proteome information?
title_full	Can power laws help us understand gene and proteome information?
title_fullStr	Can power laws help us understand gene and proteome information?
title_full_unstemmed	Can power laws help us understand gene and proteome information?
title_sort	Can power laws help us understand gene and proteome information?
author	Costa, António Cardoso
author_facet	Costa, António Cardoso Machado, J.A.Tenreiro Quelhas, Dulce
author_role	author
author2	Machado, J.A.Tenreiro Quelhas, Dulce
author2_role	author author
dc.contributor.none.fl_str_mv	Repositório Científico do Instituto Politécnico do Porto
dc.contributor.author.fl_str_mv	Costa, António Cardoso Machado, J.A.Tenreiro Quelhas, Dulce
dc.subject.por.fl_str_mv	Proteome Power law Correlation Visualization
topic	Proteome Power law Correlation Visualization
description	Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After linking the amino acids during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other previous assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed length amino acid words (tuples). After creating the initial relative frequency histograms, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis.
publishDate	2013
dc.date.none.fl_str_mv	2013-04-19T15:50:19Z 2013 2013-01-01T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10400.22/1437
url	http://hdl.handle.net/10400.22/1437
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	1687-9120 10.1155/2013/917153
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Hindawi Publishing Corporation
publisher.none.fl_str_mv	Hindawi Publishing Corporation
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799131323192836096

Can power laws help us understand gene and proteome information?

Registros relacionados