Automatic classification of enzyme family in protein annotation
Main author: | Dos Santos, Cássia T. |
---|---|
Publication date: | 2009 |
Other authors: | Bazzan, Ana L. C.; Lemke, Ney [UNESP] |
Document type: | Conference paper |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1007/978-3-642-03223-3_8 http://hdl.handle.net/11449/71147 |
Abstract: | Most of the tasks in genome annotation can be at least partially automated. Since this annotation is time-consuming, facilitating some parts of the process - thus freeing the specialist to carry out more valuable tasks - has been the motivation of many tools and annotation environments. In particular, annotation of protein function can benefit from knowledge about enzymatic processes. The use of sequence homology alone is not a good approach to derive this knowledge when there are only a few homologues of the sequence to be annotated. The alternative is to use motifs. This paper uses a symbolic machine learning approach to derive rules for the classification of enzymes according to the Enzyme Commission (EC). Our results show that, for the top class, the average global classification error is 3.13%. Our technique also produces a set of rules relating structural to functional information, which is important to understand the protein tridimensional structure and determine its biological function. © 2009 Springer Berlin Heidelberg. |
id |
UNSP_1e72e525aea6bebfcc42c199fb2e05ba |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/71147 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
Departamento de Informática, Universidade de Évora; Instituto de Informática, PPGC, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, C.P. 15064, 91.501-970; Dep. de Física e Biofísica, Instituto de Biociências, UNESP, Botucatu, SP, C.P. 510, 18618-000; http://repositorio.unesp.br/oai/request; opendoar:2946 |
dc.title.none.fl_str_mv |
Automatic classification of enzyme family in protein annotation |
title |
Automatic classification of enzyme family in protein annotation |
author |
Dos Santos, Cássia T. |
author_facet |
Dos Santos, Cássia T.; Bazzan, Ana L. C.; Lemke, Ney [UNESP] |
author_role |
author |
author2 |
Bazzan, Ana L. C.; Lemke, Ney [UNESP] |
author2_role |
author; author |
dc.contributor.none.fl_str_mv |
Universidade de Évora; Universidade Federal do Rio Grande do Sul (UFRGS); Universidade Estadual Paulista (Unesp) |
dc.contributor.author.fl_str_mv |
Dos Santos, Cássia T.; Bazzan, Ana L. C.; Lemke, Ney [UNESP] |
dc.subject.por.fl_str_mv |
Automatic classification; Biological functions; Classification errors; Enzymatic process; Enzyme commissions; Functional information; Genome annotation; Protein annotation; Protein functions; Sequence homology; Set of rules; Symbolic machine learning; Tri-dimensional structure; Automatic indexing; Biology; Enzymes; Bioinformatics |
publishDate |
2009 |
dc.date.none.fl_str_mv |
2009-09-14; 2014-05-27T11:23:58Z; 2014-05-27T11:23:58Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/conferenceObject |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1007/978-3-642-03223-3_8 ; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 5676 LNBI, p. 86-96; 0302-9743; 1611-3349; http://hdl.handle.net/11449/71147 ; 10.1007/978-3-642-03223-3_8; 2-s2.0-69949190117; 7977035910952141 |
url |
http://dx.doi.org/10.1007/978-3-642-03223-3_8 http://hdl.handle.net/11449/71147 |
identifier_str_mv |
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 5676 LNBI, p. 86-96; 0302-9743; 1611-3349; 10.1007/978-3-642-03223-3_8; 2-s2.0-69949190117; 7977035910952141 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); 0,295 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
86-96 |
dc.source.none.fl_str_mv |
Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
_version_ |
1799964657024237568 |