Automatic classification of enzyme family in protein annotation

Bibliographic details
Main author: Dos Santos, Cássia T.
Publication date: 2009
Other authors: Bazzan, Ana L. C., Lemke, Ney [UNESP]
Document type: Conference paper
Language: English
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1007/978-3-642-03223-3_8
http://hdl.handle.net/11449/71147
Abstract: Most of the tasks in genome annotation can be at least partially automated. Since this annotation is time-consuming, facilitating some parts of the process - thus freeing the specialist to carry out more valuable tasks - has been the motivation of many tools and annotation environments. In particular, annotation of protein function can benefit from knowledge about enzymatic processes. The use of sequence homology alone is not a good approach to derive this knowledge when there are only a few homologues of the sequence to be annotated. The alternative is to use motifs. This paper uses a symbolic machine learning approach to derive rules for the classification of enzymes according to the Enzyme Commission (EC) scheme. Our results show that, for the top class, the average global classification error is 3.13%. Our technique also produces a set of rules relating structural to functional information, which is important for understanding the three-dimensional structure of a protein and determining its biological function. © 2009 Springer Berlin Heidelberg.
id UNSP_1e72e525aea6bebfcc42c199fb2e05ba
oai_identifier_str oai:repositorio.unesp.br:11449/71147
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
Affiliations:
Departamento de Informática, Universidade de Évora
Instituto de Informática, PPGC, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, C.P. 15064, 91.501-970
Dep. de Física e Biofísica, Instituto de Biociências, UNESP, Botucatu, SP, C.P. 510, 18618-000
Record timestamp: 2021-10-23T21:41:41Z
OAI request URL: http://repositorio.unesp.br/oai/request
OpenDOAR: opendoar:2946
Keywords: Automatic classification; Biological functions; Classification errors; Enzymatic process; Enzyme commissions; Functional information; Genome annotation; Protein annotation; Protein functions; Sequence homology; Set of rules; Symbolic machine learning; Tri-dimensional structure; Automatic indexing; Biology; Enzymes; Bioinformatics
dc.contributor.none.fl_str_mv Universidade de Évora
Universidade Federal do Rio Grande do Sul (UFRGS)
Universidade Estadual Paulista (Unesp)
dc.date.none.fl_str_mv 2009-09-14
2014-05-27T11:23:58Z
2014-05-27T11:23:58Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/conferenceObject
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1007/978-3-642-03223-3_8
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 5676 LNBI, p. 86-96.
0302-9743
1611-3349
http://hdl.handle.net/11449/71147
10.1007/978-3-642-03223-3_8
2-s2.0-69949190117
7977035910952141
dc.relation.none.fl_str_mv Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
0,295
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv 86-96
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)