Predicting metabolic pathways of plant enzymes without using sequence similarity: Models from machine learning

Bibliographic details
Main author: de Oliveira Almeida, Rodrigo [UNESP]
Publication date: 2020
Other authors: Valente, Guilherme Targino [UNESP]
Document type: Article
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1002/tpg2.20043
http://hdl.handle.net/11449/199306
Abstract: Most bioinformatics tools for enzyme annotation focus on assigning enzymatic function. Sequence similarity to well-characterized enzymes is often used for functional annotation and to assign metabolic pathways. However, these approaches are not feasible for all sequences, leading to inaccurate annotations or a lack of metabolic pathway information. Here we present mApLe (metabolic pathway predictor of plant enzymes), a high-performance machine learning-based tool with models that label the metabolic pathway of an enzyme rather than specifying the enzyme's reactions. mApLe uses molecular descriptors of the enzyme sequences to make predictions without considering sequence similarity to reference sequences. Hence, mApLe can classify a wide range of enzymes, even those without any homolog or with incomplete EC numbers. The tool can be used to improve the quality of plant genome annotation or to narrow down the number of candidate genes for metabolic engineering research. The mApLe tool is available online, and the GUI can be installed locally.
id UNSP_68c66a891c68c6af423b871e94088b47
oai_identifier_str oai:repositorio.unesp.br:11449/199306
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
spelling Instituto Federal de Educação Ciência e Tecnologia do Sudeste de Minas Gerais Muriaé
Department of Bioprocess and Biotechnology School of Agriculture São Paulo State University (Unesp)
Department of Developmental Genetics Max Planck Institut für Herz- und Lungenforschung Bad Nauheim
2021-10-23T07:00:44Z
http://repositorio.unesp.br/oai/request
opendoar:2946
dc.title.none.fl_str_mv Predicting metabolic pathways of plant enzymes without using sequence similarity: Models from machine learning
author de Oliveira Almeida, Rodrigo [UNESP]
author_facet de Oliveira Almeida, Rodrigo [UNESP]
Valente, Guilherme Targino [UNESP]
author_role author
author2 Valente, Guilherme Targino [UNESP]
author2_role author
dc.contributor.none.fl_str_mv Muriaé
Universidade Estadual Paulista (Unesp)
Bad Nauheim
dc.contributor.author.fl_str_mv de Oliveira Almeida, Rodrigo [UNESP]
Valente, Guilherme Targino [UNESP]
publishDate 2020
dc.date.none.fl_str_mv 2020-12-12T01:36:14Z
2020-12-12T01:36:14Z
2020-01-01
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1002/tpg2.20043
Plant Genome.
1940-3372
http://hdl.handle.net/11449/199306
10.1002/tpg2.20043
2-s2.0-85089908075
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Plant Genome
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
instname_str Universidade Estadual Paulista (UNESP)
instacron_str UNESP
institution UNESP
reponame_str Repositório Institucional da UNESP
collection Repositório Institucional da UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
_version_ 1799965725350166528