LMAP: Lightweight Multigene Analyses in PAML

Bibliographic details
Main author: Maldonado E.
Publication date: 2016
Other authors: Almeida D., Escalona T., Khan I., Vasconcelos V., Antunes A.
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: https://hdl.handle.net/10216/120491
Abstract: Background: Uncovering how phenotypic diversity arises and is maintained in nature has long been a major interest of evolutionary biologists. Recent advances in genome sequencing technologies have remarkably increased the efficiency of pinpointing genes involved in the adaptive evolution of phenotypes. The reliability of such findings is most often examined with statistical and computational methods using Maximum Likelihood codon-based models (i.e., site, branch, branch-site and clade models), such as those available in codeml from the Phylogenetic Analysis by Maximum Likelihood (PAML) package. While these models represent a well-defined workflow for documenting adaptive evolution, in practice they can be challenging for researchers with vast amounts of data, as multiple types of relevant codon-based datasets are generated, making the overall process tedious, error-prone and time-consuming. Results: We introduce LMAP (Lightweight Multigene Analyses in PAML), a user-friendly command-line and interactive package designed to handle the codeml workflow, namely: directory organization, execution, and results gathering and organization for Likelihood Ratio Test estimations, with minimal manual user intervention. LMAP was developed for the workstation multi-core environment and provides a unique advantage for processing one, several, or all codeml codon-based models for multiple datasets at a time. Our software proved efficient throughout the codeml workflow, including, but not limited to, simultaneously handling more than 20 datasets. Conclusions: We have developed a simple and versatile LMAP package, with outstanding performance, enabling researchers to analyze multiple different codon-based datasets in a high-throughput fashion. At minimum, two file types are required within a single input directory: one for the multiple sequence alignment and another for the phylogenetic tree. To our knowledge, no other software combines all codeml codon substitution models of adaptive evolution. LMAP has been developed as an open-source package, allowing its integration into more complex open-source bioinformatics pipelines. The LMAP package is released under the GPLv3 license and is freely available at http://lmapaml.sourceforge.net/. © 2016 The Author(s).
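The abstract refers to Likelihood Ratio Test estimations on codeml results. As a rough illustration only (this is not LMAP's code, and the function names and log-likelihood values below are hypothetical), a Likelihood Ratio Test between two nested models can be sketched in Python, assuming the log-likelihoods have already been read from the codeml output and the degrees of freedom are supplied by the user:

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses the closed-form series for integer degrees of freedom, so only
    the standard library is needed.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2k: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df = 2k+1: erfc(sqrt(x/2)) plus k correction terms
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * i + 1)
    return total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (statistic, p-value) for nested models.

    lnl_null and lnl_alt are log-likelihoods (hypothetical inputs here);
    df is the difference in the number of free parameters.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))  # clamp small negative values
    return stat, chi2_sf(stat, df)


# Made-up log-likelihoods, for illustration only.
stat, p = likelihood_ratio_test(lnl_null=-1234.56, lnl_alt=-1230.12, df=2)
print(f"2*deltaLnL = {stat:.2f}, p = {p:.4f}")
```

The degrees of freedom depend on which pair of codeml models is compared, so they are left as an explicit argument rather than inferred.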
id RCAP_a29d19b00abea8dd85794177de0a8677
oai_identifier_str oai:repositorio-aberto.up.pt:10216/120491
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv LMAP: Lightweight Multigene Analyses in PAML
title LMAP: Lightweight Multigene Analyses in PAML
author Maldonado E.
author_role author
author2 Almeida D.
Escalona T.
Khan I.
Vasconcelos V.
Antunes A.
author2_role author
author
author
author
author
dc.contributor.author.fl_str_mv Maldonado E.
Almeida D.
Escalona T.
Khan I.
Vasconcelos V.
Antunes A.
dc.subject.por.fl_str_mv Bioinformatics
Efficiency
Maximum likelihood
Maximum likelihood estimation
Reliability analysis
Software packages
Adaptive evolution
codeml
Codon substitutions
Multi core
Multigene
PAML
Open source software
bioinformatics
codon
directory
human
licence
maximum likelihood method
missense mutation
phylogenetic tree
phylogeny
pipeline
scientist
sequence alignment
software
statistical model
workflow
publishDate 2016
dc.date.none.fl_str_mv 2016
2016-01-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/10216/120491
url https://hdl.handle.net/10216/120491
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv ISSN 1471-2105
DOI 10.1186/s12859-016-1204-5
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv BMC
publisher.none.fl_str_mv BMC
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799136004294049793