SLFM: an R package to evaluate coherent patterns in microarray data via factor analysis
Main author: | João Daniel Nunes Duarte |
---|---|
Publication date: | 2019 |
Other authors: | Vinícius Diniz Mayrink |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da UFMG |
Full text: | https://doi.org/10.18637/jss.v090.i09 http://hdl.handle.net/1843/56441 https://orcid.org/0000-0002-5683-8326 |
Abstract: | The development of simulation-based methods, such as Markov chain Monte Carlo (MCMC), has contributed to an increased interest in the Bayesian framework as an alternative to deal with factor models. Many studies have used Bayesian factor analysis to explore gene expression data. We are particularly interested in the application of a sparse latent factor model (SLFM) based on sparsity priors (mixtures) to assess the significance of factors. The SLFM measures how strong the observed coherent expression pattern is in the data, which is an important source of information to evaluate gene activity. In the literature, this type of model has shown better results than other approaches intended for identification of patterns and metagene groups related to the underlying biology. However, a full Bayesian factor model relying on MCMC algorithms has an expensive computational cost, which makes it unattractive for general users. In this paper, we present the package slfm which uses C++ implementation via Rcpp to improve the computational performance of the SLFM within the widely used statistical tool R. We investigate real and simulated microarray data related to breast cancer. |
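
The abstract's idea of a mixture (sparsity) prior on factor loadings can be illustrated with a small sketch. This is not the slfm package's code or its MCMC algorithm: it assumes a simplified one-factor model x_ij = alpha_i * lambda_j + noise, a point mass at zero mixed with a normal "slab" as the prior on each loading, and known factor scores and noise variance. All function and parameter names (`simulate_sparse_factor_data`, `inclusion_probability`, `q`, `slab_var`, `noise_var`) are hypothetical and chosen only for this illustration.

```python
import math
import random


def simulate_sparse_factor_data(n_genes, n_samples, q, slab_var, noise_var, rng):
    """Simulate x[i][j] = alpha[i] * lam[j] + noise from a one-factor model.

    Each loading alpha[i] is exactly zero with probability 1 - q (gene not
    affected by the factor) and drawn from N(0, slab_var) with probability q.
    """
    lam = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    alpha = [
        rng.gauss(0.0, math.sqrt(slab_var)) if rng.random() < q else 0.0
        for _ in range(n_genes)
    ]
    noise_sd = math.sqrt(noise_var)
    x = [[a * l + rng.gauss(0.0, noise_sd) for l in lam] for a in alpha]
    return x, alpha, lam


def inclusion_probability(x_row, lam, q, slab_var, noise_var):
    """Probability that a gene's loading is non-zero, given known factor scores.

    Compares the marginal likelihood of the row under the normal slab with the
    likelihood under the point mass at zero (assumed simplification: lam and
    noise_var are treated as known rather than sampled).
    """
    s = sum(l * l for l in lam)
    t = sum(l * xv for l, xv in zip(lam, x_row))
    v = 1.0 / (s / noise_var + 1.0 / slab_var)  # posterior variance of the loading
    m = v * t / noise_var                       # posterior mean of the loading
    log_odds = (
        math.log(q) - math.log(1.0 - q)
        + 0.5 * math.log(v / slab_var)
        + m * m / (2.0 * v)
    )
    # Numerically stable logistic transform of the log odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    rng = random.Random(0)
    x, alpha, lam = simulate_sparse_factor_data(8, 30, 0.5, 4.0, 1.0, rng)
    for a, row in zip(alpha, x):
        p = inclusion_probability(row, lam, 0.5, 4.0, 1.0)
        print(f"true loading {a:+.2f} -> inclusion probability {p:.3f}")
```

In this simplified setting, genes whose expression follows the shared factor pattern receive inclusion probabilities near one, while genes with no coherent pattern are pulled toward zero. A full Bayesian treatment, as described in the abstract, would also sample the factor scores and variances by MCMC.
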
id |
UFMG_2c05e22fb2d7458448aee8661d387a34 |
---|---|
oai_identifier_str |
oai:repositorio.ufmg.br:1843/56441 |
network_acronym_str |
UFMG |
network_name_str |
Repositório Institucional da UFMG |
repository_id_str |
|
spelling |
Journal of Statistical Software, volume 90, issue 9 (issued 2019-07-31). Article page: https://www.jstatsoft.org/article/view/v090i09. Funding: FAPEMIG - Fundação de Amparo à Pesquisa do Estado de Minas Gerais. Files: License.txt (text/plain; charset=utf-8, 2042 bytes); SLFM an R package to evaluate coherent patterns in microarray data via factor analysis.pdf (application/pdf, 960204 bytes). |
dc.title.pt_BR.fl_str_mv |
SLFM: an R package to evaluate coherent patterns in microarray data via factor analysis |
dc.title.alternative.pt_BR.fl_str_mv |
SLFM: um pacote R para avaliar padrões coerentes em dados de microarray via análise fatorial |
title |
SLFM: an R package to evaluate coherent patterns in microarray data via factor analysis |
spellingShingle |
SLFM: an R package to evaluate coherent patterns in microarray data via factor analysis; João Daniel Nunes Duarte; Factor model; Bayesian inference; Gene expression; Sparsity priors; Rcpp; SLFM; Statistics; Probabilities; Bayesian statistical decision theory; C++ (Computer programming language); R (Computer programming language) |
title_short |
SLFM: an R package to evaluate coherent patterns in microarray data via factor analysis |
title_full |
SLFM: an R package to evaluate coherent patterns in microarray data via factor analysis |
title_fullStr |
SLFM: an R package to evaluate coherent patterns in microarray data via factor analysis |
title_full_unstemmed |
SLFM: an R package to evaluate coherent patterns in microarray data via factor analysis |
title_sort |
SLFM: an R package to evaluate coherent patterns in microarray data via factor analysis |
author |
João Daniel Nunes Duarte |
author_facet |
João Daniel Nunes Duarte; Vinícius Diniz Mayrink |
author_role |
author |
author2 |
Vinícius Diniz Mayrink |
author2_role |
author |
dc.contributor.author.fl_str_mv |
João Daniel Nunes Duarte; Vinícius Diniz Mayrink |
dc.subject.por.fl_str_mv |
Factor model; Bayesian inference; Gene expression; Sparsity priors; Rcpp; SLFM |
topic |
Factor model; Bayesian inference; Gene expression; Sparsity priors; Rcpp; SLFM; Statistics; Probabilities; Bayesian statistical decision theory; C++ (Computer programming language); R (Computer programming language) |
dc.subject.other.pt_BR.fl_str_mv |
Statistics; Probabilities; Bayesian statistical decision theory; C++ (Computer programming language); R (Computer programming language) |
description |
The development of simulation-based methods, such as Markov chain Monte Carlo (MCMC), has contributed to an increased interest in the Bayesian framework as an alternative to deal with factor models. Many studies have used Bayesian factor analysis to explore gene expression data. We are particularly interested in the application of a sparse latent factor model (SLFM) based on sparsity priors (mixtures) to assess the significance of factors. The SLFM measures how strong the observed coherent expression pattern is in the data, which is an important source of information to evaluate gene activity. In the literature, this type of model has shown better results than other approaches intended for identification of patterns and metagene groups related to the underlying biology. However, a full Bayesian factor model relying on MCMC algorithms has an expensive computational cost, which makes it unattractive for general users. In this paper, we present the package slfm which uses C++ implementation via Rcpp to improve the computational performance of the SLFM within the widely used statistical tool R. We investigate real and simulated microarray data related to breast cancer. |
publishDate |
2019 |
dc.date.issued.fl_str_mv |
2019-07-31 |
dc.date.accessioned.fl_str_mv |
2023-07-17T18:52:39Z |
dc.date.available.fl_str_mv |
2023-07-17T18:52:39Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/1843/56441 |
dc.identifier.doi.pt_BR.fl_str_mv |
https://doi.org/10.18637/jss.v090.i09 |
dc.identifier.issn.pt_BR.fl_str_mv |
1548-7660 |
dc.identifier.orcid.pt_BR.fl_str_mv |
https://orcid.org/0000-0002-5683-8326 |
url |
https://doi.org/10.18637/jss.v090.i09 http://hdl.handle.net/1843/56441 https://orcid.org/0000-0002-5683-8326 |
identifier_str_mv |
1548-7660 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.ispartof.pt_BR.fl_str_mv |
Journal of Statistical Software |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
dc.publisher.initials.fl_str_mv |
UFMG |
dc.publisher.country.fl_str_mv |
Brasil |
dc.publisher.department.fl_str_mv |
ICX - DEPARTAMENTO DE ESTATÍSTICA |
publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFMG instname:Universidade Federal de Minas Gerais (UFMG) instacron:UFMG |
instname_str |
Universidade Federal de Minas Gerais (UFMG) |
instacron_str |
UFMG |
institution |
UFMG |
reponame_str |
Repositório Institucional da UFMG |
collection |
Repositório Institucional da UFMG |
bitstream.url.fl_str_mv |
https://repositorio.ufmg.br/bitstream/1843/56441/1/License.txt https://repositorio.ufmg.br/bitstream/1843/56441/2/SLFM%20an%20R%20package%20to%20evaluate%20coherent%20patterns%20in%20microarray%20data%20via%20factor%20analysis.pdf |
bitstream.checksum.fl_str_mv |
fa505098d172de0bc8864fc1287ffe22 19a8f37cdd96601f2e6c9333ba321f5b |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG) |
repository.mail.fl_str_mv |
|
_version_ |
1803589567458050048 |