Parallel implementation of Expectation-Maximisation algorithm for the training of Gaussian Mixture Models

Bibliographic details
Main author: Araújo, Gabriel Ferreira
Publication date: 2014
Other authors: Macedo, Hendrik Teixeira; Chella, Marco Túlio; Estombelo Montesco, Carlos Alberto; Medeiros, Marcus Vinícius Oliveira
Document type: Article
Language: English
Source: Repositório Institucional da UFS
Full text: https://ri.ufs.br/handle/riufs/1764
Abstract: Most machine learning algorithms need to handle large data sets, which often constrains them in processing time and memory. Expectation-Maximization (EM) is one such algorithm; it is used to train Gaussian Mixture Models (GMM), one of the most commonly used parametric statistical models. All steps of the algorithm are potentially parallelizable, since they iterate over the entire data set. In this study, we propose a parallel implementation of EM for training GMM using CUDA. Experiments are performed with a UCI dataset, and the results show a speedup of 7 over the sequential version. We have also modified the code to improve global memory access and shared memory usage. We achieved an occupancy of up to 56.4%, regardless of the number of Gaussians considered in the set of experiments.
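The abstract's remark that every EM step iterates over the whole data set can be illustrated with a minimal sequential sketch. It is written in plain Python for a one-dimensional mixture and is not the authors' CUDA implementation; the function names (`gaussian_pdf`, `e_step`, `m_step`, `fit_gmm`), the unit initial variances, and the `var_floor` parameter are assumptions made for illustration. The E-step loop treats each data point independently, and each sum in the M-step is a reduction over the data, which are the two patterns a data-parallel version would exploit.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def e_step(data, weights, means, variances):
    """Compute responsibilities resp[n][k] for every point n and component k.

    Each iteration of the outer loop touches a single data point, so the
    iterations are independent of one another (a data-parallel map).
    """
    resp = []
    for x in data:
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    return resp


def m_step(data, resp, var_floor=1e-6):
    """Re-estimate weights, means and variances from the responsibilities.

    Every sum below runs over the whole data set (a reduction).
    var_floor is an illustrative guard against collapsing variances.
    """
    n, k = len(data), len(resp[0])
    weights, means, variances = [], [], []
    for j in range(k):
        nk = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nk
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nk
        weights.append(nk / n)
        means.append(mean)
        variances.append(max(var, var_floor))
    return weights, means, variances


def fit_gmm(data, initial_means, iterations=50):
    """Run a fixed number of EM iterations from the given initial means."""
    k = len(initial_means)
    weights = [1.0 / k] * k
    means = list(initial_means)
    variances = [1.0] * k
    for _ in range(iterations):
        resp = e_step(data, weights, means, variances)
        weights, means, variances = m_step(data, resp)
    return weights, means, variances
```

Calling `fit_gmm([-0.1, 0.0, 0.1, 4.9, 5.0, 5.1], [1.0, 4.0])` fits two components to two well-separated clusters. A fixed iteration count is used here for brevity; a convergence test on the log-likelihood would be the more usual stopping rule.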
Keywords: Expectation-Maximization (EM); Gaussian Mixture Models (GMM); CUDA; Gaussian mixture model
Date issued: 2014-07
Date accessioned: 2016-05-16T15:21:46Z
Date available: 2016-05-16T15:21:46Z
Version: published version
Citation: ARAÚJO, G. F. et al. Parallel implementation of Expectation-Maximisation algorithm for the training of Gaussian Mixture Models. Journal of Computer Science, v. 10, n. 10, Jul. 2014. Available at: <http://thescipub.com/abstract/10.3844/jcssp.2014.2124.2134>. Accessed: 16 May 2016.
ISSN: 1552-6607
License: Creative Commons Attribution License
Access rights: open access
Publisher: Science Publications
Institution: Universidade Federal de Sergipe (UFS)