Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção

Oliveira, Breno

Bibliographic details
Main author:	Oliveira, Breno
Publication date:	2021
Document type:	Master's thesis
Language:	por
Source title:	Repositório Institucional da UFG
dARK ID:	ark:/38995/00130000071v0
Full text:	http://repositorio.bc.ufg.br/tede/handle/tede/11522
Abstract:	The development of machine learning solutions involves several well-established stages. However, scientific studies concentrate on stages such as data engineering, model training, and performance evaluation metrics. The unprecedented scale at which machine learning solutions are now deployed in business environments motivates revisiting problems that have been noted in the literature but little explored, among them monitoring and evaluating the deterioration of a solution over time. When a machine learning model is trained, it is assumed that the unseen data it will encounter in production follows the same distribution as the training data. However, models in production can lose performance as the data changes over time. This phenomenon is known in the literature as concept drift. In this context, this work proposes a methodology that combines Auto Machine Learning with data stream learning and is capable of mitigating concept drifts that may arise in models deployed in a production environment. Real data from a customer churn problem at a large-circulation regional newspaper were used. Three machine learning models were implemented using two methodologies: the proposed methodology, called autoML-DS, and a reference methodology based on conventional model retraining. The results showed that the models implemented with the reference methodology lose performance, while autoML-DS preserves its predictive capacity. AutoML-DS was able to adapt the models over time without a complete retraining, keeping variations in the error rate small.
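
The effect the abstract describes can be illustrated with a small, self-contained sketch. This is not the autoML-DS method (the record does not describe its internals); it is only a toy comparison, under stated assumptions, between a model frozen after training and a model updated one sample at a time on a stream whose concept changes. The synthetic stream, the single feature, the learning rate, and all names (`OnlineLogistic`, `make_stream`, `run`) are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    # Clamp the score so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class OnlineLogistic:
    """One-feature logistic regression updated one sample at a time (SGD)."""

    def __init__(self, lr=0.5):
        self.w, self.b, self.lr = 0.0, 0.0, lr

    def predict(self, x):
        return 1 if sigmoid(self.w * x + self.b) >= 0.5 else 0

    def learn_one(self, x, y):
        err = sigmoid(self.w * x + self.b) - y  # gradient of the log loss
        self.w -= self.lr * err * x
        self.b -= self.lr * err


def make_stream(n, drift_at, seed=0):
    """Synthetic stream (an assumption): the label rule flips at drift_at."""
    rng = random.Random(seed)
    for t in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = int(x > 0) if t < drift_at else int(x < 0)
        yield x, y


def run(n=2000, drift_at=1000):
    static, online = OnlineLogistic(), OnlineLogistic()
    static_errors = online_errors = 0
    for t, (x, y) in enumerate(make_stream(n, drift_at)):
        if t >= drift_at:
            # Test-then-train: score each sample before learning from it.
            static_errors += static.predict(x) != y
            online_errors += online.predict(x) != y
        if t < drift_at:
            static.learn_one(x, y)  # frozen once the drift point is reached
        online.learn_one(x, y)      # keeps adapting on every sample
    post = n - drift_at
    return static_errors / post, online_errors / post


static_err, online_err = run()
print(f"post-drift error rate: static={static_err:.2f}, online={online_err:.2f}")
```

In this toy setting the frozen model keeps applying the old rule after the change, while the incrementally updated model recovers without being retrained from scratch. A production system would additionally need drift detection and model selection, which this sketch leaves out.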

Item metadata

id	UFG-2_8f0646bd153f6e91b9048b70f6df5e18
oai_identifier_str	oai:repositorio.bc.ufg.br:tede/11522
network_acronym_str	UFG-2
network_name_str	Repositório Institucional da UFG
repository_id_str
spelling	Submitted by Luciana Ferreira (lucgeral@gmail.com) on 2021-07-29T12:21:26Z. Approved for entry into archive by Luciana Ferreira (lucgeral@gmail.com) on 2021-08-02T11:35:46Z (GMT). Made available in DSpace on 2021-08-02T11:35:46Z (GMT). Previous issue date: 2021-07-02. No. of bitstreams: 2. Dissertação - Breno Oliveira - 2021.pdf: 3559015 bytes, checksum: 13b790a2df242d1fa7e05a02716b37eb (MD5). license_rdf: 805 bytes, checksum: 4460e5956bc1d1639be9ae6146a50347 (MD5).
dc.title.pt_BR.fl_str_mv	Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção
dc.title.alternative.eng.fl_str_mv	Machine learning algorithms in predicting and evaluating customer evasion in a production environment
title	Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção
spellingShingle	Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção Oliveira, Breno Desvio de conceito Auto machine learning Dados em stream Machine learning Algorithms in predicting Evaluating customer evasion in a production environment CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
title_short	Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção
title_full	Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção
title_fullStr	Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção
title_full_unstemmed	Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção
title_sort	Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção
author	Oliveira, Breno
author_facet	Oliveira, Breno
author_role	author
dc.contributor.advisor1.fl_str_mv	Soares, Anderson da Silva
dc.contributor.advisor1Lattes.fl_str_mv	http://lattes.cnpq.br/1096941114079527
dc.contributor.referee1.fl_str_mv	Soares, Anderson da Silva
dc.contributor.referee2.fl_str_mv	Soares, Telma Woerle de Lima
dc.contributor.referee3.fl_str_mv	Sousa, Rafael Teixeira
dc.contributor.authorLattes.fl_str_mv	http://lattes.cnpq.br/3843157752512003
dc.contributor.author.fl_str_mv	Oliveira, Breno
contributor_str_mv	Soares, Anderson da Silva Soares, Anderson da Silva Soares, Telma Woerle de Lima Sousa, Rafael Teixeira
dc.subject.por.fl_str_mv	Desvio de conceito Auto machine learning Dados em stream
topic	Desvio de conceito Auto machine learning Dados em stream Machine learning Algorithms in predicting Evaluating customer evasion in a production environment CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.eng.fl_str_mv	Machine learning Algorithms in predicting Evaluating customer evasion in a production environment
dc.subject.cnpq.fl_str_mv	CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
description	The development of machine learning solutions involves several well-established stages. However, scientific studies concentrate on stages such as data engineering, model training, and performance evaluation metrics. The unprecedented scale at which machine learning solutions are now deployed in business environments motivates revisiting problems that have been noted in the literature but little explored, among them monitoring and evaluating the deterioration of a solution over time. When a machine learning model is trained, it is assumed that the unseen data it will encounter in production follows the same distribution as the training data. However, models in production can lose performance as the data changes over time. This phenomenon is known in the literature as concept drift. In this context, this work proposes a methodology that combines Auto Machine Learning with data stream learning and is capable of mitigating concept drifts that may arise in models deployed in a production environment. Real data from a customer churn problem at a large-circulation regional newspaper were used. Three machine learning models were implemented using two methodologies: the proposed methodology, called autoML-DS, and a reference methodology based on conventional model retraining. The results showed that the models implemented with the reference methodology lose performance, while autoML-DS preserves its predictive capacity. AutoML-DS was able to adapt the models over time without a complete retraining, keeping variations in the error rate small.
publishDate	2021
dc.date.accessioned.fl_str_mv	2021-08-02T11:35:46Z
dc.date.available.fl_str_mv	2021-08-02T11:35:46Z
dc.date.issued.fl_str_mv	2021-07-02
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	OLIVEIRA, B. Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção. 2021. 87 f. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Goiás, 2021.
dc.identifier.uri.fl_str_mv	http://repositorio.bc.ufg.br/tede/handle/tede/11522
dc.identifier.dark.fl_str_mv	ark:/38995/00130000071v0
identifier_str_mv	OLIVEIRA, B. Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção. 2021. 87 f. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Goiás, 2021. ark:/38995/00130000071v0
url	http://repositorio.bc.ufg.br/tede/handle/tede/11522
dc.language.iso.fl_str_mv	por
language	por
dc.relation.program.fl_str_mv	20
dc.relation.confidence.fl_str_mv	500 500 500
dc.relation.department.fl_str_mv	26
dc.relation.cnpq.fl_str_mv	184
dc.rights.driver.fl_str_mv	Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	Universidade Federal de Goiás
dc.publisher.program.fl_str_mv	Programa de Pós-graduação em Ciência da Computação (INF)
dc.publisher.initials.fl_str_mv	UFG
dc.publisher.country.fl_str_mv	Brasil
dc.publisher.department.fl_str_mv	Instituto de Informática - INF (RG)
publisher.none.fl_str_mv	Universidade Federal de Goiás
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFG instname:Universidade Federal de Goiás (UFG) instacron:UFG
instname_str	Universidade Federal de Goiás (UFG)
instacron_str	UFG
institution	UFG
reponame_str	Repositório Institucional da UFG
collection	Repositório Institucional da UFG
bitstream.url.fl_str_mv	http://repositorio.bc.ufg.br/tede/bitstreams/003c94bb-7792-42f1-a85f-99678e1bebe2/download http://repositorio.bc.ufg.br/tede/bitstreams/c04877d6-d250-4300-950e-3288f06ba59d/download http://repositorio.bc.ufg.br/tede/bitstreams/76dea18d-9e1f-45e0-b60c-83a7e2f120c1/download
bitstream.checksum.fl_str_mv	8a4605be74aa9ea9d79846c1fba20a33 4460e5956bc1d1639be9ae6146a50347 13b790a2df242d1fa7e05a02716b37eb
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5
repository.name.fl_str_mv	Repositório Institucional da UFG - Universidade Federal de Goiás (UFG)
repository.mail.fl_str_mv	tasesdissertacoes.bc@ufg.br
_version_	1815172584902230016