A framework to improve churn prediction performance in retail banking

Bibliographic details
Main author: Brito, João Batista Gonçalves de
Publication date: 2024
Other authors: Bucco, Guilherme Brandelli; Silveira, Rodrigo Heldt; Becker, Joao Luiz; Silveira, Cleo Schmitt; Luce, Fernando Bins; Anzanello, Michel José
Document type: Article
Language: eng
Source title: Repositório Institucional da UFRGS
Full text: http://hdl.handle.net/10183/274953
Abstract: Managing customer retention is critical to a company’s profitability and firm value. However, predicting customer churn is challenging. The extant research on the topic mainly focuses on the type of model developed to predict churn, devoting little or no effort to data preparation methods. These methods directly impact the identification of patterns, increasing the model’s predictive performance. We addressed this problem by (1) employing feature engineering methods to generate a set of potential predictor features suitable for the banking industry and (2) preprocessing the majority and minority classes to improve the learning of the classification model pattern. The framework encompasses state-of-the-art data preprocessing methods: (1) feature engineering with recency, frequency, and monetary value concepts to address the imbalanced dataset issue, (2) oversampling using the adaptive synthetic sampling algorithm, and (3) undersampling using the NearMiss algorithm. After data preprocessing, we use XGBoost and elastic net methods for churn prediction. We validated the proposed framework with a dataset of more than 3 million customers and about 170 million transactions. The framework outperformed alternative methods reported in the literature in terms of precision-recall area under curve, accuracy, recall, and specificity. From a practical perspective, the framework provides managers with valuable information to predict customer churn and develop strategies for customer retention in the banking industry.
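Illustrative note: the recency, frequency, and monetary value (RFM) concepts named in the abstract can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `rfm_features`, the tuple layout of a transaction, the snapshot date, and the toy data are all hypothetical, and the article's actual feature set is likely richer.

```python
from collections import defaultdict
from datetime import date


def rfm_features(transactions, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, transaction_date, amount)
    snapshot: reference date on which the features are computed
    (Hypothetical layout; the article's schema is not given in this record.)
    """
    last_seen = {}
    count = defaultdict(int)
    total = defaultdict(float)
    for customer_id, tx_date, amount in transactions:
        if tx_date > snapshot:
            continue  # skip transactions after the snapshot to avoid leakage
        if customer_id not in last_seen or tx_date > last_seen[customer_id]:
            last_seen[customer_id] = tx_date
        count[customer_id] += 1
        total[customer_id] += amount
    return {
        cid: ((snapshot - last_seen[cid]).days, count[cid], total[cid])
        for cid in last_seen
    }


# Hypothetical toy data, not taken from the article's dataset.
toy = [
    ("a", date(2024, 1, 5), 100.0),
    ("a", date(2024, 1, 20), 50.0),
    ("b", date(2023, 12, 1), 20.0),
]
features = rfm_features(toy, snapshot=date(2024, 1, 31))
```

For the resampling steps, the imbalanced-learn package provides `imblearn.over_sampling.ADASYN` and `imblearn.under_sampling.NearMiss`; whether the authors used that package is not stated in this record.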
id UFRGS-2_673cc9ded2e15a33bcefd31fa357b737
oai_identifier_str oai:www.lume.ufrgs.br:10183/274953
network_acronym_str UFRGS-2
network_name_str Repositório Institucional da UFRGS
repository_id_str
dc.title.pt_BR.fl_str_mv A framework to improve churn prediction performance in retail banking
title A framework to improve churn prediction performance in retail banking
author Brito, João Batista Gonçalves de
author_facet Brito, João Batista Gonçalves de
Bucco, Guilherme Brandelli
Silveira, Rodrigo Heldt
Becker, Joao Luiz
Silveira, Cleo Schmitt
Luce, Fernando Bins
Anzanello, Michel José
author_role author
author2 Bucco, Guilherme Brandelli
Silveira, Rodrigo Heldt
Becker, Joao Luiz
Silveira, Cleo Schmitt
Luce, Fernando Bins
Anzanello, Michel José
author2_role author
author
author
author
author
author
dc.contributor.author.fl_str_mv Brito, João Batista Gonçalves de
Bucco, Guilherme Brandelli
Silveira, Rodrigo Heldt
Becker, Joao Luiz
Silveira, Cleo Schmitt
Luce, Fernando Bins
Anzanello, Michel José
dc.subject.por.fl_str_mv Modelagem de dados
Processamento de dados
Previsão
Setor bancário
Retenção de clientes
Marketing de relacionamento
topic Modelagem de dados
Processamento de dados
Previsão
Setor bancário
Retenção de clientes
Marketing de relacionamento
Customer churn prediction
Imbalanced dataset treatment
Feature engineering
dc.subject.eng.fl_str_mv Customer churn prediction
Imbalanced dataset treatment
Feature engineering
publishDate 2024
dc.date.accessioned.fl_str_mv 2024-04-19T06:13:08Z
dc.date.issued.fl_str_mv 2024
dc.type.driver.fl_str_mv Estrangeiro
info:eu-repo/semantics/article
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10183/274953
dc.identifier.issn.pt_BR.fl_str_mv 2199-4730
dc.identifier.nrb.pt_BR.fl_str_mv 001195377
identifier_str_mv 2199-4730
001195377
url http://hdl.handle.net/10183/274953
dc.language.iso.fl_str_mv eng
language eng
dc.relation.ispartof.pt_BR.fl_str_mv Financial innovation. [Heidelberg]. Vol. 10 (2024), [Art.] 17, 29 p.
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFRGS
instname:Universidade Federal do Rio Grande do Sul (UFRGS)
instacron:UFRGS
instname_str Universidade Federal do Rio Grande do Sul (UFRGS)
instacron_str UFRGS
institution UFRGS
reponame_str Repositório Institucional da UFRGS
collection Repositório Institucional da UFRGS
bitstream.url.fl_str_mv http://www.lume.ufrgs.br/bitstream/10183/274953/2/001195377.pdf.txt
http://www.lume.ufrgs.br/bitstream/10183/274953/1/001195377.pdf
bitstream.checksum.fl_str_mv 4cabfc12936fac4932b4a5d7f00df174
59975eb2cb8bac4c9dec007268f599c6
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFRGS - Universidade Federal do Rio Grande do Sul (UFRGS)
repository.mail.fl_str_mv
_version_ 1801225116496953344