A framework to improve churn prediction performance in retail banking
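Main author: | Brito, João Batista Gonçalves de |
---|---|
Publication date: | 2024 |
Other authors: | Bucco, Guilherme Brandelli; Silveira, Rodrigo Heldt; Becker, Joao Luiz; Silveira, Cleo Schmitt; Luce, Fernando Bins; Anzanello, Michel José |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da UFRGS |
Full text: | http://hdl.handle.net/10183/274953 |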
Abstract: | Managing customer retention is critical to a company’s profitability and firm value. However, predicting customer churn is challenging. The extant research on the topic mainly focuses on the type of model developed to predict churn, devoting little or no effort to data preparation methods. These methods directly impact the identification of patterns, increasing the model’s predictive performance. We addressed this problem by (1) employing feature engineering methods to generate a set of potential predictor features suitable for the banking industry and (2) preprocessing the majority and minority classes to improve the learning of the classification model pattern. The framework encompasses state-of-the-art data preprocessing methods: (1) feature engineering with recency, frequency, and monetary value concepts to address the imbalanced dataset issue, (2) oversampling using the adaptive synthetic sampling algorithm, and (3) undersampling using the NearMiss algorithm. After data preprocessing, we use XGBoost and elastic net methods for churn prediction. We validated the proposed framework with a dataset of more than 3 million customers and about 170 million transactions. The framework outperformed alternative methods reported in the literature in terms of precision-recall area under curve, accuracy, recall, and specificity. From a practical perspective, the framework provides managers with valuable information to predict customer churn and develop strategies for customer retention in the banking industry. |
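The abstract names recency, frequency, and monetary value (RFM) as the basis for its feature engineering but does not define the features in this record. The following is a minimal sketch of the general RFM idea only, not the authors' implementation. The function name `rfm_features`, the tuple layout of a transaction, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def rfm_features(transactions, reference_date):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, transaction_date, amount) tuples.
    This layout is an assumption; the record does not describe the data schema.
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, tx_date, amount in transactions:
        if tx_date > reference_date:
            continue  # ignore transactions after the observation cutoff
        frequency[customer_id] += 1          # number of transactions
        monetary[customer_id] += amount      # total amount transacted
        if customer_id not in last_seen or tx_date > last_seen[customer_id]:
            last_seen[customer_id] = tx_date
    return {
        cid: ((reference_date - last_seen[cid]).days, frequency[cid], monetary[cid])
        for cid in last_seen
    }


# Hypothetical toy data, not taken from the paper's dataset.
transactions = [
    ("c1", date(2024, 1, 5), 120.0),
    ("c1", date(2024, 3, 1), 80.0),
    ("c2", date(2023, 11, 20), 40.0),
]
features = rfm_features(transactions, reference_date=date(2024, 3, 31))
```

In a pipeline like the one the abstract outlines, per-customer tuples of this kind would be computed before any class rebalancing and then passed to the classifier as predictor columns.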
id |
UFRGS-2_673cc9ded2e15a33bcefd31fa357b737 |
---|---|
oai_identifier_str |
oai:www.lume.ufrgs.br:10183/274953 |
network_acronym_str |
UFRGS-2 |
network_name_str |
Repositório Institucional da UFRGS |
repository_id_str |
|
dc.title.pt_BR.fl_str_mv |
A framework to improve churn prediction performance in retail banking |
title |
A framework to improve churn prediction performance in retail banking |
author |
Brito, João Batista Gonçalves de |
author_facet |
Brito, João Batista Gonçalves de; Bucco, Guilherme Brandelli; Silveira, Rodrigo Heldt; Becker, Joao Luiz; Silveira, Cleo Schmitt; Luce, Fernando Bins; Anzanello, Michel José |
author_role |
author |
author2 |
Bucco, Guilherme Brandelli; Silveira, Rodrigo Heldt; Becker, Joao Luiz; Silveira, Cleo Schmitt; Luce, Fernando Bins; Anzanello, Michel José |
author2_role |
author author author author author author |
dc.contributor.author.fl_str_mv |
Brito, João Batista Gonçalves de; Bucco, Guilherme Brandelli; Silveira, Rodrigo Heldt; Becker, Joao Luiz; Silveira, Cleo Schmitt; Luce, Fernando Bins; Anzanello, Michel José |
dc.subject.por.fl_str_mv |
Modelagem de dados; Processamento de dados; Previsão; Setor bancário; Retenção de clientes; Marketing de relacionamento |
topic |
Modelagem de dados; Processamento de dados; Previsão; Setor bancário; Retenção de clientes; Marketing de relacionamento; Customer churn prediction; Imbalanced dataset treatment; Feature engineering |
dc.subject.eng.fl_str_mv |
Customer churn prediction; Imbalanced dataset treatment; Feature engineering |
description |
Managing customer retention is critical to a company’s profitability and firm value. However, predicting customer churn is challenging. The extant research on the topic mainly focuses on the type of model developed to predict churn, devoting little or no effort to data preparation methods. These methods directly impact the identification of patterns, increasing the model’s predictive performance. We addressed this problem by (1) employing feature engineering methods to generate a set of potential predictor features suitable for the banking industry and (2) preprocessing the majority and minority classes to improve the learning of the classification model pattern. The framework encompasses state-of-the-art data preprocessing methods: (1) feature engineering with recency, frequency, and monetary value concepts to address the imbalanced dataset issue, (2) oversampling using the adaptive synthetic sampling algorithm, and (3) undersampling using the NearMiss algorithm. After data preprocessing, we use XGBoost and elastic net methods for churn prediction. We validated the proposed framework with a dataset of more than 3 million customers and about 170 million transactions. The framework outperformed alternative methods reported in the literature in terms of precision-recall area under curve, accuracy, recall, and specificity. From a practical perspective, the framework provides managers with valuable information to predict customer churn and develop strategies for customer retention in the banking industry. |
publishDate |
2024 |
dc.date.accessioned.fl_str_mv |
2024-04-19T06:13:08Z |
dc.date.issued.fl_str_mv |
2024 |
dc.type.driver.fl_str_mv |
Estrangeiro info:eu-repo/semantics/article |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10183/274953 |
dc.identifier.issn.pt_BR.fl_str_mv |
2199-4730 |
dc.identifier.nrb.pt_BR.fl_str_mv |
001195377 |
identifier_str_mv |
2199-4730 001195377 |
url |
http://hdl.handle.net/10183/274953 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.ispartof.pt_BR.fl_str_mv |
Financial innovation. [Heidelberg]. Vol. 10 (2024), [Art.] 17, 29 p. |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFRGS instname:Universidade Federal do Rio Grande do Sul (UFRGS) instacron:UFRGS |
instname_str |
Universidade Federal do Rio Grande do Sul (UFRGS) |
instacron_str |
UFRGS |
institution |
UFRGS |
reponame_str |
Repositório Institucional da UFRGS |
collection |
Repositório Institucional da UFRGS |
bitstream.url.fl_str_mv |
http://www.lume.ufrgs.br/bitstream/10183/274953/2/001195377.pdf.txt; http://www.lume.ufrgs.br/bitstream/10183/274953/1/001195377.pdf |
bitstream.checksum.fl_str_mv |
4cabfc12936fac4932b4a5d7f00df174; 59975eb2cb8bac4c9dec007268f599c6 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFRGS - Universidade Federal do Rio Grande do Sul (UFRGS) |
repository.mail.fl_str_mv |
|
_version_ |
1801225116496953344 |