Hybrid Random Forest Survival Model to Predict Customer Membership Dropout

Bibliographic details
Main author: Sobreiro, Pedro
Publication date: 2022
Other authors: Garcia-Alonso, José; Martinho, Domingos; Berrocal, Javier
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10400.26/42054
Abstract: Dropout prediction is a problem that must be addressed in various organizations, as retaining customers is generally more profitable than attracting them. Existing approaches model the problem with a dependent variable representing dropout or non-dropout, without considering the dynamic perspective that dropout risk changes over time. To address this, we explore the use of random survival forests combined with clusters, in order to evaluate whether prediction performance improves. Model performance was determined using the concordance probability, the Brier score and the prediction error, considering 5200 customers of a health club. Our results show that prediction performance increased substantially in the survival models using clusters compared with the model without clusters, with a statistically significant difference between the models. The hybrid approach improved the accuracy of the survival model, providing support for developing countermeasures that take into account the period in which dropout is likely to occur.
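Illustrative note: the abstract names the concordance probability as one of its evaluation metrics. The sketch below shows one common way to compute such a measure (Harrell's concordance index) on right-censored data. It is not taken from the article. The function name `concordance_index`, the argument names, and the toy values are hypothetical, and the article may compute the metric differently.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs that the risk scores order correctly.

    times  -- observed time for each customer (dropout or censoring time)
    events -- 1 if dropout was observed, 0 if the customer was censored
    risks  -- predicted risk score (higher means earlier expected dropout)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times and pairs whose earlier member is censored:
        # their true ordering is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier dropout received the higher risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): months observed, dropout flag, predicted risk.
toy_times = [2, 5, 7, 9]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.7, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of dropout times by predicted risk.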
id RCAP_f815218c907a3bc4f162614c2b30955e
oai_identifier_str oai:comum.rcaap.pt:10400.26/42054
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Hybrid Random Forest Survival Model to Predict Customer Membership Dropout
dc.contributor.none.fl_str_mv Repositório Comum
dc.contributor.author.fl_str_mv Sobreiro, Pedro
Garcia-Alonso, José
Martinho, Domingos
Berrocal, Javier
dc.subject.por.fl_str_mv customer dropout
machine learning
survival analysis
publishDate 2022
dc.date.none.fl_str_mv 2022-10-25T17:06:07Z
2022
2022-01-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.26/42054
url http://hdl.handle.net/10400.26/42054
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 10.3390/electronics11203328
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação