Hybrid Random Forest Survival Model to Predict Customer Membership Dropout
Main author: | Sobreiro, Pedro |
---|---|
Publication date: | 2022 |
Other authors: | Garcia-Alonso, José; Martinho, Domingos; Berrocal, Javier |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10400.26/42054 |
Abstract: | Dropout prediction is a problem that must be addressed in various organizations, as retaining customers is generally more profitable than attracting them. Existing approaches model the problem with a dependent variable representing dropout or non-dropout, without considering the dynamic perspective that dropout risk changes over time. To address this, we explore the use of random survival forests combined with clusters, in order to evaluate whether prediction performance improves. Model performance was assessed using the concordance probability, the Brier score and the prediction error, on data from 5200 customers of a health club. Our results show that prediction performance increased substantially in the survival models using clusters compared with the model without clusters, with a statistically significant difference between the models. The hybrid approach improved the accuracy of the survival model, providing support for developing countermeasures that take into account the period in which dropout is likely to occur. |
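
The abstract names the concordance probability as one of its evaluation measures. As a rough illustration of what such a measure captures, the following minimal sketch computes Harrell's concordance index in plain Python. It is an assumption that this particular estimator matches the one used in the article; the function name `concordance_index`, the input layout and the toy data are hypothetical, and pairs with tied observation times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable customer pairs in which the customer
    who dropped out earlier was given the higher predicted risk.

    times        -- observed membership durations
    events       -- True if dropout was observed, False if censored
    risk_scores  -- model output, higher means earlier expected dropout
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): tied times are ignored. A pair is
        # comparable only if the shorter time is an observed dropout.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four customers, the third one is censored.
times = [2, 5, 7, 10]
events = [True, True, False, True]
risk = [0.9, 0.6, 0.7, 0.1]
c_index = concordance_index(times, events, risk)
```

A value of 1.0 means every comparable pair is ranked correctly and 0.5 corresponds to random ranking. In a cluster-based setup like the one the abstract describes, the same function could be applied to the predictions of each model to compare them.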
id |
RCAP_f815218c907a3bc4f162614c2b30955e |
---|---|
oai_identifier_str |
oai:comum.rcaap.pt:10400.26/42054 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Hybrid Random Forest Survival Model to Predict Customer Membership Dropout; customer dropout; machine learning; survival analysis; Dropout prediction is a problem that must be addressed in various organizations, as retaining customers is generally more profitable than attracting them. Existing approaches model the problem with a dependent variable representing dropout or non-dropout, without considering the dynamic perspective that dropout risk changes over time. To address this, we explore the use of random survival forests combined with clusters, in order to evaluate whether prediction performance improves. Model performance was assessed using the concordance probability, the Brier score and the prediction error, on data from 5200 customers of a health club. Our results show that prediction performance increased substantially in the survival models using clusters compared with the model without clusters, with a statistically significant difference between the models. The hybrid approach improved the accuracy of the survival model, providing support for developing countermeasures that take into account the period in which dropout is likely to occur.; Repositório Comum; Sobreiro, Pedro; Garcia-Alonso, José; Martinho, Domingos; Berrocal, Javier; 2022-10-25T17:06:07Z; 2022; 2022-01-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; application/pdf; http://hdl.handle.net/10400.26/42054; eng; 10.3390/electronics11203328; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2022-12-23T12:15:14Z; oai:comum.rcaap.pt:10400.26/42054; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T16:14:08.397647; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Hybrid Random Forest Survival Model to Predict Customer Membership Dropout |
title |
Hybrid Random Forest Survival Model to Predict Customer Membership Dropout |
spellingShingle |
Hybrid Random Forest Survival Model to Predict Customer Membership Dropout; Sobreiro, Pedro; customer dropout; machine learning; survival analysis |
title_short |
Hybrid Random Forest Survival Model to Predict Customer Membership Dropout |
title_full |
Hybrid Random Forest Survival Model to Predict Customer Membership Dropout |
title_fullStr |
Hybrid Random Forest Survival Model to Predict Customer Membership Dropout |
title_full_unstemmed |
Hybrid Random Forest Survival Model to Predict Customer Membership Dropout |
title_sort |
Hybrid Random Forest Survival Model to Predict Customer Membership Dropout |
author |
Sobreiro, Pedro |
author_facet |
Sobreiro, Pedro; Garcia-Alonso, José; Martinho, Domingos; Berrocal, Javier |
author_role |
author |
author2 |
Garcia-Alonso, José; Martinho, Domingos; Berrocal, Javier |
author2_role |
author author author |
dc.contributor.none.fl_str_mv |
Repositório Comum |
dc.contributor.author.fl_str_mv |
Sobreiro, Pedro; Garcia-Alonso, José; Martinho, Domingos; Berrocal, Javier |
dc.subject.por.fl_str_mv |
customer dropout; machine learning; survival analysis |
topic |
customer dropout; machine learning; survival analysis |
description |
Dropout prediction is a problem that must be addressed in various organizations, as retaining customers is generally more profitable than attracting them. Existing approaches model the problem with a dependent variable representing dropout or non-dropout, without considering the dynamic perspective that dropout risk changes over time. To address this, we explore the use of random survival forests combined with clusters, in order to evaluate whether prediction performance improves. Model performance was assessed using the concordance probability, the Brier score and the prediction error, on data from 5200 customers of a health club. Our results show that prediction performance increased substantially in the survival models using clusters compared with the model without clusters, with a statistically significant difference between the models. The hybrid approach improved the accuracy of the survival model, providing support for developing countermeasures that take into account the period in which dropout is likely to occur. |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-10-25T17:06:07Z; 2022; 2022-01-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.26/42054 |
url |
http://hdl.handle.net/10400.26/42054 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
10.3390/electronics11203328 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799130594950512640 |