Cross-company customer churn prediction in telecommunication: a comparison of data transformation methods
Main Author: | Amin, Adnan |
---|---|
Publication Date: | 2019 |
Other Authors: | Shah, Babar; Khattak, Asad Masood; Moreira, Fernando; Ali, Gohar; Rocha, Álvaro; Anwar, Sajid |
Document Type: | Article |
Language: | eng |
Source Title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full Text: | http://hdl.handle.net/11328/2679 |
Abstract: | Cross-Company Churn Prediction (CCCP) is a research domain in which one company (the target) lacks sufficient data and uses data from another company (the source) to predict customer churn. To support CCCP, the cross-company data is usually transformed to a normal-like distribution similar to that of the target company's data before a CCCP model is built. However, it is still unclear which data transformation method is most effective in CCCP. Also, the impact of data transformation methods on CCCP model performance with different classifiers has not been comprehensively explored in the telecommunication sector. In this study, we devised a model for CCCP using data transformation methods (i.e., log, z-score, rank, and Box-Cox). We presented an extensive comparison to validate the impact of these transformation methods in CCCP, and we also evaluated the performance of the underlying baseline classifiers (i.e., Naive Bayes (NB), K-Nearest Neighbour (KNN), Gradient Boosted Tree (GBT), Single Rule Induction (SRI), and a deep learner neural net (DP)) for customer churn prediction in the telecommunication sector using these transformation methods. We performed experiments on publicly available datasets related to the telecommunication sector. The results demonstrated that most of the data transformation methods (e.g., log, rank, and Box-Cox) improve CCCP performance significantly. However, the z-score method did not achieve better results than the other data transformation methods in this study. Moreover, the CCCP model based on NB performed best on transformed data, while DP, KNN, and GBT showed average performance, and the SRI classifier did not show significant results in terms of the commonly used evaluation measures (i.e., probability of detection, probability of false alarm, area under the curve, and g-mean). |
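
The four transformations named in the abstract, and the g-mean measure, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the `+1` offset in the log transform and the fixed Box-Cox `lam` are assumptions (in practice lambda is usually estimated from the data), and the g-mean formula shown is the common definition based on probability of detection and probability of false alarm, which the record itself does not spell out.

```python
import math
import statistics


def log_transform(xs):
    """Natural log of (1 + x); the +1 offset (an assumption) keeps zeros defined."""
    return [math.log1p(x) for x in xs]


def z_score(xs):
    """Centre on the mean and scale by the population standard deviation.

    Assumes the column is not constant (otherwise the deviation is zero).
    """
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return [(x - mean) / sd for x in xs]


def rank_transform(xs):
    """Replace each value by its 1-based rank; tied values share the mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to xs[order[i]].
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def box_cox(xs, lam):
    """Box-Cox transform for strictly positive values with a fixed lambda."""
    if any(x <= 0 for x in xs):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(x) for x in xs]
    return [(x ** lam - 1) / lam for x in xs]


def g_mean(tp, fn, fp, tn):
    """Geometric mean of detection rate and (1 - false alarm rate)."""
    pd = tp / (tp + fn)  # probability of detection (recall)
    pf = fp / (fp + tn)  # probability of false alarm
    return math.sqrt(pd * (1 - pf))
```

In a cross-company setting, the same transformation would typically be applied to both the source and the target features before a classifier is trained on the source data.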
id |
RCAP_7891d93867c9c95effb51cc3641bb7ad |
---|---|
oai_identifier_str |
oai:repositorio.upt.pt:11328/2679 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
This research was supported by the Cluster Research Projects Activity code # R16086 and R18027, Zayed University, Abu Dhabi, United Arab Emirates. 2023-11-16T02:12:58Z; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T22:41:32.126495; false |
dc.title.none.fl_str_mv |
Cross-company customer churn prediction in telecommunication: a comparison of data transformation methods |
title |
Cross-company customer churn prediction in telecommunication: a comparison of data transformation methods |
spellingShingle |
Cross-company customer churn prediction in telecommunication: a comparison of data transformation methods Amin, Adnan Churn prediction Cross-company Data transformation Box-cox Rank Log Z-Score |
title_short |
Cross-company customer churn prediction in telecommunication: a comparison of data transformation methods |
title_full |
Cross-company customer churn prediction in telecommunication: a comparison of data transformation methods |
title_fullStr |
Cross-company customer churn prediction in telecommunication: a comparison of data transformation methods |
title_full_unstemmed |
Cross-company customer churn prediction in telecommunication: a comparison of data transformation methods |
title_sort |
Cross-company customer churn prediction in telecommunication: a comparison of data transformation methods |
author |
Amin, Adnan |
author_facet |
Amin, Adnan Shah, Babar Khattak, Asad Masood Moreira, Fernando Ali, Gohar Rocha, Álvaro Anwar, Sajid |
author_role |
author |
author2 |
Shah, Babar Khattak, Asad Masood Moreira, Fernando Ali, Gohar Rocha, Álvaro Anwar, Sajid |
author2_role |
author author author author author author |
dc.contributor.author.fl_str_mv |
Amin, Adnan Shah, Babar Khattak, Asad Masood Moreira, Fernando Ali, Gohar Rocha, Álvaro Anwar, Sajid |
dc.subject.por.fl_str_mv |
Churn prediction Cross-company Data transformation Box-cox Rank Log Z-Score |
topic |
Churn prediction Cross-company Data transformation Box-cox Rank Log Z-Score |
description |
Cross-Company Churn Prediction (CCCP) is a research domain in which one company (the target) lacks sufficient data and uses data from another company (the source) to predict customer churn. To support CCCP, the cross-company data is usually transformed to a normal-like distribution similar to that of the target company's data before a CCCP model is built. However, it is still unclear which data transformation method is most effective in CCCP. Also, the impact of data transformation methods on CCCP model performance with different classifiers has not been comprehensively explored in the telecommunication sector. In this study, we devised a model for CCCP using data transformation methods (i.e., log, z-score, rank, and Box-Cox). We presented an extensive comparison to validate the impact of these transformation methods in CCCP, and we also evaluated the performance of the underlying baseline classifiers (i.e., Naive Bayes (NB), K-Nearest Neighbour (KNN), Gradient Boosted Tree (GBT), Single Rule Induction (SRI), and a deep learner neural net (DP)) for customer churn prediction in the telecommunication sector using these transformation methods. We performed experiments on publicly available datasets related to the telecommunication sector. The results demonstrated that most of the data transformation methods (e.g., log, rank, and Box-Cox) improve CCCP performance significantly. However, the z-score method did not achieve better results than the other data transformation methods in this study. Moreover, the CCCP model based on NB performed best on transformed data, while DP, KNN, and GBT showed average performance, and the SRI classifier did not show significant results in terms of the commonly used evaluation measures (i.e., probability of detection, probability of false alarm, area under the curve, and g-mean). |
publishDate |
2019 |
dc.date.none.fl_str_mv |
2019-05-10T10:19:37Z 2019-05-10 2019-06-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
Amin, A., Shah, B., Khattak, A. M., Moreira, F., Ali, G., Rocha, Á., … Anwar, S. (2019). Cross-company customer churn prediction in telecommunication: A comparison of data transformation methods. International Journal of Information Management, 46, 304–319. Available at Repositório UPT, http://hdl.handle.net/11328/2679 |
identifier_str_mv |
Amin, A., Shah, B., Khattak, A. M., Moreira, F., Ali, G., Rocha, Á., … Anwar, S. (2019). Cross-company customer churn prediction in telecommunication: A comparison of data transformation methods. International Journal of Information Management, 46, 304–319. Available at Repositório UPT, http://hdl.handle.net/11328/2679 |
url |
http://hdl.handle.net/11328/2679 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
https://www.sciencedirect.com/science/article/pii/S0268401218305930 |
dc.rights.driver.fl_str_mv |
http://creativecommons.org/licenses/by/4.0/ info:eu-repo/semantics/embargoedAccess |
rights_invalid_str_mv |
http://creativecommons.org/licenses/by/4.0/ |
eu_rights_str_mv |
embargoedAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Elsevier |
publisher.none.fl_str_mv |
Elsevier |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799134978950299648 |