Improving the accuracy of predicting bank depositor' behavior using decision tree

Detalhes bibliográficos
Autor(a) principal: Safarkhani, F.
Data de Publicação: 2021
Outros Autores: Moro, S.
Tipo de documento: Artigo
Idioma: eng
Título da fonte: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Texto Completo: http://hdl.handle.net/10071/23280
Resumo: Telemarketing is a widely adopted direct marketing technique in banks. Since customers hardly respond positively, data prediction models can help in selecting the most likely prospective customers. We aim to develop a classifier accuracy to predict which customer will subscribe to a long-term deposit proposed by a bank. Accordingly, this paper focuses on a combination of resampling, in order to reduce the imbalanced data, using feature selection, to reduce the complexity of data computing and dimension reduction of inefficiency data modeling. The performed operation has shown an improvement in the performance of the classification algorithm in terms of accuracy. The experimental results were run on a real bank dataset and the J48 decision tree achieved 94.39% accuracy prediction, with 0.975 sensitivity and 0.709 specificity, showing better results when compared to other approaches reported in the existing literature, such as logistic regression (91.79 accuracy; 0.975 sensitivity; 0.495 specificity) and Naive Bayes classifier (90.82% accuracy; 0.961 sensitivity; 0.507 specificity). Furthermore, our resampling and feature selection approach resulted in improved accuracy (94.39%) when compared to a state-of-the-art approach based on a fuzzy algorithm (92.89%).
id RCAP_c70895174d6717dd9c8d803ff80e2402
oai_identifier_str oai:repositorio.iscte-iul.pt:10071/23280
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Improving the accuracy of predicting bank depositor' behavior using decision treeMachine learningData miningArtificial intelligenceTelemarketing is a widely adopted direct marketing technique in banks. Since customers hardly respond positively, data prediction models can help in selecting the most likely prospective customers. We aim to develop a classifier accuracy to predict which customer will subscribe to a long-term deposit proposed by a bank. Accordingly, this paper focuses on a combination of resampling, in order to reduce the imbalanced data, using feature selection, to reduce the complexity of data computing and dimension reduction of inefficiency data modeling. The performed operation has shown an improvement in the performance of the classification algorithm in terms of accuracy. The experimental results were run on a real bank dataset and the J48 decision tree achieved 94.39% accuracy prediction, with 0.975 sensitivity and 0.709 specificity, showing better results when compared to other approaches reported in the existing literature, such as logistic regression (91.79 accuracy; 0.975 sensitivity; 0.495 specificity) and Naive Bayes classifier (90.82% accuracy; 0.961 sensitivity; 0.507 specificity). Furthermore, our resampling and feature selection approach resulted in improved accuracy (94.39%) when compared to a state-of-the-art approach based on a fuzzy algorithm (92.89%).MDPI2021-10-06T08:49:24Z2021-01-01T00:00:00Z20212021-10-06T09:48:46Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articleapplication/pdfhttp://hdl.handle.net/10071/23280eng2076-341710.3390/app11199016Safarkhani, F.Moro, S.info:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2023-11-09T17:55:59Zoai:repositorio.iscte-iul.pt:10071/23280Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-19T22:28:39.017395Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse
dc.title.none.fl_str_mv Improving the accuracy of predicting bank depositor' behavior using decision tree
title Improving the accuracy of predicting bank depositor' behavior using decision tree
spellingShingle Improving the accuracy of predicting bank depositor' behavior using decision tree
Safarkhani, F.
Machine learning
Data mining
Artificial intelligence
title_short Improving the accuracy of predicting bank depositor' behavior using decision tree
title_full Improving the accuracy of predicting bank depositor' behavior using decision tree
title_fullStr Improving the accuracy of predicting bank depositor' behavior using decision tree
title_full_unstemmed Improving the accuracy of predicting bank depositor' behavior using decision tree
title_sort Improving the accuracy of predicting bank depositor' behavior using decision tree
author Safarkhani, F.
author_facet Safarkhani, F.
Moro, S.
author_role author
author2 Moro, S.
author2_role author
dc.contributor.author.fl_str_mv Safarkhani, F.
Moro, S.
dc.subject.por.fl_str_mv Machine learning
Data mining
Artificial intelligence
topic Machine learning
Data mining
Artificial intelligence
description Telemarketing is a widely adopted direct marketing technique in banks. Since customers hardly respond positively, data prediction models can help in selecting the most likely prospective customers. We aim to develop a classifier accuracy to predict which customer will subscribe to a long-term deposit proposed by a bank. Accordingly, this paper focuses on a combination of resampling, in order to reduce the imbalanced data, using feature selection, to reduce the complexity of data computing and dimension reduction of inefficiency data modeling. The performed operation has shown an improvement in the performance of the classification algorithm in terms of accuracy. The experimental results were run on a real bank dataset and the J48 decision tree achieved 94.39% accuracy prediction, with 0.975 sensitivity and 0.709 specificity, showing better results when compared to other approaches reported in the existing literature, such as logistic regression (91.79 accuracy; 0.975 sensitivity; 0.495 specificity) and Naive Bayes classifier (90.82% accuracy; 0.961 sensitivity; 0.507 specificity). Furthermore, our resampling and feature selection approach resulted in improved accuracy (94.39%) when compared to a state-of-the-art approach based on a fuzzy algorithm (92.89%).
publishDate 2021
dc.date.none.fl_str_mv 2021-10-06T08:49:24Z
2021-01-01T00:00:00Z
2021
2021-10-06T09:48:46Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10071/23280
url http://hdl.handle.net/10071/23280
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 2076-3417
10.3390/app11199016
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv MDPI
publisher.none.fl_str_mv MDPI
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799134848789512192