Different Approaches of Machine Learning Models in Credit Risk: A Case Study on Default on Credit Cards
Main author: | Gonsalves, Eduardo Barreto Sulz |
---|---|
Publication date: | 2023 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10362/150753 |
Summary: | Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Risk Analysis and Management |
id |
RCAP_bff392e2e96681251078463d0f5e5790 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/150753 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Credit scoring is a very important process for banks. It allows credit analysts to calculate the probability of a client defaulting on a payment within a specific time horizon. This process helps the bank manage its assets, prepare ahead of time for possible defaults, and decide whether to grant or deny a loan to a new client. Several different machine learning classifiers can be used to calculate the probability of default. Studies have shown that no single model is the best in all circumstances; each model's performance depends on the dataset. In this study, six machine learning models are applied to datasets to classify and predict the clients most likely to default on credit. The models were chosen because they are among the most frequently used techniques in this field and because few studies compare these six models specifically: Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, k-NN and Naïve Bayes. The goal of the comparison is to identify whether one model consistently outperforms the others. Three datasets are used. The first is the German Credit Data, with socioeconomic information about clients requesting a loan. The second is the Credit Card Default Dataset, with historical information about clients' previous credit card invoice payments. Both are from the UCI repository. The third dataset, obtained from Kaggle, concerns credit concession and contains sociodemographic information about the clients.
AUC is the main common metric used to compare the models, followed by the confusion matrix. The random forest model has the highest AUC on all three datasets, while the ranking of the other models varies with the dataset. Finally, the decision tree had a poor AUC, since it does not calculate probabilities, but had one of the best accuracies of all models on two of the three datasets. |
dc.title.none.fl_str_mv |
Different Approaches of Machine Learning Models in Credit Risk: A Case Study on Default on Credit Cards |
author |
Gonsalves, Eduardo Barreto Sulz |
author_facet |
Gonsalves, Eduardo Barreto Sulz |
author_role |
author |
dc.contributor.none.fl_str_mv |
Damásio, Bruno Miguel Pinto RUN |
dc.contributor.author.fl_str_mv |
Gonsalves, Eduardo Barreto Sulz |
dc.subject.por.fl_str_mv |
Credit Scoring; Logistic Regression; Random Forest; Decision Tree; k-NN; SVM; Naïve Bayes |
topic |
Credit Scoring; Logistic Regression; Random Forest; Decision Tree; k-NN; SVM; Naïve Bayes |
description |
Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Risk Analysis and Management |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-03-17T18:04:14Z 2023-01-27 2023-01-27T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/150753 TID:203248937 |
url |
http://hdl.handle.net/10362/150753 |
identifier_str_mv |
TID:203248937 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799138132393721856 |
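
The abstract reports that the decision tree scored poorly on AUC while keeping good accuracy, and attributes this to the model not producing probabilities. The sketch below is one way to see how that can happen. It is not the dissertation's code: the labels and scores are hypothetical toy values, and `auc` and `accuracy` are illustrative helper names. AUC is computed here in its pairwise form, as the share of (defaulter, non-defaulter) pairs in which the defaulter receives the higher score, with ties counted as one half.

```python
def auc(labels, scores):
    """Area under the ROC curve in its pairwise (Mann-Whitney) form.

    Returns the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of examples classified correctly after thresholding the scores."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(int(p == y) for p, y in zip(predicted, labels)) / len(labels)


# Hypothetical toy data (not from the dissertation): 1 = default, 0 = no default.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

# Graded, probability-like scores from an imagined model.
graded = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.40, 0.70, 0.80, 0.90]

# The same predictions reduced to hard 0/1 class labels at a 0.5 cut-off.
hard = [1.0 if s >= 0.5 else 0.0 for s in graded]

auc_graded = auc(labels, graded)   # ranking information intact
auc_hard = auc(labels, hard)       # many ties, so less ranking information
acc_graded = accuracy(labels, graded)
acc_hard = accuracy(labels, hard)

print(f"graded scores: AUC={auc_graded:.3f}, accuracy={acc_graded:.2f}")
print(f"hard labels:   AUC={auc_hard:.3f}, accuracy={acc_hard:.2f}")
```

On this toy data both score sets give the same accuracy, but the hard labels yield a lower AUC because every pair with equal labels becomes a tie. This is consistent with the pattern the abstract describes, though the actual cause in the study's results would need to be checked against the full text.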