Different Approaches of Machine Learning Models in Credit Risk: A Case Study on Default on Credit Cards

Bibliographic details
Main author: Gonsalves, Eduardo Barreto Sulz
Publication date: 2023
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10362/150753
Abstract: Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Risk Analysis and Management
id RCAP_bff392e2e96681251078463d0f5e5790
oai_identifier_str oai:run.unl.pt:10362/150753
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Different Approaches of Machine Learning Models in Credit Risk: A Case Study on Default on Credit Cards
Credit Scoring
Logistic Regression
Random Forest
Decision Tree
k-NN
SVM
Naïve Bayes
Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Risk Analysis and Management
Credit scoring is a very important process for banks. It allows credit analysts to calculate the probability of a client defaulting on a payment within a specific time horizon. This process helps the bank manage its assets, prepare ahead of time for possible defaults, and decide whether to grant or deny a loan to a new client. Several machine learning classifiers can be used to calculate the probability of default. Studies have shown that no single model is best in all circumstances; each model's performance depends on the dataset. In this study, six machine learning models are applied to datasets to classify and predict the clients most likely to default on credit. The models compared were chosen because they are among the most frequently used techniques in this field and because of the lack of studies comparing these six models specifically, namely Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, k-NN and Naïve Bayes. The goal of this comparison is to identify whether any model consistently outperforms the others. Three datasets are used. The first is the German Credit Data, with socioeconomic information on clients requesting a loan. The second is the Credit Card Default Dataset, with historical information on clients' previous credit card invoice payments. Both datasets are from the UCI repository. The last dataset, obtained from Kaggle, concerns credit concession and contains sociodemographic information about the clients.
To compare the models, AUC is the main common metric, followed by the confusion matrix. After analysis, the random forest model presents the highest AUC for all datasets; the other models change position in the ranking depending on the dataset. Finally, the decision tree presented a poor AUC, since it does not calculate probabilities, but had one of the best accuracies of all models for two of the three datasets.
Damásio, Bruno Miguel Pinto
RUN
Gonsalves, Eduardo Barreto Sulz
2023-03-17T18:04:14Z
2023-01-27
2023-01-27T00:00:00Z
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/masterThesis
application/pdf
http://hdl.handle.net/10362/150753
TID:203248937
eng
info:eu-repo/semantics/openAccess
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
2024-03-11T05:33:10Z
oai:run.unl.pt:10362/150753
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
opendoar:7160
2024-03-20T03:54:18.586335
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
false
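The abstract in this record names AUC as the main comparison metric, followed by the confusion matrix, and remarks that a model which outputs class labels rather than probabilities scores poorly on AUC while still reaching good accuracy. The following is a minimal sketch of both metrics in plain Python, not the dissertation's code. The function names `auc` and `confusion_matrix` and the ten-row toy data are assumptions made up for illustration; they do not come from any of the three datasets.

```python
from itertools import product


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def confusion_matrix(y_true, y_pred):
    """Counts of true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for y, p in pairs if y == 1 and p == 1),
        "fp": sum(1 for y, p in pairs if y == 0 and p == 1),
        "fn": sum(1 for y, p in pairs if y == 1 and p == 0),
        "tn": sum(1 for y, p in pairs if y == 0 and p == 0),
    }


# Hypothetical toy data: 1 = default, 0 = no default.
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
# Graded scores, as a probabilistic classifier might output.
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.9, 0.05, 0.8, 0.35]
# Hard 0/1 labels obtained by thresholding the same scores at 0.5.
hard_labels = [1 if s >= 0.5 else 0 for s in scores]

cm = confusion_matrix(y_true, hard_labels)
accuracy = (cm["tp"] + cm["tn"]) / len(y_true)

print("AUC from graded scores:", auc(y_true, scores))
print("AUC from hard labels:  ", auc(y_true, hard_labels))
print("Confusion matrix:      ", cm)
print("Accuracy:              ", accuracy)
```

On this toy data the graded scores rank 23 of the 24 positive/negative pairs correctly, while the hard labels earn credit for only 19 of 24 because most pairs become ties. Accuracy is computed from the same hard labels and is unaffected, which is one possible reading of how a label-only model can show a weak AUC alongside good accuracy.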
dc.title.none.fl_str_mv Different Approaches of Machine Learning Models in Credit Risk: A Case Study on Default on Credit Cards
title Different Approaches of Machine Learning Models in Credit Risk: A Case Study on Default on Credit Cards
spellingShingle Different Approaches of Machine Learning Models in Credit Risk: A Case Study on Default on Credit Cards
Gonsalves, Eduardo Barreto Sulz
Credit Scoring
Logistic Regression
Random Forest
Decision Tree
k-NN
SVM
Naïve Bayes
author Gonsalves, Eduardo Barreto Sulz
author_facet Gonsalves, Eduardo Barreto Sulz
author_role author
dc.contributor.none.fl_str_mv Damásio, Bruno Miguel Pinto
RUN
dc.contributor.author.fl_str_mv Gonsalves, Eduardo Barreto Sulz
dc.subject.por.fl_str_mv Credit Scoring
Logistic Regression
Random Forest
Decision Tree
k-NN
SVM
Naïve Bayes
topic Credit Scoring
Logistic Regression
Random Forest
Decision Tree
k-NN
SVM
Naïve Bayes
description Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Risk Analysis and Management
publishDate 2023
dc.date.none.fl_str_mv 2023-03-17T18:04:14Z
2023-01-27
2023-01-27T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/150753
TID:203248937
url http://hdl.handle.net/10362/150753
identifier_str_mv TID:203248937
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799138132393721856