Machine Learning applied to credit risk assessment: Prediction of loan defaults
Main author: | Simão, Sofia Beatriz Santos |
---|---|
Publication date: | 2023 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10362/149818 |
Abstract: | Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science |
id |
RCAP_e0ffc47e83258a0caca6e055e6edc721 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/149818 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Machine Learning applied to credit risk assessment: Prediction of loan defaults
Credit Risk; Machine Learning; Logistic Regression; Ensemble Methods; Loan Defaults
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
Due to the recent financial crisis and the regulatory concerns of Basel II, credit risk assessment is becoming a very important topic in the field of financial risk management. Financial institutions need to take great care when dealing with consumer loans in order to avoid losses and opportunity costs. For this reason, credit scoring systems have been used to make informed decisions on whether to grant credit to the clients who apply for it. To date, several credit scoring models have been proposed, ranging from statistical models to more complex artificial intelligence techniques. However, most previous work has focused on single classifiers. Ensemble learning is a powerful machine learning paradigm that has proven to be of great value in solving a variety of problems. This study compares the performance of the industry standard, logistic regression, with four ensemble methods (AdaBoost, Gradient Boosting, Random Forest and Stacking) in identifying potential loan defaults. All the models were built with a real-world dataset of over one million customers from Lending Club, a financial institution based in the United States. The performance of the models was compared using the hold-out method as the evaluation design and accuracy, AUC, type I error and type II error as evaluation metrics. Experimental results reveal that the ensemble classifiers were able to outperform logistic regression on three key indicators: accuracy, type I error and type II error. AdaBoost performed better than the remaining classifiers when considering a trade-off between all the metrics evaluated. The main contribution of this thesis is an experimental addition to the literature on the preferred models for predicting potential loan defaulters.
Castelli, Mauro; RUN; Simão, Sofia Beatriz Santos
2023-02-28T18:49:41Z; 2023-01-26; 2023-01-26T00:00:00Z
info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf
http://hdl.handle.net/10362/149818; TID:203239067; eng; info:eu-repo/semantics/openAccess
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP
2024-03-11T05:31:42Z; oai:run.unl.pt:10362/149818; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-20T03:53:52.909378
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
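The abstract names accuracy, AUC, type I error and type II error as the metrics computed on the hold-out set, but this record does not define them. The following is a minimal sketch of how such metrics could be computed, not the dissertation's own code. It assumes that label 1 means "default" (the positive class), that type I error is the share of non-defaulters flagged as defaults, that type II error is the share of defaulters that are missed, and that a score of 0.5 or more counts as a predicted default. The function names and the toy data are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) with label 1 (assumed: default) as the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (defaulter, non-defaulter) pairs ranked correctly.

    Ties count as half a correct pair. This pairwise form is quadratic in the
    sample size, which is acceptable for a sketch but not for a million rows.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Compute the four metrics on a hold-out set from predicted default scores."""
    area = auc(y_true, scores)  # also checks that both classes are present
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "auc": area,
        "type_i_error": fp / (fp + tn),   # assumed: non-defaulters flagged as defaults
        "type_ii_error": fn / (fn + tp),  # assumed: defaulters that were missed
    }


# Toy hold-out set (made up for illustration): 3 defaulters, 5 non-defaulters.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
metrics = evaluate(y_true, scores)
print(metrics)
```

Some credit scoring papers use the opposite convention for type I and type II error, so the two keys may need to be swapped to match the definitions in the full text.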
dc.title.none.fl_str_mv |
Machine Learning applied to credit risk assessment: Prediction of loan defaults |
title |
Machine Learning applied to credit risk assessment: Prediction of loan defaults |
spellingShingle |
Machine Learning applied to credit risk assessment: Prediction of loan defaults; Simão, Sofia Beatriz Santos; Credit Risk; Machine Learning; Logistic Regression; Ensemble Methods; Loan Defaults |
author |
Simão, Sofia Beatriz Santos |
author_facet |
Simão, Sofia Beatriz Santos |
author_role |
author |
dc.contributor.none.fl_str_mv |
Castelli, Mauro; RUN |
dc.contributor.author.fl_str_mv |
Simão, Sofia Beatriz Santos |
dc.subject.por.fl_str_mv |
Credit Risk; Machine Learning; Logistic Regression; Ensemble Methods; Loan Defaults |
topic |
Credit Risk; Machine Learning; Logistic Regression; Ensemble Methods; Loan Defaults |
description |
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-02-28T18:49:41Z; 2023-01-26; 2023-01-26T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/149818; TID:203239067 |
url |
http://hdl.handle.net/10362/149818 |
identifier_str_mv |
TID:203239067 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799138128958586880 |