Machine learning: Challenges and opportunities on credit risk
Main author: | Costa, Patrícia Alexandra Guerreiro |
---|---|
Publication date: | 2022 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10071/27558 |
Abstract: | The constant challenge of anticipating borrower default has led financial institutions to develop techniques and models to improve their credit risk monitoring and to predict how likely certain customers are to default on a loan, as well as how likely others are to meet their financial obligations. It is therefore worth investigating how financial institutions can anticipate default using Machine Learning algorithms. This dissertation aims to demonstrate the power of Machine Learning algorithms in credit risk analysis, focusing on building and training the models and testing them on the data, and to present the opportunities and challenges of Machine Learning that remain open for future studies. For this purpose, we present two Machine Learning classification algorithms: Decision Trees and Logistic Regression. We also present numerical results from various comparisons of these algorithms, which were programmed and run in Python using the Jupyter Notebook application. The initial sample, consisting of 850 observations, contains credit details about borrowers in the United States of America and is freely available. To compare the performance of Logistic Regression and Decision Trees, we used measures such as AUC, precision, and F1-score. |
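The abstract above names AUC, precision, and F1-score as the measures used to compare the two classifiers. As a minimal sketch of what those measures compute, the following dependency-free Python defines them for a binary default/no-default problem. It is not the dissertation's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def precision_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and F1-score for the positive class (1 = default)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, f1


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (defaulter, non-defaulter) pairs in which the defaulter receives
    the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = borrower defaulted, 0 = borrower repaid.
y_true = [0, 0, 1, 1, 0, 1]
# Hypothetical predicted default probabilities from some classifier.
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
# Assumed decision threshold of 0.5 to turn probabilities into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, f1 = precision_and_f1(y_true, y_pred)
area = auc(y_true, scores)
print(f"precision={precision:.3f} f1={f1:.3f} auc={area:.3f}")
```

Precision and F1 depend on the chosen threshold, whereas AUC is computed from the raw scores alone, which is one reason the three are often reported together when comparing a Logistic Regression with a Decision Tree.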
id |
RCAP_1e2ec9e3a2da32946ab1796ff9d8427a |
---|---|
oai_identifier_str |
oai:repositorio.iscte-iul.pt:10071/27558 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Machine learning: Challenges and opportunities on credit risk Risco de crédito -- Credit risk Data mining -- Machine learning Linear and logistic regression Árvore de decisão -- Decision tree Random forest Regressão linear e logística Floresta aleatória The constant challenge of anticipating borrower default has led financial institutions to develop techniques and models to improve their credit risk monitoring and to predict how likely certain customers are to default on a loan, as well as how likely others are to meet their financial obligations. It is therefore worth investigating how financial institutions can anticipate default using Machine Learning algorithms. This dissertation aims to demonstrate the power of Machine Learning algorithms in credit risk analysis, focusing on building and training the models and testing them on the data, and to present the opportunities and challenges of Machine Learning that remain open for future studies. For this purpose, we present two Machine Learning classification algorithms: Decision Trees and Logistic Regression. We also present numerical results from various comparisons of these algorithms, which were programmed and run in Python using the Jupyter Notebook application. The initial sample, consisting of 850 observations, contains credit details about borrowers in the United States of America and is freely available. To compare the performance of Logistic Regression and Decision Trees, we used measures such as AUC, precision, and F1-score. 2023-01-27T15:50:16Z 2022-12-06T00:00:00Z 2022-12-06 2022-10 info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/masterThesis application/pdf http://hdl.handle.net/10071/27558 TID:203127323 eng Costa, Patrícia Alexandra Guerreiro info:eu-repo/semantics/openAccess reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP 2023-11-09T17:40:02Z oai:repositorio.iscte-iul.pt:10071/27558 Portal Agregador ONG https://www.rcaap.pt/oai/openaire opendoar:7160 2024-03-19T22:18:29.355833 Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação false |
dc.title.none.fl_str_mv |
Machine learning: Challenges and opportunities on credit risk |
title |
Machine learning: Challenges and opportunities on credit risk |
spellingShingle |
Machine learning: Challenges and opportunities on credit risk Costa, Patrícia Alexandra Guerreiro Risco de crédito -- Credit risk Data mining -- Machine learning Linear and logistic regression Árvore de decisão -- Decision tree Random forest Regressão linear e logística Floresta aleatória |
title_short |
Machine learning: Challenges and opportunities on credit risk |
title_full |
Machine learning: Challenges and opportunities on credit risk |
title_fullStr |
Machine learning: Challenges and opportunities on credit risk |
title_full_unstemmed |
Machine learning: Challenges and opportunities on credit risk |
title_sort |
Machine learning: Challenges and opportunities on credit risk |
author |
Costa, Patrícia Alexandra Guerreiro |
author_facet |
Costa, Patrícia Alexandra Guerreiro |
author_role |
author |
dc.contributor.author.fl_str_mv |
Costa, Patrícia Alexandra Guerreiro |
dc.subject.por.fl_str_mv |
Risco de crédito -- Credit risk Data mining -- Machine learning Linear and logistic regression Árvore de decisão -- Decision tree Random forest Regressão linear e logística Floresta aleatória |
topic |
Risco de crédito -- Credit risk Data mining -- Machine learning Linear and logistic regression Árvore de decisão -- Decision tree Random forest Regressão linear e logística Floresta aleatória |
description |
The constant challenge of anticipating borrower default has led financial institutions to develop techniques and models to improve their credit risk monitoring and to predict how likely certain customers are to default on a loan, as well as how likely others are to meet their financial obligations. It is therefore worth investigating how financial institutions can anticipate default using Machine Learning algorithms. This dissertation aims to demonstrate the power of Machine Learning algorithms in credit risk analysis, focusing on building and training the models and testing them on the data, and to present the opportunities and challenges of Machine Learning that remain open for future studies. For this purpose, we present two Machine Learning classification algorithms: Decision Trees and Logistic Regression. We also present numerical results from various comparisons of these algorithms, which were programmed and run in Python using the Jupyter Notebook application. The initial sample, consisting of 850 observations, contains credit details about borrowers in the United States of America and is freely available. To compare the performance of Logistic Regression and Decision Trees, we used measures such as AUC, precision, and F1-score. |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-12-06T00:00:00Z 2022-12-06 2022-10 2023-01-27T15:50:16Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10071/27558 TID:203127323 |
url |
http://hdl.handle.net/10071/27558 |
identifier_str_mv |
TID:203127323 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799134743959175168 |