Machine learning frameworks for retention models: an application to a motor insurance portfolio

Montrond, Diana Alexandra

Bibliographic details
Main author:	Montrond, Diana Alexandra
Publication date:	2021
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10400.5/23909
Summary:	Master's (Bologna) in Actuarial Science

Item metadata

id	RCAP_4a893b923c78b0e205f7916ef3042b52
oai_identifier_str	oai:www.repository.utl.pt:10400.5/23909
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
spelling	Machine learning frameworks for retention models: an application to a motor insurance portfolio | Motor insurance | Retention | Logistic regression | Machine learning | Extreme gradient boosting | Light gradient boost | Training and testing | Pearson correlation | Cramér’s V | AUC | ROC curves | McNemar’s test | Master's (Bologna) in Actuarial Science | In the insurance market, predicting retention is a key step in defining strategies to win more clients and in understanding the features that can help prevent cancellations by current clients. There is therefore a demand for more efficient and rigorous models that better meet the needs of the market. This project is one such attempt: it selects three different models, compares them, and chooses the one that performs best. The three models are one Generalised Linear Model (GLM), the Logistic Regression, and two machine learning algorithms, Extreme Gradient Boosting (XGBoost) and Light Gradient Boost (LightGBM). The models are evaluated using the Area Under the Curve (AUC), the Receiver Operating Characteristic (ROC) curve and accuracy, together with McNemar’s test. The report also performs an Exploratory Data Analysis to gain better knowledge of the motor insurance portfolio, and covers feature engineering methods for handling data issues such as missing values and categorical variables. To improve the data further, two methods of splitting the data into training and testing sets were proposed. The analysis shows that the most efficient split is to randomly select 80% of each month of the whole data set as the training set and to use the remainder as the testing set. It also concludes that the best retention model is the Light Gradient Boost, with very small differences from the Extreme Gradient Boosting. Both machine learning algorithms performed significantly better than the traditional Logistic Regression. The model evaluations were mostly successful in revealing the differences between the models. | Instituto Superior de Economia e Gestão | Silva, João Andrade e | Nogueira, Francisco Alier Valentim | Repositório da Universidade de Lisboa | Montrond, Diana Alexandra | 2022-09-25T00:30:22Z | 2021-10 | 2021-10-01T00:00:00Z | info:eu-repo/semantics/publishedVersion | info:eu-repo/semantics/masterThesis | application/pdf | http://hdl.handle.net/10400.5/23909 | eng | Montrond, Diana Alexandra (2021). “Machine learning frameworks for retention models: an application to a motor insurance portfolio”. Master's dissertation. Universidade Técnica de Lisboa. Instituto Superior de Economia e Gestão. | info:eu-repo/semantics/openAccess | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) | instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação | instacron:RCAAP | 2023-03-06T14:53:33Z | oai:www.repository.utl.pt:10400.5/23909 | Portal Agregador | ONG | https://www.rcaap.pt/oai/openaire | opendoar:7160 | 2024-03-19T17:08:01.845380 | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação | false
dc.title.none.fl_str_mv	Machine learning frameworks for retention models: an application to a motor insurance portfolio
title	Machine learning frameworks for retention models: an application to a motor insurance portfolio
spellingShingle	Machine learning frameworks for retention models: an application to a motor insurance portfolio | Montrond, Diana Alexandra | Motor insurance | Retention | Logistic regression | Machine learning | Extreme gradient boosting | Light gradient boost | Training and testing | Pearson correlation | Cramér’s V | AUC | ROC curves | McNemar’s test
title_short	Machine learning frameworks for retention models: an application to a motor insurance portfolio
title_full	Machine learning frameworks for retention models: an application to a motor insurance portfolio
title_fullStr	Machine learning frameworks for retention models: an application to a motor insurance portfolio
title_full_unstemmed	Machine learning frameworks for retention models: an application to a motor insurance portfolio
title_sort	Machine learning frameworks for retention models: an application to a motor insurance portfolio
author	Montrond, Diana Alexandra
author_facet	Montrond, Diana Alexandra
author_role	author
dc.contributor.none.fl_str_mv	Silva, João Andrade e | Nogueira, Francisco Alier Valentim | Repositório da Universidade de Lisboa
dc.contributor.author.fl_str_mv	Montrond, Diana Alexandra
dc.subject.por.fl_str_mv	Motor insurance | Retention | Logistic regression | Machine learning | Extreme gradient boosting | Light gradient boost | Training and testing | Pearson correlation | Cramér’s V | AUC | ROC curves | McNemar’s test
topic	Motor insurance | Retention | Logistic regression | Machine learning | Extreme gradient boosting | Light gradient boost | Training and testing | Pearson correlation | Cramér’s V | AUC | ROC curves | McNemar’s test
description	Master's (Bologna) in Actuarial Science
publishDate	2021
dc.date.none.fl_str_mv	2021-10 2021-10-01T00:00:00Z 2022-09-25T00:30:22Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10400.5/23909
url	http://hdl.handle.net/10400.5/23909
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	Montrond, Diana Alexandra (2021). “Machine learning frameworks for retention models: an application to a motor insurance portfolio”. Master's dissertation. Universidade Técnica de Lisboa. Instituto Superior de Economia e Gestão.
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Instituto Superior de Economia e Gestão
publisher.none.fl_str_mv	Instituto Superior de Economia e Gestão
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799131174424018944
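
The abstract reports that the most efficient train/test split is to randomly select 80% of each month of the whole data set for training and keep the remainder for testing. A minimal Python sketch of that idea follows. The record layout, the `month` field name, the toy policies and the `split_by_month` helper are illustrative assumptions, not taken from the dissertation.

```python
import random
from collections import defaultdict


def split_by_month(records, month_key="month", train_frac=0.8, seed=0):
    """Randomly assign about train_frac of each month's records to training.

    Hypothetical helper: splitting month by month keeps every month
    represented in both sets in roughly the same proportion.
    """
    rng = random.Random(seed)

    # Group the records by their month label.
    by_month = defaultdict(list)
    for rec in records:
        by_month[rec[month_key]].append(rec)

    train, test = [], []
    for month in sorted(by_month):
        group = list(by_month[month])
        rng.shuffle(group)                      # random order within the month
        cut = round(train_frac * len(group))    # size of the training share
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


# Toy portfolio (assumed data): 3 months with 10 policies each.
policies = [
    {"policy_id": m * 100 + i, "month": f"2020-{m:02d}"}
    for m in (1, 2, 3)
    for i in range(10)
]

train, test = split_by_month(policies)
print(len(train), len(test))  # 24 6
```

Each month contributes 8 of its 10 toy policies to the training set and 2 to the testing set, so no month is missing from either set. A fixed `seed` makes the split reproducible.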
