Using data mining to predict automobile insurance fraud
Main author: | Vale, João Bernardo do |
---|---|
Publication date: | 2012 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10400.14/15529 |
Abstract: | This thesis presents a study of automobile insurance fraud. Its purpose is to increase knowledge about fraudulent claims in the Portuguese market while raising awareness of the use of data mining techniques for this and similar problems. We apply data mining techniques to the problem of predicting automobile insurance fraud, which has been shown to be of interest to insurance companies around the world. We present fraud definitions and review the existing literature on the subject. Live policy and claim data from the Portuguese insurance market in 2005 are used to train a Logit Regression Model and a CHAID Classification and Regression Tree. The use of data mining tools and techniques enabled the identification of underlying fraud patterns specific to the raw data used to build the models. The list of potential fraud indicators includes variables such as the policy's tenure, the number of policy holders, not admitting fault in the accident, and splitting premium payments into semiannual installments. Other variables, such as the number of days between the accident and the patient filing the claim, the client's age, and the geographical location of the accident, were also found to be relevant in specific sub-populations of the dataset. Model variables and coefficients are interpreted comparatively, and key performance results are presented, including PCC, sensitivity, specificity and AUROC. Both the Logit Model and the CHAID C&R Tree achieve fair results in predicting automobile insurance fraud in the dataset used. |
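
The abstract reports PCC, sensitivity, specificity and AUROC for both models. As a rough illustration of how such figures are usually derived from a binary fraud classifier, the sketch below computes them in plain Python. It is not the thesis's code: PCC is read here as "percentage correctly classified" (an assumption, since the record does not expand the acronym), the 0.5 threshold is an arbitrary choice, and the labels and scores are made-up toy values, not data or results from the study.

```python
from typing import Dict, Sequence


def threshold_metrics(y_true: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """PCC, sensitivity and specificity at a given score threshold.

    Assumes labels are 0 (legitimate) or 1 (fraud) and that both
    classes are present, so no denominator is zero.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "pcc": (tp + tn) / len(y_true),   # share of all claims classified correctly
        "sensitivity": tp / (tp + fn),    # share of fraud claims that were flagged
        "specificity": tn / (tn + fp),    # share of legitimate claims left unflagged
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen fraud claim scores
    higher than a randomly chosen legitimate one (ties count as 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = fraud, 0 = legitimate; scores are model outputs.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.6, 0.4, 0.7, 0.3, 0.1, 0.5]

print(threshold_metrics(toy_labels, toy_scores))
print("auroc:", auroc(toy_labels, toy_scores))
```

Sensitivity and specificity depend on the chosen threshold, while AUROC summarises ranking quality across all thresholds, which is why the two kinds of figures are typically reported together.
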
id | RCAP_926481310383f044945664ab45eacd17 |
---|---|
oai_identifier_str | oai:repositorio.ucp.pt:10400.14/15529 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str | 7160 |
dc.title.none.fl_str_mv | Using data mining to predict automobile insurance fraud |
author | Vale, João Bernardo do |
author_role | author |
dc.contributor.none.fl_str_mv | Rafael, José Filipe; Veritati - Repositório Institucional da Universidade Católica Portuguesa |
dc.contributor.author.fl_str_mv | Vale, João Bernardo do |
dc.subject.por.fl_str_mv | Domain/Scientific Area::Social Sciences::Economics and Management |
publishDate | 2012 |
dc.date.none.fl_str_mv | 2012-09-17; 2012; 2012-09-17T00:00:00Z; 2014-11-07T15:18:13Z |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/masterThesis |
format | masterThesis |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://hdl.handle.net/10400.14/15529 |
url | http://hdl.handle.net/10400.14/15529 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.source.none.fl_str_mv | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
institution | RCAAP |
reponame_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |