Using data mining to predict automobile insurance fraud

Bibliographic details
Main author: Vale, João Bernardo do
Publication date: 2012
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: http://hdl.handle.net/10400.14/15529
Abstract: This thesis presents a study of automobile insurance fraud. Its purpose is to increase knowledge of fraudulent claims in the Portuguese market while raising awareness of the use of data mining techniques for this and other similar problems. We apply data mining techniques to the problem of predicting automobile insurance fraud, which has been shown to be of interest to insurance companies around the world. We present fraud definitions and give an overview of the existing literature on the subject. Live policy and claim data from the Portuguese insurance market in 2005 are used to train a Logit Regression Model and a CHAID Classification and Regression Tree. The use of data mining tools and techniques enabled the identification of underlying fraud patterns specific to the raw data used to build the models. The list of potential fraud indicators includes variables such as the policy's tenure, the number of policy holders, not admitting fault in the accident, and fractioning premium payments semiannually. Other variables, such as the number of days between the accident and the patient filing the claim, the client's age, and the geographical location of the accident, were also found to be relevant in specific sub-populations of the dataset. Model variables and coefficients are interpreted comparatively, and key performance results are presented, including PCC, sensitivity, specificity and AUROC. Both the Logit Model and the CHAID C&R Tree achieve fair results in predicting automobile insurance fraud in the dataset used.
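Note on the performance measures: the abstract names PCC (percentage correctly classified), sensitivity, specificity and AUROC without defining them. The sketch below shows how these four measures are commonly computed for a binary fraud classifier. It is only an illustration under stated assumptions: the labels, scores and the 0.5 cut-off are made up for the example and are not taken from the thesis, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as 'fraud'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison:
    the share of (fraud, non-fraud) pairs in which the fraud case
    receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up illustrative data (NOT from the thesis): 1 = fraud, 0 = legitimate.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05]

# Assumed cut-off of 0.5 for turning scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
pcc = (tp + tn) / (tp + fp + tn + fn)   # share of all claims classified correctly
sensitivity = tp / (tp + fn)            # share of fraud cases that are caught
specificity = tn / (tn + fp)            # share of legitimate claims left alone
auc = auroc(y_true, scores)             # threshold-independent ranking quality

print(f"PCC={pcc:.3f} sensitivity={sensitivity:.3f} "
      f"specificity={specificity:.3f} AUROC={auc:.3f}")
```

PCC, sensitivity and specificity depend on the chosen cut-off, whereas AUROC summarises ranking quality across all cut-offs, which is one reason the four are usually reported together.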
id RCAP_926481310383f044945664ab45eacd17
oai_identifier_str oai:repositorio.ucp.pt:10400.14/15529
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str 7160
dc.title.none.fl_str_mv Using data mining to predict automobile insurance fraud
author Vale, João Bernardo do
author_facet Vale, João Bernardo do
author_role author
dc.contributor.none.fl_str_mv Rafael, José Filipe
Veritati - Repositório Institucional da Universidade Católica Portuguesa
dc.contributor.author.fl_str_mv Vale, João Bernardo do
dc.subject.por.fl_str_mv Domínio/Área Científica::Ciências Sociais::Economia e Gestão
topic Domínio/Área Científica::Ciências Sociais::Economia e Gestão
publishDate 2012
dc.date.none.fl_str_mv 2012-09-17
2012
2012-09-17T00:00:00Z
2014-11-07T15:18:13Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.14/15529
url http://hdl.handle.net/10400.14/15529
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799131806918770688