Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection
Main author: | Moreno, María Fernanda Osorio |
---|---|
Publication date: | 2018 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10362/33863 |
Abstract: | Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics |
id |
RCAP_a1aa338ceba38709210833f21c7aee39 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/33863 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection; Imbalanced datasets; Fraud; oversampling; Insurance; Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics.
Although data is now produced in enormous volumes every second, there are situations in which the target category is represented extremely unequally, giving rise to imbalanced datasets. Analyzing them correctly can lead to relevant decisions that produce appropriate business strategies. Fraud modeling is one example: fewer fraudulent transactions than legitimate ones are expected, and predicting them can be crucial for improving a company's decisions and processes. However, class imbalance has a negative effect on the traditional techniques used for this problem. Many techniques have been proposed to address it, and oversampling is one of them. This work analyzes the behavior of different oversampling techniques, such as random oversampling, SOMO, and SMOTE, across different classifiers and evaluation metrics. The exercise uses real data from an insurance company in Colombia to predict fraudulent claims for its compulsory auto product. The conclusions of this research demonstrate the advantages of using oversampling under class imbalance, as well as the importance of comparing different evaluation metrics and classifiers to obtain accurate conclusions and comparable results.
Bação, Fernando José Ferreira Lucas; RUN; Moreno, María Fernanda Osorio; 2018-04-05T13:24:16Z; 2018-03-26; 2018-03-26T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10362/33863; TID:201894289; eng; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2024-03-11T04:18:35Z; oai:run.unl.pt:10362/33863; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-20T03:30:05.469104; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection |
title |
Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection |
spellingShingle |
Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection Moreno, María Fernanda Osorio Imbalanced datasets Fraud oversampling Insurance |
author |
Moreno, María Fernanda Osorio |
author_facet |
Moreno, María Fernanda Osorio |
author_role |
author |
dc.contributor.none.fl_str_mv |
Bação, Fernando José Ferreira Lucas RUN |
dc.contributor.author.fl_str_mv |
Moreno, María Fernanda Osorio |
dc.subject.por.fl_str_mv |
Imbalanced datasets Fraud oversampling Insurance |
topic |
Imbalanced datasets Fraud oversampling Insurance |
description |
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics |
publishDate |
2018 |
dc.date.none.fl_str_mv |
2018-04-05T13:24:16Z 2018-03-26 2018-03-26T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/33863 TID:201894289 |
url |
http://hdl.handle.net/10362/33863 |
identifier_str_mv |
TID:201894289 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799137925223415808 |