Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection

Moreno, María Fernanda Osorio

Bibliographic details
Main author:	Moreno, María Fernanda Osorio
Publication date:	2018
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10362/33863
Summary:	Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics

Item metadata

id	RCAP_a1aa338ceba38709210833f21c7aee39
oai_identifier_str	oai:run.unl.pt:10362/33863
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
spelling	Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection | Imbalanced datasets | Fraud | oversampling | Insurance | Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics | Although data is now generated in enormous volumes every second, there are situations where the target category is represented extremely unequally, giving rise to imbalanced datasets. Analyzing them correctly can lead to relevant decisions that produce appropriate business strategies. Fraud modeling is one example of this situation: fraudulent transactions are expected to be fewer than legitimate ones, and predicting them can be crucial for improving decisions and processes in a company. However, class imbalance has a negative effect on the traditional techniques used to deal with this problem. Many techniques have been proposed to address it, and oversampling is one of them. This work analyzes the behavior of different oversampling techniques, such as random oversampling, SOMO, and SMOTE, across different classifiers and evaluation metrics. The exercise uses real data from an insurance company in Colombia to predict fraudulent claims for its compulsory auto product.
The conclusions of this research demonstrate the advantages of using oversampling under class imbalance, but also the importance of comparing different evaluation metrics and classifiers to obtain accurate conclusions and comparable results. | Bação, Fernando José Ferreira Lucas | RUN | Moreno, María Fernanda Osorio | 2018-04-05T13:24:16Z | 2018-03-26 | 2018-03-26T00:00:00Z | info:eu-repo/semantics/publishedVersion | info:eu-repo/semantics/masterThesis | application/pdf | http://hdl.handle.net/10362/33863 | TID:201894289 | eng | info:eu-repo/semantics/openAccess | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) | instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação | instacron:RCAAP | 2024-03-11T04:18:35Z | oai:run.unl.pt:10362/33863 | Portal Agregador | ONG | https://www.rcaap.pt/oai/openaire | opendoar:7160 | 2024-03-20T03:30:05.469104 | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação | false
dc.title.none.fl_str_mv	Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection
title	Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection
spellingShingle	Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection Moreno, María Fernanda Osorio Imbalanced datasets Fraud oversampling Insurance
title_short	Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection
title_full	Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection
title_fullStr	Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection
title_full_unstemmed	Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection
title_sort	Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection
author	Moreno, María Fernanda Osorio
author_facet	Moreno, María Fernanda Osorio
author_role	author
dc.contributor.none.fl_str_mv	Bação, Fernando José Ferreira Lucas RUN
dc.contributor.author.fl_str_mv	Moreno, María Fernanda Osorio
dc.subject.por.fl_str_mv	Imbalanced datasets; Fraud; oversampling; Insurance
topic	Imbalanced datasets; Fraud; oversampling; Insurance
description	Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
publishDate	2018
dc.date.none.fl_str_mv	2018-04-05T13:24:16Z 2018-03-26 2018-03-26T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10362/33863 TID:201894289
url	http://hdl.handle.net/10362/33863
identifier_str_mv	TID:201894289
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799137925223415808
