Credit card fraud detection methods: a comparative study (Métodos de detecção de fraude em cartões de crédito: um estudo comparativo)

Piccin, Luiz Eduardo

Bibliographic details
Main author:	Piccin, Luiz Eduardo
Publication date:	2022
Document type:	Bachelor's thesis (Trabalho de conclusão de curso)
Language:	Portuguese (por)
Source title:	Repositório Institucional da UFSCAR
Full text:	https://repositorio.ufscar.br/handle/ufscar/16072
Abstract:	The pandemic radically changed how goods and services are consumed: purchases moved almost entirely online, an environment that offers more loopholes for fraud than the physical world. The resulting growth in online transactions, most of them paid by credit card, brought a larger number of frauds. From a business standpoint, companies must be able to detect fraudulent transactions in order to avoid both damage to customer relationships and financial losses. Fraud detection usually relies on a predictive model, which approaches the ideal when it performs well at identifying fraudulent transactions and legitimate ones alike (in technical terms, when it yields few false negatives and few false positives). In this work, we compare the performance of two base classifiers, bounded logistic regression and its unbounded counterpart, both with l1 regularization, under two training setups: training data that are balanced (via k-means) and diversified (via bagging), and the original imbalanced data. The base classifiers are trained on each of the k balanced and diversified training subsets, their outputs are combined by a weighted average, and the final prediction is made from that average. The comparison is carried out on real data in terms of AUC (Area Under the Curve) and other test statistics, such as the KS (Kolmogorov-Smirnov) index. The results can also be compared with other works in the literature.
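The combination and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the base-classifier scores, the labels, and the weights below are made-up values, the function names are hypothetical, and the abstract does not say how the weights are chosen. The sketch only shows a weighted average of base-classifier scores followed by AUC and KS computed on the combined score.

```python
from bisect import bisect_right


def weighted_average(score_lists, weights):
    """Combine per-classifier scores into one score per observation."""
    total = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, column)) / total
        for column in zip(*score_lists)
    ]


def auc(labels, scores):
    """Share of (fraud, legitimate) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(labels, scores):
    """Largest gap between the empirical score CDFs of the two classes."""
    pos = sorted(s for y, s in zip(labels, scores) if y == 1)
    neg = sorted(s for y, s in zip(labels, scores) if y == 0)
    return max(
        abs(bisect_right(pos, t) / len(pos) - bisect_right(neg, t) / len(neg))
        for t in set(scores)
    )


# Hypothetical data: 1 = fraud, 0 = legitimate.
labels = [0, 0, 0, 0, 1, 1]
# Hypothetical scores from two base classifiers, one list per classifier.
base_scores = [
    [0.1, 0.2, 0.3, 0.8, 0.4, 0.9],
    [0.2, 0.1, 0.4, 0.6, 0.5, 0.8],
]
# Hypothetical weights; the abstract does not specify the weighting rule.
weights = [1.0, 3.0]

combined = weighted_average(base_scores, weights)
print(auc(labels, combined))           # 0.875
print(ks_statistic(labels, combined))  # 0.75
```

In this toy example one legitimate transaction receives a higher combined score than one of the frauds, so 7 of the 8 fraud/legitimate pairs are ordered correctly, which gives the AUC of 0.875.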

Item metadata

id	SCAR_d7ba16297f319a2beb1b68d1287393a2
oai_identifier_str	oai:repositorio.ufscar.br:ufscar/16072
network_acronym_str	SCAR
network_name_str	Repositório Institucional da UFSCAR
repository_id_str	4322
dc.title.por.fl_str_mv	Métodos de detecção de fraude em cartões de crédito: um estudo comparativo
dc.title.alternative.eng.fl_str_mv	Credit card fraud detection methods: a comparative study
author	Piccin, Luiz Eduardo
author_facet	Piccin, Luiz Eduardo
author_role	author
dc.contributor.authorlattes.por.fl_str_mv	http://lattes.cnpq.br/4696717141691347
dc.contributor.author.fl_str_mv	Piccin, Luiz Eduardo
dc.contributor.advisor1.fl_str_mv	Ferreira, Ricardo Felipe
dc.contributor.advisor1Lattes.fl_str_mv	http://lattes.cnpq.br/2355076087945221
dc.contributor.authorID.fl_str_mv	dd1e5203-39b9-4a02-a592-210f6997e8e8
contributor_str_mv	Ferreira, Ricardo Felipe
dc.subject.por.fl_str_mv	Fraud detection; Bounded logistic regression; l1 regularization; Balancing and diversification of the training set
dc.subject.cnpq.fl_str_mv	CIENCIAS EXATAS E DA TERRA::PROBABILIDADE E ESTATISTICA::ESTATISTICA::ANALISE DE DADOS; CIENCIAS EXATAS E DA TERRA::PROBABILIDADE E ESTATISTICA::ESTATISTICA::INFERENCIA PARAMETRICA; CIENCIAS EXATAS E DA TERRA::PROBABILIDADE E ESTATISTICA::ESTATISTICA::REGRESSAO E CORRELACAO
publishDate	2022
dc.date.accessioned.fl_str_mv	2022-05-07T00:29:32Z
dc.date.available.fl_str_mv	2022-05-07T00:29:32Z
dc.date.issued.fl_str_mv	2022-04-14
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/bachelorThesis
format	bachelorThesis
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	PICCIN, Luiz Eduardo. Métodos de detecção de fraude em cartões de crédito: um estudo comparativo. 2022. Trabalho de Conclusão de Curso (Graduação em Estatística) – Universidade Federal de São Carlos, São Carlos, 2022. Disponível em: https://repositorio.ufscar.br/handle/ufscar/16072.
dc.identifier.uri.fl_str_mv	https://repositorio.ufscar.br/handle/ufscar/16072
dc.language.iso.fl_str_mv	por
language	por
dc.relation.confidence.fl_str_mv	600
dc.relation.authority.fl_str_mv	8c64d439-6f5c-4dfc-88f1-144b0ce1ae8e
dc.rights.driver.fl_str_mv	Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/ info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	Universidade Federal de São Carlos Câmpus São Carlos Estatística - Es
dc.publisher.initials.fl_str_mv	UFSCar
publisher.none.fl_str_mv	Universidade Federal de São Carlos Câmpus São Carlos Estatística - Es
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFSCAR instname:Universidade Federal de São Carlos (UFSCAR) instacron:UFSCAR
instname_str	Universidade Federal de São Carlos (UFSCAR)
instacron_str	UFSCAR
institution	UFSCAR
reponame_str	Repositório Institucional da UFSCAR
collection	Repositório Institucional da UFSCAR
bitstream.url.fl_str_mv	https://repositorio.ufscar.br/bitstream/ufscar/16072/1/Monografia%20-%20Luiz%20Piccin.pdf https://repositorio.ufscar.br/bitstream/ufscar/16072/2/license_rdf https://repositorio.ufscar.br/bitstream/ufscar/16072/3/Monografia%20-%20Luiz%20Piccin.pdf.txt https://repositorio.ufscar.br/bitstream/ufscar/16072/4/Monografia%20-%20Luiz%20Piccin.pdf.jpg
bitstream.checksum.fl_str_mv	e35b0dc8663ac7a91722ebe8e845171b e39d27027a6cc9cb039ad269a5db8e34 2b980b740b64630d4e930ee5fa6246b1 7b84ab79c3eac00aafd2758f8fd4d7fe
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR)
_version_	1813715647507464192
