Fairness in machine learning : an empirical experiment about protected features and their implications

Bibliographic details
Main author: Guntzel, Maurício Holler
Publication date: 2022
Document type: Bachelor's thesis
Language: eng
Source title: Repositório Institucional da UFRGS
Full text: http://hdl.handle.net/10183/245286
Abstract: Increasingly, machine learning models make high-stakes decisions in almost every domain. These models, and the datasets they are trained on, may be prone to exacerbating social disparities due to unmitigated fairness issues. For example, features representing different social groups, known as protected features (as defined in the Equality Act 2010), are one source of these fairness issues. This work explores the impact of protected features on predictive models' outcomes, performance, and fairness. We propose a knowledge-driven pipeline for detecting protected features and mitigating their effect. Protected features are identified from metadata and removed during the training phase of the models. Nevertheless, these protected features are merged back into the models' output to preserve the original dataset information and enhance explainability. We empirically study four machine learning models (i.e., KNN, Decision Tree, Neural Network, and Naive Bayes) on datasets used for fairness benchmarking (i.e., COMPAS, Adult Census Income, and Credit Card Default). The observed results suggest that the proposed pipeline preserves the models' performance and facilitates the extraction of model information for use in fairness metrics.
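The pipeline described above lends itself to a compact illustration. The sketch below is not the author's code: the PROTECTED list is a hypothetical stand-in for the metadata-driven detection step, a small synthetic table stands in for COMPAS, Adult Census Income, and Credit Card Default, and scikit-learn's DecisionTreeClassifier represents one of the four studied models. Protected features are dropped before training (fairness through unawareness) and merged back into the output so that a group fairness metric, here the positive-outcome rate per group, remains computable.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical metadata-driven list of protected features.
PROTECTED = ["sex", "race"]

# Tiny synthetic stand-in for the benchmark datasets.
df = pd.DataFrame({
    "sex":    ["F", "M", "F", "M", "F", "M", "F", "M"],
    "race":   ["A", "B", "A", "B", "B", "A", "B", "A"],
    "income": [30, 45, 28, 52, 39, 41, 33, 48],
    "label":  [0, 1, 0, 1, 1, 0, 0, 1],
})

X, y = df.drop(columns=["label"]), df["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Step 1: train without the protected features.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train.drop(columns=PROTECTED), y_train)

# Step 2: predict on the unprotected features, then merge the
# protected features back into the output.
out = X_test.copy()
out["pred"] = model.predict(X_test.drop(columns=PROTECTED))

# Step 3: group fairness metric, positive-outcome rate per group
# (equal rates would indicate demographic parity).
print(out.groupby("sex")["pred"].mean())

Merging the protected columns back after prediction is the design choice that keeps group fairness metrics computable even though the model itself never sees those columns.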
id UFRGS-2_d3adc1af1f78141b3b9fa224e2e21945
oai_identifier_str oai:www.lume.ufrgs.br:10183/245286
network_acronym_str UFRGS-2
network_name_str Repositório Institucional da UFRGS
spelling Guntzel, Maurício Holler. Universidade Federal do Rio Grande do Sul, Instituto de Informática, Porto Alegre, BR-RS, 2022. Degree: Ciência da Computação: Ênfase em Ciência da Computação, Bacharelado (Bachelor's in Computer Science, undergraduate).
dc.title.pt_BR.fl_str_mv Fairness in machine learning : an empirical experiment about protected features and their implications
title Fairness in machine learning : an empirical experiment about protected features and their implications
author Guntzel, Maurício Holler
author_role author
dc.contributor.author.fl_str_mv Guntzel, Maurício Holler
dc.contributor.advisor1.fl_str_mv Barone, Dante Augusto Couto
dc.contributor.advisor-co1.fl_str_mv Côrtes, Eduardo Gabriel
dc.subject.por.fl_str_mv Aprendizado de máquina (Machine learning)
Oleoduto (Pipeline)
Big data
dc.subject.eng.fl_str_mv Pipeline
fairness
machine learning
positive outcome
group fairness
Fairness Through Unawareness
publishDate 2022
dc.date.accessioned.fl_str_mv 2022-07-22T04:53:48Z
dc.date.issued.fl_str_mv 2022
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/bachelorThesis
format bachelorThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10183/245286
dc.identifier.nrb.pt_BR.fl_str_mv 001146016
url http://hdl.handle.net/10183/245286
identifier_str_mv 001146016
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFRGS
instname:Universidade Federal do Rio Grande do Sul (UFRGS)
instacron:UFRGS
instname_str Universidade Federal do Rio Grande do Sul (UFRGS)
instacron_str UFRGS
institution UFRGS
reponame_str Repositório Institucional da UFRGS
collection Repositório Institucional da UFRGS
bitstream.url.fl_str_mv http://www.lume.ufrgs.br/bitstream/10183/245286/2/001146016.pdf.txt
http://www.lume.ufrgs.br/bitstream/10183/245286/1/001146016.pdf
bitstream.checksum.fl_str_mv 18db05681a42449f14bbb3199d515d06
39ee99e3c30c2d70937f83741012beea
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFRGS - Universidade Federal do Rio Grande do Sul (UFRGS)
_version_ 1801224638333714432