Fairness in machine learning : an empirical experiment about protected features and their implications

Bibliographic details
Main author: Guntzel, Maurício Holler
Publication date: 2022
Document type: Bachelor's thesis
Language: eng
Source title: Repositório Institucional da UFRGS
Full text: http://hdl.handle.net/10183/245286
Abstract: Increasingly, machine learning models make high-stakes decisions in almost every domain. These models, and the datasets they are trained on, may be prone to exacerbating social disparities due to unmitigated fairness issues. For example, features representing different social groups, known as protected features (as defined in the Equality Act 2010), are one source of these fairness issues. This work explores the impact of protected features on predictive models' outcomes, performance, and fairness. We propose a knowledge-driven pipeline for detecting protected features and mitigating their effect. Protected features are identified from metadata and removed during the training phase of the models. Nevertheless, these protected features are merged back into the models' output to preserve the original dataset information and enhance explainability. We empirically study four machine learning models (i.e., KNN, Decision Tree, Neural Network, and Naive Bayes) on datasets used for fairness benchmarking (i.e., COMPAS, Adult Census Income, and Credit Card Default). The observed results suggest that the proposed pipeline preserves the models' performance and facilitates the extraction of model information for use in fairness metrics.
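The pipeline described above lends itself to a compact illustration. The sketch below is not the author's code: the PROTECTED list is a hypothetical stand-in for the metadata-driven detection step, a small synthetic table stands in for COMPAS, Adult Census Income, and Credit Card Default, and scikit-learn's DecisionTreeClassifier represents one of the four studied models. Protected features are dropped before training (fairness through unawareness) and merged back into the output so that a group fairness metric, here the positive-outcome rate per group, remains computable.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical metadata-driven list of protected features.
PROTECTED = ["sex", "race"]

# Tiny synthetic stand-in for the benchmark datasets.
df = pd.DataFrame({
    "sex":    ["F", "M", "F", "M", "F", "M", "F", "M"],
    "race":   ["A", "B", "A", "B", "B", "A", "B", "A"],
    "income": [30, 45, 28, 52, 39, 41, 33, 48],
    "label":  [0, 1, 0, 1, 1, 0, 0, 1],
})

X, y = df.drop(columns=["label"]), df["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Step 1: train without the protected features.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train.drop(columns=PROTECTED), y_train)

# Step 2: predict on the unprotected features, then merge the
# protected features back into the output.
out = X_test.copy()
out["pred"] = model.predict(X_test.drop(columns=PROTECTED))

# Step 3: group fairness metric, positive-outcome rate per group
# (equal rates would indicate demographic parity).
print(out.groupby("sex")["pred"].mean())

Merging the protected columns back after prediction is the design choice that keeps group fairness metrics computable even though the model itself never sees those columns.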
id UFRGS-2_d3adc1af1f78141b3b9fa224e2e21945
oai_identifier_str oai:www.lume.ufrgs.br:10183/245286
network_acronym_str UFRGS-2
network_name_str Repositório Institucional da UFRGS
spelling Guntzel, Maurício Holler. Universidade Federal do Rio Grande do Sul, Instituto de Informática, Porto Alegre, BR-RS, 2022. Degree: Ciência da Computação: Ênfase em Ciência da Computação, Bacharelado (Bachelor's in Computer Science, undergraduate).
dc.title.pt_BR.fl_str_mv Fairness in machine learning : an empirical experiment about protected features and their implications
title Fairness in machine learning : an empirical experiment about protected features and their implications
author Guntzel, Maurício Holler
author_role author
dc.contributor.author.fl_str_mv Guntzel, Maurício Holler
dc.contributor.advisor1.fl_str_mv Barone, Dante Augusto Couto
dc.contributor.advisor-co1.fl_str_mv Côrtes, Eduardo Gabriel
dc.subject.por.fl_str_mv Aprendizado de máquina (Machine learning)
Oleoduto (Pipeline)
Big data
dc.subject.eng.fl_str_mv Pipeline
fairness
machine learning
positive outcome
group fairness
Fairness Through Unawareness
publishDate 2022
dc.date.accessioned.fl_str_mv 2022-07-22T04:53:48Z
dc.date.issued.fl_str_mv 2022
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/bachelorThesis
format bachelorThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10183/245286
dc.identifier.nrb.pt_BR.fl_str_mv 001146016
url http://hdl.handle.net/10183/245286
identifier_str_mv 001146016
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFRGS
instname:Universidade Federal do Rio Grande do Sul (UFRGS)
instacron:UFRGS
instname_str Universidade Federal do Rio Grande do Sul (UFRGS)
instacron_str UFRGS
institution UFRGS
reponame_str Repositório Institucional da UFRGS
collection Repositório Institucional da UFRGS
bitstream.url.fl_str_mv http://www.lume.ufrgs.br/bitstream/10183/245286/2/001146016.pdf.txt
http://www.lume.ufrgs.br/bitstream/10183/245286/1/001146016.pdf
bitstream.checksum.fl_str_mv 18db05681a42449f14bbb3199d515d06
39ee99e3c30c2d70937f83741012beea
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFRGS - Universidade Federal do Rio Grande do Sul (UFRGS)
_version_ 1801224638333714432