FAWOS: Fairness-Aware Oversampling Algorithm Based on Distributions of Sensitive Attributes

Salazar, Teresa; Santos, Miriam Seoane; Araújo, Helder; Abreu, Pedro Manuel Henriques da Cunha

Bibliographic details
Main author:	Salazar, Teresa
Publication date:	2021
Other authors:	Santos, Miriam Seoane; Araújo, Helder; Abreu, Pedro Manuel Henriques da Cunha
Document type:	Article
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text:	http://hdl.handle.net/10316/101209 https://doi.org/10.1109/ACCESS.2021.3084121
Abstract:	With the increased use of machine learning algorithms to make decisions which impact people's lives, it is of extreme importance to ensure that predictions do not prejudice subgroups of the population with respect to sensitive attributes such as race or gender. Discrimination occurs when the probability of a positive outcome changes across privileged and unprivileged groups defined by the sensitive attributes. It has been shown that this bias can originate from imbalanced data contexts where one of the classes contains a much smaller number of instances than the other classes. It is also important to identify the nature of the imbalanced data, including the characteristics of the minority classes' distribution. This paper presents FAWOS: a Fairness-Aware oversampling algorithm which aims to attenuate unfair treatment by handling sensitive attributes' imbalance. We categorize different types of datapoints according to their local neighbourhood with respect to the sensitive attributes, identifying which are more difficult to learn by the classifiers. In order to balance the dataset, FAWOS oversamples the training data by creating new synthetic datapoints using the different types of datapoints identified. We test the impact of FAWOS on different learning classifiers and analyze which can better handle sensitive attribute imbalance. Empirically, we observe that this algorithm can effectively increase the fairness results of the classifiers while not neglecting the classification performance. Source code can be found at: https://github.com/teresalazar13/FAWOS
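
The abstract's definition of discrimination (the probability of a positive outcome differing between privileged and unprivileged groups) can be illustrated with a small sketch. This is not the FAWOS algorithm and is not taken from the authors' repository. The function names, the toy records, and the choice of the positive-rate ratio (commonly called disparate impact) are assumptions made here for illustration; this record does not state which fairness metrics the paper reports.

```python
from collections import defaultdict


def positive_rates(records, sensitive_key, label_key, positive_label=1):
    """Return the fraction of positive outcomes for each sensitive-attribute value."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for row in records:
        group = row[sensitive_key]
        counts[group][1] += 1
        if row[label_key] == positive_label:
            counts[group][0] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


def disparate_impact(rates, unprivileged, privileged):
    """Ratio of positive rates; 1.0 means both groups are treated alike.

    Raises ZeroDivisionError if the privileged group has no positive outcomes.
    """
    return rates[unprivileged] / rates[privileged]


# Hypothetical toy data: 'gender' is the sensitive attribute, 'hired' the outcome.
toy = [
    {"gender": "F", "hired": 1},
    {"gender": "F", "hired": 0},
    {"gender": "F", "hired": 0},
    {"gender": "F", "hired": 0},
    {"gender": "M", "hired": 1},
    {"gender": "M", "hired": 1},
    {"gender": "M", "hired": 1},
    {"gender": "M", "hired": 0},
]

rates = positive_rates(toy, "gender", "hired")
ratio = disparate_impact(rates, unprivileged="F", privileged="M")
print(rates, ratio)
```

A ratio well below 1.0, as in this toy data, signals the kind of imbalance across sensitive groups that an oversampling method would try to reduce in the training set.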

Item metadata

id	RCAP_ff816d1f28bde990de3fb5ba92f67919
oai_identifier_str	oai:estudogeral.uc.pt:10316/101209
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str	7160
dc.title.none.fl_str_mv	FAWOS: Fairness-Aware Oversampling Algorithm Based on Distributions of Sensitive Attributes
author	Salazar, Teresa
author_facet	Salazar, Teresa; Santos, Miriam Seoane; Araújo, Helder; Abreu, Pedro Manuel Henriques da Cunha
author_role	author
author2	Santos, Miriam Seoane; Araújo, Helder; Abreu, Pedro Manuel Henriques da Cunha
author2_role	author author author
dc.contributor.author.fl_str_mv	Salazar, Teresa; Santos, Miriam Seoane; Araújo, Helder; Abreu, Pedro Manuel Henriques da Cunha
dc.subject.por.fl_str_mv	Classification bias; fairness; imbalanced data; K-nearest neighborhood; oversampling
topic	Classification bias; fairness; imbalanced data; K-nearest neighborhood; oversampling
publishDate	2021
dc.date.none.fl_str_mv	2021
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10316/101209 https://doi.org/10.1109/ACCESS.2021.3084121
url	http://hdl.handle.net/10316/101209 https://doi.org/10.1109/ACCESS.2021.3084121
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	2169-3536
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_	1799134079114805248
