FAWOS: Fairness-Aware Oversampling Algorithm Based on Distributions of Sensitive Attributes

Bibliographic details
Main author: Salazar, Teresa
Publication date: 2021
Other authors: Santos, Miriam Seoane; Araújo, Helder; Abreu, Pedro Manuel Henriques da Cunha
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10316/101209
https://doi.org/10.1109/ACCESS.2021.3084121
Abstract: With the increased use of machine learning algorithms to make decisions which impact people's lives, it is of extreme importance to ensure that predictions do not prejudice subgroups of the population with respect to sensitive attributes such as race or gender. Discrimination occurs when the probability of a positive outcome changes across privileged and unprivileged groups defined by the sensitive attributes. It has been shown that this bias can be originated from imbalanced data contexts where one of the classes contains a much smaller number of instances than the other classes. It is also important to identify the nature of the imbalanced data, including the characteristics of the minority classes' distribution. This paper presents FAWOS: a Fairness-Aware oversampling algorithm which aims to attenuate unfair treatment by handling sensitive attributes' imbalance. We categorize different types of datapoints according to their local neighbourhood with respect to the sensitive attributes, identifying which are more difficult to learn by the classifiers. In order to balance the dataset, FAWOS oversamples the training data by creating new synthetic datapoints using the different types of datapoints identified. We test the impact of FAWOS on different learning classifiers and analyze which can better handle sensitive attribute imbalance. Empirically, we observe that this algorithm can effectively increase the fairness results of the classifiers while not neglecting the classification performance. Source code can be found at: https://github.com/teresalazar13/FAWOS
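The ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (that is available at the GitHub link above); the function names, the choice of k, the Euclidean distance, and the SMOTE-style interpolation are all assumptions made for illustration. The sketch measures the gap in positive-outcome rate between groups, scores each datapoint by how many of its nearest neighbours share its group, and creates synthetic points by interpolating between existing ones.

```python
import numpy as np


def positive_rate_gap(y, privileged):
    """Positive-outcome rate of the privileged group minus that of the
    unprivileged group. A value of 0 means both rates are equal."""
    y = np.asarray(y, dtype=float)
    privileged = np.asarray(privileged, dtype=bool)
    return y[privileged].mean() - y[~privileged].mean()


def neighbourhood_share(X, group, k=5):
    """For each point, the fraction of its k nearest neighbours (Euclidean,
    excluding itself) that belong to the same group. Here `group` is assumed
    to encode the combination of class label and sensitive attribute.
    A low share suggests a point that is harder to learn."""
    X = np.asarray(X, dtype=float)
    group = np.asarray(group)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    return (group[nearest] == group[:, None]).mean(axis=1)


def interpolate_new_points(X, idx, n_new, rng):
    """Hypothetical SMOTE-style step: each synthetic point lies on the
    segment between two randomly chosen points among X[idx]."""
    X = np.asarray(X, dtype=float)
    a = rng.choice(idx, size=n_new)
    b = rng.choice(idx, size=n_new)
    t = rng.random((n_new, 1))
    return X[a] + t * (X[b] - X[a])


# Toy data: two well-separated groups on a line.
X_toy = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
group_toy = np.array([0, 0, 0, 1, 1, 1])
share_toy = neighbourhood_share(X_toy, group_toy, k=2)
synthetic_toy = interpolate_new_points(
    X_toy, np.array([3, 4, 5]), n_new=4, rng=np.random.default_rng(0)
)
```

In an oversampling loop, the neighbourhood share could be used to decide which points of an under-represented group to draw from more often; how the types of datapoints are actually defined and weighted in FAWOS is specified in the paper, not here.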
id RCAP_ff816d1f28bde990de3fb5ba92f67919
oai_identifier_str oai:estudogeral.uc.pt:10316/101209
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv FAWOS: Fairness-Aware Oversampling Algorithm Based on Distributions of Sensitive Attributes
author Salazar, Teresa
author_facet Salazar, Teresa
Santos, Miriam Seoane
Araújo, Helder
Abreu, Pedro Manuel Henriques da Cunha
author_role author
author2 Santos, Miriam Seoane
Araújo, Helder
Abreu, Pedro Manuel Henriques da Cunha
author2_role author
author
author
dc.contributor.author.fl_str_mv Salazar, Teresa
Santos, Miriam Seoane
Araújo, Helder
Abreu, Pedro Manuel Henriques da Cunha
dc.subject.por.fl_str_mv Classification bias
fairness
imbalanced data
K-nearest neighborhood
oversampling
publishDate 2021
dc.date.none.fl_str_mv 2021
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10316/101209
https://doi.org/10.1109/ACCESS.2021.3084121
url http://hdl.handle.net/10316/101209
https://doi.org/10.1109/ACCESS.2021.3084121
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 2169-3536
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799134079114805248