FAWOS: Fairness-Aware Oversampling Algorithm Based on Distributions of Sensitive Attributes
Main author: | Salazar, Teresa |
---|---|
Publication date: | 2021 |
Other authors: | Santos, Miriam Seoane; Araújo, Helder; Abreu, Pedro Manuel Henriques da Cunha |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10316/101209 https://doi.org/10.1109/ACCESS.2021.3084121 |
Abstract: | With the increased use of machine learning algorithms to make decisions which impact people's lives, it is of extreme importance to ensure that predictions do not prejudice subgroups of the population with respect to sensitive attributes such as race or gender. Discrimination occurs when the probability of a positive outcome changes across privileged and unprivileged groups defined by the sensitive attributes. It has been shown that this bias can originate from imbalanced data contexts where one of the classes contains a much smaller number of instances than the other classes. It is also important to identify the nature of the imbalanced data, including the characteristics of the minority classes' distribution. This paper presents FAWOS: a Fairness-Aware oversampling algorithm which aims to attenuate unfair treatment by handling sensitive attributes' imbalance. We categorize different types of datapoints according to their local neighbourhood with respect to the sensitive attributes, identifying which are more difficult for the classifiers to learn. In order to balance the dataset, FAWOS oversamples the training data by creating new synthetic datapoints using the different types of datapoints identified. We test the impact of FAWOS on different learning classifiers and analyze which can better handle sensitive attribute imbalance. Empirically, we observe that this algorithm can effectively increase the fairness results of the classifiers while not neglecting the classification performance. Source code can be found at: https://github.com/teresalazar13/FAWOS |
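The ideas summarized in the abstract (a gap in positive-outcome rates between groups, a neighbourhood-based typing of datapoints, and synthetic oversampling) can be illustrated with a small sketch. This is not the authors' implementation, which lives in the repository linked in the abstract. Everything here is an assumption made for illustration: the toy data, the four type names and their thresholds, the `TYPE_WEIGHTS` values, the SMOTE-style interpolation, and the helper names `positive_rate`, `neighbourhood_type` and `oversample`.

```python
import math
import random

# Hypothetical sampling weights per neighbourhood type (assumed, not from the paper).
TYPE_WEIGHTS = {"safe": 0.2, "borderline": 0.5, "rare": 0.2, "outlier": 0.1}


def positive_rate(rows, group):
    """Share of positive labels among rows whose sensitive value is `group`."""
    labels = [r["y"] for r in rows if r["s"] == group]
    return sum(labels) / len(labels)


def neighbourhood_type(point, rows, k=5):
    """Type a point by the share of its k nearest neighbours that have the
    same (sensitive value, label) pair. Thresholds are illustrative."""
    others = sorted((r for r in rows if r is not point),
                    key=lambda r: math.dist(r["x"], point["x"]))
    same = sum((r["s"], r["y"]) == (point["s"], point["y"]) for r in others[:k])
    ratio = same / k
    if ratio >= 0.8:
        return "safe"
    if ratio >= 0.4:
        return "borderline"
    return "rare" if ratio > 0 else "outlier"


def oversample(rows, s, y, n_new, k=5, seed=0):
    """Create n_new synthetic points for subgroup (s, y) by interpolating
    between a type-weighted base point and one of its subgroup neighbours."""
    rng = random.Random(seed)
    pool = [r for r in rows if r["s"] == s and r["y"] == y]
    weights = [TYPE_WEIGHTS[neighbourhood_type(p, rows, k)] for p in pool]
    synthetic = []
    for _ in range(n_new):
        base = rng.choices(pool, weights=weights)[0]
        mates = sorted((r for r in pool if r is not base),
                       key=lambda r: math.dist(r["x"], base["x"]))[:k]
        mate = rng.choice(mates)
        t = rng.random()  # position on the segment between base and mate
        x = tuple(b + t * (m - b) for b, m in zip(base["x"], mate["x"]))
        synthetic.append({"x": x, "s": s, "y": y})
    return synthetic


# Toy data: s=1 is the privileged group, s=0 the unprivileged one.
data_rng = random.Random(42)


def make(n, s, y, centre):
    return [{"x": (centre[0] + data_rng.gauss(0, 1),
                   centre[1] + data_rng.gauss(0, 1)), "s": s, "y": y}
            for _ in range(n)]


rows = (make(6, 1, 1, (2, 2)) + make(4, 1, 0, (0, 0))
        + make(2, 0, 1, (2, 2)) + make(8, 0, 0, (0, 0)))

# Number of unprivileged positives needed to match the privileged positive rate.
target = positive_rate(rows, 1)
pos = sum(r["y"] for r in rows if r["s"] == 0)
total = sum(1 for r in rows if r["s"] == 0)
n_new = round((target * total - pos) / (1 - target))

synthetic = oversample(rows, s=0, y=1, n_new=n_new)
balanced = rows + synthetic
print(positive_rate(rows, 0), positive_rate(balanced, 0))
```

In this toy setup the unprivileged group starts with a lower positive rate than the privileged group, and the synthetic unprivileged positives close that gap in the training data. Whether a classifier trained on the balanced data is actually fairer is the empirical question the paper investigates.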
id |
RCAP_ff816d1f28bde990de3fb5ba92f67919 |
---|---|
oai_identifier_str |
oai:estudogeral.uc.pt:10316/101209 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
author |
Salazar, Teresa |
author_facet |
Salazar, Teresa; Santos, Miriam Seoane; Araújo, Helder; Abreu, Pedro Manuel Henriques da Cunha |
author_role |
author |
author2 |
Santos, Miriam Seoane; Araújo, Helder; Abreu, Pedro Manuel Henriques da Cunha |
author2_role |
author author author |
dc.contributor.author.fl_str_mv |
Salazar, Teresa; Santos, Miriam Seoane; Araújo, Helder; Abreu, Pedro Manuel Henriques da Cunha |
dc.subject.por.fl_str_mv |
Classification bias; fairness; imbalanced data; K-nearest neighborhood; oversampling |
topic |
Classification bias; fairness; imbalanced data; K-nearest neighborhood; oversampling |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10316/101209 https://doi.org/10.1109/ACCESS.2021.3084121 |
url |
http://hdl.handle.net/10316/101209 https://doi.org/10.1109/ACCESS.2021.3084121 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
2169-3536 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799134079114805248 |