Assessing the Fairness of Intelligent Systems

Valentim, Inês Filipa Rente

Bibliographic details
Main author:	Valentim, Inês Filipa Rente
Publication date:	2019
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text:	http://hdl.handle.net/10316/87310
Abstract:	Master's dissertation in Informatics Engineering presented to the Faculty of Sciences and Technology

Item metadata

id	RCAP_0c7cdcb24b22580aacfc79fdc05e6f67
oai_identifier_str	oai:estudogeral.uc.pt:10316/87310
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str	7160
spelling	Assessing the Fairness of Intelligent Systems / Avaliação da Fairness de Sistemas Inteligentes
Aprendizagem Computacional; Discriminação; Fairness; Sistemas Inteligentes; Tomada de Decisão; Decision Making; Discrimination; Fairness; Intelligent Systems; Machine Learning
Master's dissertation in Informatics Engineering presented to the Faculty of Sciences and Technology.
Nowadays, software systems based on Machine Learning models are ubiquitous and are often used in scenarios that directly affect people's lives. Consequently, societal and legal concerns arise, namely that decisions supported by the models' outputs may lead to the unfair treatment of individuals based on attributes such as race, age, or sex. In fact, fairness is one of the properties systems must have to comply with current legislation, such as the EU General Data Protection Regulation.
The main objective of this work is to assess the fairness of software systems based on Machine Learning models in classification scenarios. Data preparation and pre-processing are key in any Machine Learning pipeline, and their effect on fairness needed to be studied in detail. Thus, we assessed the impact of the encoding of categorical features, the removal of the sensitive attribute from the training data, and sampling methods such as random undersampling and random oversampling. The influence of the learning algorithm was also considered, with an initial evaluation of Decision Trees and Random Forests. Fairness was measured at different stages of the pipeline to identify the procedures with the greatest impact on it.
Our results show that sampling with respect to the true labels and opting for Random Forests over Decision Trees often have a negative effect on fairness. Although removing the sensitive attribute from the training data prevents direct discrimination, the models are often still able to exploit associations between this attribute and the remaining features, and the resulting classifications are sometimes even more unfair than the data itself. As a result, organisations must be aware of the trade-off between classification performance and fairness and assess it carefully.
H2020 2019-07-08 info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/masterThesis http://hdl.handle.net/10316/87310 http://hdl.handle.net/10316/87310 TID:202267180 eng Valentim, Inês Filipa Rente info:eu-repo/semantics/openAccess reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP 2020-05-25T03:37:59Z oai:estudogeral.uc.pt:10316/87310 Portal Agregador ONG https://www.rcaap.pt/oai/openaire opendoar:7160 2024-03-19T21:08:15.622580 Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação false
dc.title.none.fl_str_mv	Assessing the Fairness of Intelligent Systems Avaliação da Fairness de Sistemas Inteligentes
title	Assessing the Fairness of Intelligent Systems
spellingShingle	Assessing the Fairness of Intelligent Systems; Valentim, Inês Filipa Rente; Aprendizagem Computacional; Discriminação; Fairness; Sistemas Inteligentes; Tomada de Decisão; Decision Making; Discrimination; Fairness; Intelligent Systems; Machine Learning
title_short	Assessing the Fairness of Intelligent Systems
title_full	Assessing the Fairness of Intelligent Systems
title_fullStr	Assessing the Fairness of Intelligent Systems
title_full_unstemmed	Assessing the Fairness of Intelligent Systems
title_sort	Assessing the Fairness of Intelligent Systems
author	Valentim, Inês Filipa Rente
author_facet	Valentim, Inês Filipa Rente
author_role	author
dc.contributor.author.fl_str_mv	Valentim, Inês Filipa Rente
dc.subject.por.fl_str_mv	Aprendizagem Computacional; Discriminação; Fairness; Sistemas Inteligentes; Tomada de Decisão; Decision Making; Discrimination; Fairness; Intelligent Systems; Machine Learning
topic	Aprendizagem Computacional; Discriminação; Fairness; Sistemas Inteligentes; Tomada de Decisão; Decision Making; Discrimination; Fairness; Intelligent Systems; Machine Learning
description	Master's dissertation in Informatics Engineering presented to the Faculty of Sciences and Technology
publishDate	2019
dc.date.none.fl_str_mv	2019-07-08
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10316/87310 http://hdl.handle.net/10316/87310 TID:202267180
url	http://hdl.handle.net/10316/87310
identifier_str_mv	TID:202267180
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799133975092920320
