Guidelines for the Assessment of Black-box Interpretability Methods

Bibliographic details
Main author: Araujo, Gabriel Gazetta de
Publication date: 2022
Document type: Dissertation (master's thesis)
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da USP
Full text: https://www.teses.usp.br/teses/disponiveis/55/55134/tde-13102022-112418/
Abstract: With the rise of deep learning and other complex machine learning algorithms, ever more powerful models have been pursued in order to reach high accuracy across a variety of environments and applications. This search for accuracy has led to complex predictive models known as black boxes, which offer no access to their decision-making processes: such models provide little to no explanation of why a certain outcome was produced or of what influenced it. These drawbacks can be especially significant in sensitive scenarios, such as legal, social, medical, or financial applications, where a misclassified outcome, or even an outcome classified for the wrong reason, may have a tremendous impact. Driven by this concern, interpretability techniques have emerged that try to explain, through a variety of methods, the outcome of a black-box model or the reasoning behind the model itself, or that propose an inherently interpretable predictive algorithm altogether. However, these techniques are not yet well established and remain in constant development; likewise, their assessment is still immature: there is currently no consensus on how interpretability methods should be evaluated, or even on which properties they are supposed to satisfy. Addressing that gap, this work proposes a set of evaluation metrics that quantify three desired properties of interpretability techniques. These metrics can be used to assess interpretability techniques and to determine the best parameters, or the best technique, for a given experiment.
id USP_0f54d2ecd75526f82260e228b86f9a59
oai_identifier_str oai:teses.usp.br:tde-13102022-112418
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str 2721
dc.title.none.fl_str_mv Guidelines for the Assessment of Black-box Interpretability Methods
Diretrizes para avaliação de técnicas de Interpretabilidade de modelos Caixa-Preta
author Araujo, Gabriel Gazetta de
author_role author
dc.contributor.none.fl_str_mv Nonato, Luis Gustavo
dc.contributor.author.fl_str_mv Araujo, Gabriel Gazetta de
dc.subject.por.fl_str_mv Aprendizado de máquina
Aprendizado profundo
Assessment
Avaliação
Black-box
Deep learning
Interpretabilidade
Interpretability
Machine learning
Modelos caixa-preta
Neural networks
Redes neurais
publishDate 2022
dc.date.none.fl_str_mv 2022-08-08
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
dc.identifier.uri.fl_str_mv https://www.teses.usp.br/teses/disponiveis/55/55134/tde-13102022-112418/
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br || atendimento@aguia.usp.br
_version_ 1815256947711016960