Ensemble learning through Rashomon sets

Bibliographic details
Main author: Gianlucca Lodron Zuin
Publication date: 2023
Document type: Thesis
Language: eng
Source title: Repositório Institucional da UFMG
Full text: http://hdl.handle.net/1843/52748
https://orcid.org/0000-0002-0429-3280
Abstract: Creating models from previous observations and ensuring effectiveness on new data is the essence of machine learning. Estimating the generalization error of a trained model is therefore a crucial step. Despite the existence of many capacity measures that approximate the generalization power of trained models, it remains challenging to select models that generalize to future data. In this work, we investigate how models perform on datasets that have different underlying generator functions but constitute correlated tasks. The key motivation is to study the Rashomon Effect, which appears whenever the learning problem admits a set of models that all perform roughly equally well. Many real-world problems are characterized by multiple local structures in the data space; as a result, the corresponding learning problem has a non-convex error surface with no obvious global minimum, implying a multiplicity of performant models, each providing a different explanation, a setting which the literature suggests is subject to the Rashomon Effect. Through an empirical study across different datasets, we devise a strategy focused primarily on model explainability (i.e., feature importance). Our approach to dealing with the Rashomon Effect is to stratify models, during training, into groups that are either coherent or contrasting. From these Rashomon groups, we can select models that increase the robustness of production responses and provide a means to gauge data drift. By locating these models and creating an ensemble in which each constituent covers an independent solution sub-space, we obtain performance gains in most of the evaluated scenarios. We validate our approach through a series of experiments on both closed and open-source benchmark suites, and we give insights into possible applications by analyzing real-world case studies in which our framework was employed successfully.
Not only does our approach prove superior to state-of-the-art tree-based ensembling techniques, with gains in AUC of up to 0.20, but the constituent models are highly explainable and allow humans to be integrated into the decision-making pipeline, thus empowering them.
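The general idea sketched in the abstract can be illustrated in a few lines. The following is a minimal sketch, not the thesis's exact algorithm: it approximates a Rashomon set by keeping models whose AUC is within a small tolerance of the best, groups them by cosine similarity of their feature-importance vectors (coherent groups close together, contrasting groups far apart), and ensembles one representative per group. The model family, the 0.01 AUC tolerance, and the 0.2 clustering threshold are all illustrative assumptions.

```python
# Illustrative sketch of a Rashomon-set ensemble (assumptions: random-forest
# pool, AUC tolerance 0.01, cosine-distance clustering threshold 0.2).
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Train a pool of candidate models that differ only in their random seed.
pool = [RandomForestClassifier(n_estimators=50, random_state=s).fit(X_tr, y_tr)
        for s in range(10)]

# Keep models within a small AUC tolerance of the best one: an empirical
# approximation of the Rashomon set (all perform roughly equally well).
aucs = np.array([roc_auc_score(y_te, m.predict_proba(X_te)[:, 1]) for m in pool])
rashomon = [m for m, a in zip(pool, aucs) if a >= aucs.max() - 0.01]

# Stratify the Rashomon set into groups by feature-importance similarity:
# models in one cluster are "coherent", models across clusters "contrasting".
imps = np.array([m.feature_importances_ for m in rashomon])
if len(rashomon) > 1:
    labels = fcluster(linkage(pdist(imps, metric="cosine"), method="average"),
                      t=0.2, criterion="distance")
else:
    labels = np.array([1])

# Pick one representative per group and average their predicted probabilities,
# so each constituent covers a different explanation of the data.
reps = [next(m for m, l in zip(rashomon, labels) if l == g)
        for g in np.unique(labels)]
proba = np.mean([m.predict_proba(X_te)[:, 1] for m in reps], axis=0)
print("ensemble AUC:", roc_auc_score(y_te, proba))
```

In practice the selection step would use a held-out validation split rather than the test set, and the importance vectors could come from any attribution method; the hierarchical clustering here merely stands in for the stratification the thesis describes.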
id UFMG_9e3e9336dbe10d208faea082e2a962fd
oai_identifier_str oai:repositorio.ufmg.br:1843/52748
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
repository_id_str
Funding: CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico; FAPEMIG - Fundação de Amparo à Pesquisa do Estado de Minas Gerais
dc.title.pt_BR.fl_str_mv Ensemble learning through Rashomon sets
title Ensemble learning through Rashomon sets
spellingShingle Ensemble learning through Rashomon sets
Gianlucca Lodron Zuin
Rashomon Effect
Ensemble Learning
Data Drift
title_short Ensemble learning through Rashomon sets
title_full Ensemble learning through Rashomon sets
title_fullStr Ensemble learning through Rashomon sets
title_full_unstemmed Ensemble learning through Rashomon sets
title_sort Ensemble learning through Rashomon sets
author Gianlucca Lodron Zuin
author_facet Gianlucca Lodron Zuin
author_role author
dc.contributor.advisor1.fl_str_mv Adriano Alonso Veloso
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/9973021912226739
dc.contributor.referee1.fl_str_mv Wagner Meira Júnior
dc.contributor.referee2.fl_str_mv Nivio Ziviani
dc.contributor.referee3.fl_str_mv Paulo Najberg Orenstein
dc.contributor.referee4.fl_str_mv Ram Rajagopal
dc.contributor.referee5.fl_str_mv Rafael Bordini
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/5374827345329774
dc.contributor.author.fl_str_mv Gianlucca Lodron Zuin
contributor_str_mv Adriano Alonso Veloso
Wagner Meira Júnior
Nivio Ziviani
Paulo Najberg Orenstein
Ram Rajagopal
Rafael Bordini
dc.subject.por.fl_str_mv Rashomon Effect
Ensemble Learning
Data Drift
topic Rashomon Effect
Ensemble Learning
Data Drift
description Creating models from previous observations and ensuring effectiveness on new data is the essence of machine learning. Estimating the generalization error of a trained model is therefore a crucial step. Despite the existence of many capacity measures that approximate the generalization power of trained models, it remains challenging to select models that generalize to future data. In this work, we investigate how models perform on datasets that have different underlying generator functions but constitute correlated tasks. The key motivation is to study the Rashomon Effect, which appears whenever the learning problem admits a set of models that all perform roughly equally well. Many real-world problems are characterized by multiple local structures in the data space; as a result, the corresponding learning problem has a non-convex error surface with no obvious global minimum, implying a multiplicity of performant models, each providing a different explanation, a setting which the literature suggests is subject to the Rashomon Effect. Through an empirical study across different datasets, we devise a strategy focused primarily on model explainability (i.e., feature importance). Our approach to dealing with the Rashomon Effect is to stratify models, during training, into groups that are either coherent or contrasting. From these Rashomon groups, we can select models that increase the robustness of production responses and provide a means to gauge data drift. By locating these models and creating an ensemble in which each constituent covers an independent solution sub-space, we obtain performance gains in most of the evaluated scenarios. We validate our approach through a series of experiments on both closed and open-source benchmark suites, and we give insights into possible applications by analyzing real-world case studies in which our framework was employed successfully.
Not only does our approach prove superior to state-of-the-art tree-based ensembling techniques, with gains in AUC of up to 0.20, but the constituent models are highly explainable and allow humans to be integrated into the decision-making pipeline, thus empowering them.
publishDate 2023
dc.date.accessioned.fl_str_mv 2023-05-03T14:49:44Z
dc.date.available.fl_str_mv 2023-05-03T14:49:44Z
dc.date.issued.fl_str_mv 2023-01-05
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1843/52748
dc.identifier.orcid.pt_BR.fl_str_mv https://orcid.org/0000-0002-0429-3280
url http://hdl.handle.net/1843/52748
https://orcid.org/0000-0002-0429-3280
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv http://creativecommons.org/licenses/by/3.0/pt/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by/3.0/pt/
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv UFMG
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO
publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
instname_str Universidade Federal de Minas Gerais (UFMG)
instacron_str UFMG
institution UFMG
reponame_str Repositório Institucional da UFMG
collection Repositório Institucional da UFMG
bitstream.url.fl_str_mv https://repositorio.ufmg.br/bitstream/1843/52748/1/main.pdf
https://repositorio.ufmg.br/bitstream/1843/52748/2/license_rdf
https://repositorio.ufmg.br/bitstream/1843/52748/3/license.txt
bitstream.checksum.fl_str_mv 22bd23173f0d09754814755ce1c1744c
f9944a358a0c32770bd9bed185bb5395
cda590c95a0b51b4d15f60c9642ca272
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
repository.mail.fl_str_mv
_version_ 1803589381536088064