Understanding the search space of methods for automatically designing graph neural networks

Bibliographic details
Main author: Matheus Henrique do Nascimento Nunes
Publication date: 2021
Document type: Master's thesis (Dissertação)
Language: eng
Source: Repositório Institucional da UFMG
Full text: http://hdl.handle.net/1843/47526
https://orcid.org/0000-0001-5975-7903
Abstract: Graph-structured data has become increasingly available and, due to its ubiquity, an object of study in many areas of research. Because graphs lack a natural notion of sequence, Machine Learning (ML) methods have historically struggled with this kind of data. Specialized methods for performing ML over graph data have therefore attracted considerable attention from the research community, especially Graph Neural Networks (GNNs), which have been applied extensively to real-world data, achieving state-of-the-art results in tasks such as circuit design, movie recommendation, and anomaly detection. Many GNN models have been proposed recently, and choosing the best model for each problem has become a cumbersome and error-prone task. To mitigate this problem, recent works have proposed strategies for applying Neural Architecture Search (NAS), a set of methods designed to configure neural networks automatically and very successful on Convolutional Neural Networks (CNNs), which deal with image data, to GNN models. Automatically configured GNNs have achieved high performance, surpassing human-crafted ones. However, the literature on NAS for GNNs is still in its early stages, and methods that have been successfully applied to NAS for CNNs have yet to be tested on GNNs. In this work we conducted a comprehensive comparative analysis of a proposed Evolutionary Algorithm against a Reinforcement Learning method from the literature and a simple Random Search baseline, considering seven real-world datasets and two search spaces. We showed that Random Search is just as effective as the more complex methods at finding well-performing architectures. Another interesting finding is that all three search methods converge very early in the search (after about 10% of the budget). We hypothesized that this may be due to the presence of neutrality (regions in which all solutions have very similar performance values) in the search space.
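The first part's central claim, that uniform Random Search matches more elaborate NAS methods on these search spaces, is easy to picture as code. The sketch below is purely illustrative (the option names and the `evaluate` callback are hypothetical stand-ins, not the dissertation's actual search space or training loop):

```python
import random

# Toy illustration: a discrete GNN architecture search space is a
# cross-product of design choices; Random Search samples it uniformly.
SEARCH_SPACE = {                     # hypothetical option names
    "aggregation": ["sum", "mean", "max"],
    "activation": ["relu", "tanh", "elu"],
    "hidden_dim": [16, 64, 256],
    "num_layers": [2, 3, 4],
}

def sample_architecture(rng):
    """Draw one architecture uniformly at random from the space."""
    return {key: rng.choice(options) for key, options in SEARCH_SPACE.items()}

def random_search(evaluate, budget, seed=0):
    """Return the best (architecture, score) found within the budget."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(budget):
        arch = sample_architecture(rng)
        score = evaluate(arch)       # e.g. validation accuracy of the trained GNN
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score
```

The `evaluate` callback would wrap training and validating one GNN configuration, which is where virtually all of the search cost lies; the search logic itself is trivial.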
In the second part of this work, we conducted an extensive visual and analytical evaluation of one of the literature's search spaces, using dimensionality reduction and Fitness Landscape Analysis techniques. We demonstrated that the search space for NAS in GNNs presents high searchability (i.e., it is not difficult for algorithms to explore it and find a suitable solution) and neutrality (i.e., there are many regions of the search space in which the performance of neighboring solutions is roughly equal). We hypothesize that, in the future, less expensive methods can be used to perform this optimization without loss of generality.
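The neutrality notion used in both parts, neighborhoods in which fitness barely changes, can be made concrete with a toy metric. This sketch is an assumed, simplified stand-in, not the dissertation's actual Fitness Landscape Analysis measures:

```python
def neighbors(arch, search_space):
    """All architectures differing from `arch` in exactly one choice."""
    for key, options in search_space.items():
        for option in options:
            if option != arch[key]:
                yield {**arch, key: option}

def neutrality_degree(arch, fitness, search_space, tol=0.01):
    """Fraction of neighbors whose fitness is within `tol` of arch's.

    A value near 1.0 means the architecture sits on a neutral plateau:
    single-choice edits barely move the performance.
    """
    base = fitness(arch)
    nbrs = list(neighbors(arch, search_space))
    close = sum(1 for n in nbrs if abs(fitness(n) - base) <= tol)
    return close / len(nbrs)
```

Averaging such a degree over sampled architectures gives one rough signal of the plateau structure that would explain why all three search methods converge within about 10% of the budget.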
Funding: CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico
File: dissertacao_fixed_pdfa.pdf (application/pdf, 9,778,665 bytes)
dc.title.pt_BR.fl_str_mv Understanding the search space of methods for automatically designing graph neural networks
dc.title.alternative.pt_BR.fl_str_mv Uma análise do espaço de busca de métodos para o design automático de graph neural networks
dc.contributor.advisor1.fl_str_mv Gisele Lobo Pappa
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/5936682335701497
dc.contributor.referee1.fl_str_mv Fabrício Murai Ferreira
dc.contributor.referee2.fl_str_mv Nuno Lourenço
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/9801186721884441
dc.contributor.author.fl_str_mv Matheus Henrique do Nascimento Nunes
contributor_str_mv Gisele Lobo Pappa
Fabrício Murai Ferreira
Nuno Lourenço
dc.subject.por.fl_str_mv Graph Neural Networks
Automated Machine Learning
Neural Architecture Search
dc.subject.other.pt_BR.fl_str_mv Computação - Teses
Redes neurais ( Computação) - Teses
Aprendizado de máquina - Teses
dc.date.issued.fl_str_mv 2021-12-07
dc.date.accessioned.fl_str_mv 2022-11-29T12:32:33Z
dc.date.available.fl_str_mv 2022-11-29T12:32:33Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1843/47526
dc.identifier.orcid.pt_BR.fl_str_mv https://orcid.org/0000-0001-5975-7903
dc.language.iso.fl_str_mv eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv UFMG
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
bitstream.url.fl_str_mv https://repositorio.ufmg.br/bitstream/1843/47526/3/dissertacao_fixed_pdfa.pdf
https://repositorio.ufmg.br/bitstream/1843/47526/4/license.txt
bitstream.checksum.fl_str_mv 0b47e93ca63eebc7b7dacc6d79fc0d47
cda590c95a0b51b4d15f60c9642ca272
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)