Constrained graph-based semi-supervised learning with higher order regularization

Bibliographic details
Main author: Sousa, Celso Andre Rodrigues de
Publication date: 2017
Document type: Thesis
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da USP
Full text: http://www.teses.usp.br/teses/disponiveis/55/55134/tde-08122017-102557/
Abstract: Graph-based semi-supervised learning (SSL) algorithms have been widely studied in the last few years. Most of these algorithms were designed from unconstrained optimization problems that use a Laplacian regularizer term as a smoothness functional, in an attempt to reflect the intrinsic geometric structure of the data's marginal distribution. Although a number of recent research papers still focus on unconstrained methods for graph-based SSL, a recent statistical analysis showed that many of these algorithms may be unstable on transductive regression. Therefore, we focus on providing new constrained methods for graph-based SSL. We begin by analyzing the regularization framework of existing unconstrained methods. Then, we incorporate two normalization constraints into the optimization problem of three of these methods. We show that the proposed optimization problems have closed-form solutions. By generalizing one of these constraints to any distribution, we provide generalized methods for constrained graph-based SSL. The proposed methods have a more flexible regularization framework than the corresponding unconstrained methods. More precisely, our methods can deal with any graph Laplacian and use higher order regularization, which is effective on general SSL tasks. To show the effectiveness of the proposed methods, we provide comprehensive experimental analyses. Specifically, our experiments are subdivided into two parts. In the first part, we evaluate existing graph-based SSL algorithms on time series data to find their weaknesses. In the second part, we evaluate the proposed constrained methods against six state-of-the-art graph-based SSL algorithms on benchmark data sets. Since the widely used best-case analysis may hide useful information about the SSL algorithms' performance with respect to parameter selection, we used recently proposed empirical evaluation models to evaluate our results. Our results show that our methods outperform the competing methods on most parameter settings and graph construction methods. However, we found a few experimental settings in which our methods showed poor performance. To facilitate the reproduction of our results, the source code, data sets, and experimental results are freely available.
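For readers unfamiliar with the setting the abstract describes, the following is a minimal sketch of a standard unconstrained Laplacian-regularized estimator with a higher order term. It is not the thesis's constrained method. It assumes the objective ||F - Y||^2 + lam * trace(F^T L^p F), whose minimizer has the closed form F = (I + lam * L^p)^{-1} Y, with L the combinatorial Laplacian of a symmetric weight matrix. All names here (`propagate`, `lam`, `order`, the toy path graph) are hypothetical and chosen for illustration only.

```python
def mat_mul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def laplacian(w):
    """Combinatorial graph Laplacian L = D - W of a symmetric weight matrix."""
    n = len(w)
    return [[(sum(w[i]) if i == j else 0.0) - w[i][j] for j in range(n)]
            for i in range(n)]


def mat_pow(m, p):
    """Raise a square matrix to a non-negative integer power."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(p):
        result = mat_mul(result, m)
    return result


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(v) for v in b[i]]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc
                          for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def propagate(w, y, lam=1.0, order=1):
    """Closed-form scores F = (I + lam * L^order)^{-1} Y (assumed objective)."""
    n = len(w)
    lp = mat_pow(laplacian(w), order)
    system = [[float(i == j) + lam * lp[i][j] for j in range(n)]
              for i in range(n)]
    return solve(system, y)


def predict(f):
    """Assign each node the class with the largest score."""
    return [max(range(len(row)), key=row.__getitem__) for row in f]


# Toy example: a path graph 0 - 1 - 2 - 3 with the two end nodes labeled.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
y = [[1, 0], [0, 0], [0, 0], [0, 1]]  # zero rows mark unlabeled nodes
print(predict(propagate(w, y, lam=1.0, order=2)))
```

On this toy graph each unlabeled node receives the class of the nearer labeled end node. Setting `order` above 1 replaces L with a power of L, which is one common reading of "higher order regularization"; the constraints studied in the thesis are not modeled here.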
id USP_c7b51cf4cd34419867427e920bec0ee1
oai_identifier_str oai:teses.usp.br:tde-08122017-102557
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str 2721
dc.title.none.fl_str_mv Constrained graph-based semi-supervised learning with higher order regularization
Aprendizado semissupervisionado restrito baseado em grafos com regularização de ordem elevada
author Sousa, Celso Andre Rodrigues de
author_facet Sousa, Celso Andre Rodrigues de
author_role author
dc.contributor.none.fl_str_mv Batista, Gustavo Enrique de Almeida Prado Alves
dc.contributor.author.fl_str_mv Sousa, Celso Andre Rodrigues de
dc.subject.por.fl_str_mv Aprendizado semissupervisionado
Constrained optimization
Graph-based methods
Higher order regularization
Métodos baseados em grafos
Otimização restrita
Regularização de ordem elevada
Semi-supervised learning
publishDate 2017
dc.date.none.fl_str_mv 2017-08-10
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://www.teses.usp.br/teses/disponiveis/55/55134/tde-08122017-102557/
url http://www.teses.usp.br/teses/disponiveis/55/55134/tde-08122017-102557/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Release the content for public access.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1815256818782306304