Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados
Main author: | Silva, Bruno Legora Souza da |
---|---|
Publication date: | 2022 |
Document type: | Thesis |
Language: | por |
Source title: | Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) |
Full text: | http://repositorio.ufes.br/handle/10/15980 |
Abstract: | Artificial Neural Networks have been applied to solve classification and regression problems, and their popularity has grown, especially since the proposal of the backpropagation algorithm for training them on datasets. In recent years, the volume of generated data and the increased processing power of computers and Graphics Processing Units (GPUs) have enabled the training of large (or deep) architectures, capable of extracting and predicting information from complex problems, although usually at a high computational cost. In contrast, fast algorithms were proposed to train small (shallow) architectures, such as the single hidden layer feedforward network (SLFN), which is nevertheless capable of approximating any continuous function. One of them is the Extreme Learning Machine (ELM), which has a fast, closed-form solution and has been applied in a wide range of applications, obtaining better performance than other methods, such as backpropagation-trained neural networks and Support Vector Machines (SVMs). Variants of ELM were proposed to address problems such as underfitting, overfitting and outliers, but they still struggle with large datasets and/or when more neurons are required to extract more information. Thus, Stacked ELM (S-ELM) was proposed, which stacks ELM-trained modules, reusing information from each module in the next one, and improves results on large datasets; however, it has limitations regarding memory consumption and is not adequate for problems that involve a single output, such as some regression tasks. Another stacked method, the Deep Stacked Network (DSN), has problems with training time and memory usage, but without the application limitation of Stacked ELM. Therefore, this work proposes to combine the DSN architecture with the ELM and Kernel ELM algorithms to obtain a model composed of small modules, with fast training and reduced memory usage, yet capable of performance similar to that of more complex models. We also propose a variation of this model that handles data arriving gradually, a setting called incremental learning (or online learning, in the ELM context). Extensive experiments were conducted to evaluate the proposed methods on regression and classification tasks; for the online approach, only regression tasks were considered. The results show that the methods can train stacked architectures with performance statistically equivalent to that of an SLFN with a large number of neurons (or of other online methods) when compared on error or accuracy metrics. Regarding training time, the proposed methods were faster in many cases. Regarding memory usage, some of the proposed methods were statistically better, which favors their use in environments with restricted hardware. |
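The abstract's central technical claim is that ELM training reduces to a single closed-form solve: the hidden-layer weights are drawn at random and frozen, and only the output weights are fitted by (regularized) least squares. Below is a minimal Python/NumPy sketch of that recipe; the function names, the tanh activation, and the ridge term `reg` are illustrative assumptions, not the exact formulation used in the thesis.

```python
import numpy as np

def train_elm(X, y, n_hidden=100, reg=1e-3, seed=None):
    """ELM recipe: random, fixed hidden layer; closed-form output weights."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are sampled once and never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # hidden activations, shape (n_samples, n_hidden)
    # Output weights: ridge-regularized least squares, solved in closed form.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

The stacked architectures the abstract discusses (S-ELM, DSN, and the proposed combination) reuse information from one module in the next. The following hedged sketch, built on the `train_elm`/`predict_elm` helpers above, shows only the generic DSN-style stacking pattern, appending each module's prediction to the original features before training the next module; the thesis's Fast DSN additionally shares weights between modules, which this sketch does not attempt to reproduce.

```python
def train_stack(X, y, n_modules=3, **elm_kw):
    """DSN-style stack: each module sees the raw features plus the
    previous module's prediction (assumes a single regression target)."""
    modules, Z = [], X
    for _ in range(n_modules):
        W, b, beta = train_elm(Z, y, **elm_kw)
        modules.append((W, b, beta))
        pred = predict_elm(Z, W, b, beta)
        # Next module's input: original features plus current prediction.
        Z = np.hstack([X, pred.reshape(X.shape[0], -1)])
    return modules

def predict_stack(X, modules):
    Z, pred = X, None
    for W, b, beta in modules:
        pred = predict_elm(Z, W, b, beta)
        Z = np.hstack([X, pred.reshape(X.shape[0], -1)])
    return pred  # output of the last module in the stack
```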
id |
UFES_41f0161911190787d97af02b31f03dbc |
oai_identifier_str |
oai:repositorio.ufes.br:10/15980 |
network_acronym_str |
UFES |
network_name_str |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) |
repository_id_str |
2108 |
spelling |
Advisor: Ciarelli, Patrick Marques. Author: Silva, Bruno Legora Souza da. Examining committee: Bastos Filho, Carmelo Jose Albanez; Pinto, Luiz Alberto; Cavalieri, Daniel Cruz; Rauber, Thomas Walter. Issued: 2022-03-18; deposited: 2024-05-30. Funding: Fundação Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). Publisher: Universidade Federal do Espírito Santo, Programa de Pós-Graduação em Engenharia Elétrica, Centro Tecnológico, UFES, BR. Subjects: Electrical Engineering; Deep Stacked Network; Extreme Learning Machine; Classification; Regression; Stacked Models; Incremental Learning. File: BrunoLegoraSouzadaSilva-2022-tese.pdf (application/pdf, 2482790 bytes, MD5 38dc15e2ee6884f6212315f164fa1c32), http://repositorio.ufes.br/bitstreams/1d913fae-a8e5-4c17-81b4-f5f228de4e6f/download |
dc.title.none.fl_str_mv |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
title |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
spellingShingle |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados; Silva, Bruno Legora Souza da; Electrical Engineering; Deep Stacked Network; Extreme Learning Machine; Classification; Regression; Stacked Models; Incremental Learning |
title_short |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
title_full |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
title_fullStr |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
title_full_unstemmed |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
title_sort |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
author |
Silva, Bruno Legora Souza da |
author_facet |
Silva, Bruno Legora Souza da |
author_role |
author |
dc.contributor.authorID.none.fl_str_mv |
https://orcid.org/0000-0003-1732-977X |
dc.contributor.authorLattes.none.fl_str_mv |
http://lattes.cnpq.br/8885770833300316 |
dc.contributor.advisor1.fl_str_mv |
Ciarelli, Patrick Marques |
dc.contributor.advisor1ID.fl_str_mv |
https://orcid.org/0000-0003-3177-4028 |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/1267950518719423 |
dc.contributor.author.fl_str_mv |
Silva, Bruno Legora Souza da |
dc.contributor.referee1.fl_str_mv |
Bastos Filho, Carmelo Jose Albanez |
dc.contributor.referee1ID.fl_str_mv |
https://orcid.org/0000-0002-0924-5341 |
dc.contributor.referee1Lattes.fl_str_mv |
http://lattes.cnpq.br/9745937989094036 |
dc.contributor.referee2.fl_str_mv |
Pinto, Luiz Alberto |
dc.contributor.referee2ID.fl_str_mv |
https://orcid.org/ |
dc.contributor.referee2Lattes.fl_str_mv |
http://lattes.cnpq.br/3550111932609658 |
dc.contributor.referee3.fl_str_mv |
Cavalieri, Daniel Cruz |
dc.contributor.referee3ID.fl_str_mv |
https://orcid.org/0000-0002-4916-1863 |
dc.contributor.referee3Lattes.fl_str_mv |
http://lattes.cnpq.br/9583314331960942 |
dc.contributor.referee4.fl_str_mv |
Rauber, Thomas Walter |
dc.contributor.referee4ID.fl_str_mv |
https://orcid.org/0000-0002-6380-6584 |
dc.contributor.referee4Lattes.fl_str_mv |
http://lattes.cnpq.br/0462549482032704 |
contributor_str_mv |
Ciarelli, Patrick Marques; Bastos Filho, Carmelo Jose Albanez; Pinto, Luiz Alberto; Cavalieri, Daniel Cruz; Rauber, Thomas Walter |
dc.subject.cnpq.fl_str_mv |
Electrical Engineering |
topic |
Electrical Engineering; Deep Stacked Network; Extreme Learning Machine; Classification; Regression; Stacked Models; Incremental Learning |
dc.subject.por.fl_str_mv |
Deep Stacked Network; Extreme Learning Machine; Classification; Regression; Stacked Models; Incremental Learning |
description |
Artificial Neural Networks have been applied to solve classification and regression problems, and their popularity has grown, especially since the proposal of the backpropagation algorithm for training them on datasets. In recent years, the volume of generated data and the increased processing power of computers and Graphics Processing Units (GPUs) have enabled the training of large (or deep) architectures, capable of extracting and predicting information from complex problems, although usually at a high computational cost. In contrast, fast algorithms were proposed to train small (shallow) architectures, such as the single hidden layer feedforward network (SLFN), which is nevertheless capable of approximating any continuous function. One of them is the Extreme Learning Machine (ELM), which has a fast, closed-form solution and has been applied in a wide range of applications, obtaining better performance than other methods, such as backpropagation-trained neural networks and Support Vector Machines (SVMs). Variants of ELM were proposed to address problems such as underfitting, overfitting and outliers, but they still struggle with large datasets and/or when more neurons are required to extract more information. Thus, Stacked ELM (S-ELM) was proposed, which stacks ELM-trained modules, reusing information from each module in the next one, and improves results on large datasets; however, it has limitations regarding memory consumption and is not adequate for problems that involve a single output, such as some regression tasks. Another stacked method, the Deep Stacked Network (DSN), has problems with training time and memory usage, but without the application limitation of Stacked ELM. Therefore, this work proposes to combine the DSN architecture with the ELM and Kernel ELM algorithms to obtain a model composed of small modules, with fast training and reduced memory usage, yet capable of performance similar to that of more complex models. We also propose a variation of this model that handles data arriving gradually, a setting called incremental learning (or online learning, in the ELM context). Extensive experiments were conducted to evaluate the proposed methods on regression and classification tasks; for the online approach, only regression tasks were considered. The results show that the methods can train stacked architectures with performance statistically equivalent to that of an SLFN with a large number of neurons (or of other online methods) when compared on error or accuracy metrics. Regarding training time, the proposed methods were faster in many cases. Regarding memory usage, some of the proposed methods were statistically better, which favors their use in environments with restricted hardware. |
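For the incremental (online, in the ELM sense) variant the abstract mentions, the standard device in the ELM literature is a recursive least-squares update, as in OS-ELM: an inverse covariance matrix of the hidden activations is carried along and folded together with each new chunk of data, so earlier samples never need to be revisited. A minimal sketch under that assumption follows; the function names and the ridge term are illustrative, not the thesis's exact method.

```python
import numpy as np

def os_elm_init(H0, y0, reg=1e-3):
    """Initial batch: the usual closed-form ELM solve, but P (the inverse
    of the regularized activation covariance) is kept for later updates."""
    P = np.linalg.inv(H0.T @ H0 + reg * np.eye(H0.shape[1]))
    return P, P @ H0.T @ y0

def os_elm_update(P, beta, H, y):
    """Fold one new chunk of activations H and targets y into (P, beta)
    via recursive least squares, without touching any earlier data."""
    K = P @ H.T @ np.linalg.inv(np.eye(H.shape[0]) + H @ P @ H.T)
    P = P - K @ H @ P
    beta = beta + P @ H.T @ (y - H @ beta)
    return P, beta
```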
publishDate |
2022 |
dc.date.issued.fl_str_mv |
2022-03-18 |
dc.date.accessioned.fl_str_mv |
2024-05-30T00:53:25Z |
dc.date.available.fl_str_mv |
2024-05-30T00:53:25Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://repositorio.ufes.br/handle/10/15980 |
url |
http://repositorio.ufes.br/handle/10/15980 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
Text |
dc.publisher.none.fl_str_mv |
Universidade Federal do Espírito Santo; Doutorado em Engenharia Elétrica |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Engenharia Elétrica |
dc.publisher.initials.fl_str_mv |
UFES |
dc.publisher.country.fl_str_mv |
BR |
dc.publisher.department.fl_str_mv |
Centro Tecnológico |
publisher.none.fl_str_mv |
Universidade Federal do Espírito Santo; Doutorado em Engenharia Elétrica |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) instname:Universidade Federal do Espírito Santo (UFES) instacron:UFES |
instname_str |
Universidade Federal do Espírito Santo (UFES) |
instacron_str |
UFES |
institution |
UFES |
reponame_str |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) |
collection |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) |
bitstream.url.fl_str_mv |
http://repositorio.ufes.br/bitstreams/1d913fae-a8e5-4c17-81b4-f5f228de4e6f/download |
bitstream.checksum.fl_str_mv |
38dc15e2ee6884f6212315f164fa1c32 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 |
repository.name.fl_str_mv |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) - Universidade Federal do Espírito Santo (UFES) |
repository.mail.fl_str_mv |
|
_version_ |
1813022507541725184 |