Fast Deep Stacked Network: An Extreme Learning Machine-based algorithm for fast training of a stacked architecture with shared weights

Bibliographic details
Main author: Silva, Bruno Legora Souza da
Publication date: 2022
Document type: Doctoral thesis
Language: Portuguese (por)
Source title: Repositório Institucional da Universidade Federal do Espírito Santo (riUfes)
Full text: http://repositorio.ufes.br/handle/10/15980
Abstract: Artificial Neural Networks have been applied to solve classification and regression problems, and their popularity has grown, especially since the proposal of the backpropagation algorithm for training them on datasets. In recent years, the volume of generated data and the increased processing power of computers and Graphics Processing Units (GPUs) have enabled the training of large (or deep) architectures, capable of extracting and predicting information from complex problems, although such training is usually computationally expensive. In contrast, fast algorithms were proposed to train small (shallow) architectures, such as the single hidden layer feedforward network (SLFN), which is nevertheless capable of approximating any continuous function. One of them is the Extreme Learning Machine (ELM), which has a fast, closed-form solution and has been applied to a wide range of problems, obtaining better performance than other methods such as backpropagation-trained neural networks and Support Vector Machines (SVMs). Variants of ELM were proposed to address problems such as underfitting, overfitting, and outliers, but they still struggle with large datasets and/or when more neurons are required to extract more information. Stacked ELM (S-ELM) was therefore proposed: it stacks ELM-trained modules, feeding information from each module into the next one, which improves results on large datasets; however, it has limitations regarding memory consumption and is not adequate for problems that involve a single output, such as some regression tasks. Another stacked method, the Deep Stacked Network (DSN), has problems with training time and memory usage, but does not share the application limitation of Stacked ELM. Therefore, this work proposes to combine the DSN architecture with the ELM and Kernel ELM algorithms in order to obtain a model composed of small modules, with fast training and reduced memory usage, yet capable of achieving performance similar to that of more complex models. We also propose a variation of this model that deals with data arriving gradually, called incremental learning (or online learning, in the ELM context). Extensive experiments were conducted to evaluate the proposed methods on regression and classification tasks; for the online approach, only regression tasks were considered. The results show that the methods can train stacked architectures with performance statistically equivalent to that of an SLFN with a large number of neurons (or to other online methods) in terms of error or accuracy metrics. Regarding training time, the proposed methods were faster in many cases. Regarding memory usage, some of the proposed methods were statistically better, which favors their use in hardware-restricted environments.
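To make the ideas in the abstract concrete, below is a minimal, illustrative Python/NumPy sketch of the three building blocks it describes: ELM's closed-form training, a DSN-style stacking loop, and an OS-ELM-style recursive update for the online setting. This is not the thesis's actual Fast Deep Stacked Network implementation: the function names, the tanh activation, the plain pseudoinverse solution, and the exact way each module's output feeds the next are assumptions made for illustration only.

import numpy as np

def train_elm(X, T, n_hidden, rng):
    """Basic ELM sketch: random hidden layer, closed-form output weights.
    X: (n_samples, n_features); T: (n_samples, n_outputs)."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.standard_normal(n_hidden)                # random biases
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                     # closed-form least-squares solution
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

def train_stacked(X, T, n_modules, n_hidden, seed=0):
    """DSN-style stacking (assumption: each module sees the original input
    augmented with the previous module's prediction)."""
    rng = np.random.default_rng(seed)
    modules, Z = [], X
    for _ in range(n_modules):
        W, b, beta = train_elm(Z, T, n_hidden, rng)
        modules.append((W, b, beta))
        Z = np.hstack([X, predict_elm(Z, W, b, beta)])  # feed output into next module
    return modules

def oselm_update(P, beta, H, T):
    """OS-ELM-style recursive least-squares update for data arriving in chunks.
    P: inverse covariance of past hidden activations, (n_hidden, n_hidden);
    H, T: hidden activations and targets of the newly arrived chunk."""
    K = np.linalg.inv(np.eye(H.shape[0]) + H @ P @ H.T)
    P = P - P @ H.T @ K @ H @ P                      # Woodbury-style update of P
    beta = beta + P @ H.T @ (T - H @ beta)           # correct weights with the new chunk
    return P, beta

Each module here is a small SLFN, so training costs one pseudoinverse per module and memory holds only one small weight matrix per module, which reflects the fast-training, low-memory trade-off the abstract claims for the proposed approach.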
id UFES_41f0161911190787d97af02b31f03dbc
oai_identifier_str oai:repositorio.ufes.br:10/15980
network_acronym_str UFES
network_name_str Repositório Institucional da Universidade Federal do Espírito Santo (riUfes)
repository_id_str 2108
dc.title.none.fl_str_mv Fast Deep Stacked Network: An Extreme Learning Machine-based algorithm for fast training of a stacked architecture with shared weights
title Fast Deep Stacked Network: An Extreme Learning Machine-based algorithm for fast training of a stacked architecture with shared weights
spellingShingle Fast Deep Stacked Network: An Extreme Learning Machine-based algorithm for fast training of a stacked architecture with shared weights
Silva, Bruno Legora Souza da
Electrical Engineering
Deep Stacked Network
Extreme Learning Machine
Classification
Regression
Stacked Models
Incremental Learning
title_short Fast Deep Stacked Network: An Extreme Learning Machine-based algorithm for fast training of a stacked architecture with shared weights
title_full Fast Deep Stacked Network: An Extreme Learning Machine-based algorithm for fast training of a stacked architecture with shared weights
title_fullStr Fast Deep Stacked Network: An Extreme Learning Machine-based algorithm for fast training of a stacked architecture with shared weights
title_full_unstemmed Fast Deep Stacked Network: An Extreme Learning Machine-based algorithm for fast training of a stacked architecture with shared weights
title_sort Fast Deep Stacked Network: An Extreme Learning Machine-based algorithm for fast training of a stacked architecture with shared weights
author Silva, Bruno Legora Souza da
author_facet Silva, Bruno Legora Souza da
author_role author
dc.contributor.authorID.none.fl_str_mv https://orcid.org/0000-0003-1732-977X
dc.contributor.authorLattes.none.fl_str_mv http://lattes.cnpq.br/8885770833300316
dc.contributor.advisor1.fl_str_mv Ciarelli, Patrick Marques
dc.contributor.advisor1ID.fl_str_mv https://orcid.org/0000-0003-3177-4028
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/1267950518719423
dc.contributor.author.fl_str_mv Silva, Bruno Legora Souza da
dc.contributor.referee1.fl_str_mv Bastos Filho, Carmelo Jose Albanez
dc.contributor.referee1ID.fl_str_mv https://orcid.org/0000-0002-0924-5341
dc.contributor.referee1Lattes.fl_str_mv http://lattes.cnpq.br/9745937989094036
dc.contributor.referee2.fl_str_mv Pinto, Luiz Alberto
dc.contributor.referee2ID.fl_str_mv https://orcid.org/
dc.contributor.referee2Lattes.fl_str_mv http://lattes.cnpq.br/3550111932609658
dc.contributor.referee3.fl_str_mv Cavalieri, Daniel Cruz
dc.contributor.referee3ID.fl_str_mv https://orcid.org/0000-0002-4916-1863
dc.contributor.referee3Lattes.fl_str_mv http://lattes.cnpq.br/9583314331960942
dc.contributor.referee4.fl_str_mv Rauber, Thomas Walter
dc.contributor.referee4ID.fl_str_mv https://orcid.org/0000-0002-6380-6584
dc.contributor.referee4Lattes.fl_str_mv http://lattes.cnpq.br/0462549482032704
contributor_str_mv Ciarelli, Patrick Marques
Bastos Filho, Carmelo Jose Albanez
Pinto, Luiz Alberto
Cavalieri, Daniel Cruz
Rauber, Thomas Walter
dc.subject.cnpq.fl_str_mv Electrical Engineering
topic Electrical Engineering
Deep Stacked Network
Extreme Learning Machine
Classification
Regression
Stacked Models
Incremental Learning
dc.subject.por.fl_str_mv Deep Stacked Network
Extreme Learning Machine
Classification
Regression
Stacked Models
Incremental Learning
description Artificial Neural Networks have been applied to solve classification and regression problems, and their popularity has grown, especially since the proposal of the backpropagation algorithm for training them on datasets. In recent years, the volume of generated data and the increased processing power of computers and Graphics Processing Units (GPUs) have enabled the training of large (or deep) architectures, capable of extracting and predicting information from complex problems, although such training is usually computationally expensive. In contrast, fast algorithms were proposed to train small (shallow) architectures, such as the single hidden layer feedforward network (SLFN), which is nevertheless capable of approximating any continuous function. One of them is the Extreme Learning Machine (ELM), which has a fast, closed-form solution and has been applied to a wide range of problems, obtaining better performance than other methods such as backpropagation-trained neural networks and Support Vector Machines (SVMs). Variants of ELM were proposed to address problems such as underfitting, overfitting, and outliers, but they still struggle with large datasets and/or when more neurons are required to extract more information. Stacked ELM (S-ELM) was therefore proposed: it stacks ELM-trained modules, feeding information from each module into the next one, which improves results on large datasets; however, it has limitations regarding memory consumption and is not adequate for problems that involve a single output, such as some regression tasks. Another stacked method, the Deep Stacked Network (DSN), has problems with training time and memory usage, but does not share the application limitation of Stacked ELM. Therefore, this work proposes to combine the DSN architecture with the ELM and Kernel ELM algorithms in order to obtain a model composed of small modules, with fast training and reduced memory usage, yet capable of achieving performance similar to that of more complex models. We also propose a variation of this model that deals with data arriving gradually, called incremental learning (or online learning, in the ELM context). Extensive experiments were conducted to evaluate the proposed methods on regression and classification tasks; for the online approach, only regression tasks were considered. The results show that the methods can train stacked architectures with performance statistically equivalent to that of an SLFN with a large number of neurons (or to other online methods) in terms of error or accuracy metrics. Regarding training time, the proposed methods were faster in many cases. Regarding memory usage, some of the proposed methods were statistically better, which favors their use in hardware-restricted environments.
publishDate 2022
dc.date.issued.fl_str_mv 2022-03-18
dc.date.accessioned.fl_str_mv 2024-05-30T00:53:25Z
dc.date.available.fl_str_mv 2024-05-30T00:53:25Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://repositorio.ufes.br/handle/10/15980
url http://repositorio.ufes.br/handle/10/15980
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv Text
dc.publisher.none.fl_str_mv Universidade Federal do Espírito Santo
Doutorado em Engenharia Elétrica
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Engenharia Elétrica
dc.publisher.initials.fl_str_mv UFES
dc.publisher.country.fl_str_mv BR
dc.publisher.department.fl_str_mv Centro Tecnológico
publisher.none.fl_str_mv Universidade Federal do Espírito Santo
Doutorado em Engenharia Elétrica
dc.source.none.fl_str_mv reponame:Repositório Institucional da Universidade Federal do Espírito Santo (riUfes)
instname:Universidade Federal do Espírito Santo (UFES)
instacron:UFES
instname_str Universidade Federal do Espírito Santo (UFES)
instacron_str UFES
institution UFES
reponame_str Repositório Institucional da Universidade Federal do Espírito Santo (riUfes)
collection Repositório Institucional da Universidade Federal do Espírito Santo (riUfes)
bitstream.url.fl_str_mv http://repositorio.ufes.br/bitstreams/1d913fae-a8e5-4c17-81b4-f5f228de4e6f/download
bitstream.checksum.fl_str_mv 38dc15e2ee6884f6212315f164fa1c32
bitstream.checksumAlgorithm.fl_str_mv MD5
repository.name.fl_str_mv Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) - Universidade Federal do Espírito Santo (UFES)
repository.mail.fl_str_mv
_version_ 1813022507541725184