Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados
Main author: | Silva, Bruno Legora Souza da |
---|---|
Publication date: | 2022 |
Document type: | Thesis |
Language: | por |
Source title: | Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) |
Full text: | http://repositorio.ufes.br/handle/10/15980 |
Abstract: | Artificial Neural Networks have been applied to solve classification and regression problems, and their popularity has grown, especially since the proposal of the backpropagation algorithm for training them on datasets. In recent years, the volume of generated data and the increased processing power of computers and Graphics Processing Units (GPUs) have enabled the training of large (or deep) architectures, capable of extracting and predicting information from complex problems, although usually at a high computational cost. In contrast, fast algorithms were proposed to train small (shallow) architectures, such as the single hidden layer feedforward network (SLFN), which is nevertheless capable of approximating any continuous function. One of them is the Extreme Learning Machine (ELM), which has a fast, closed-form solution and has been applied in a wide range of applications, obtaining better performance than other methods, such as backpropagation-trained neural networks and Support Vector Machines (SVMs). Variants of ELM were proposed to address problems such as underfitting, overfitting and outliers, but they still struggle with large datasets and/or when more neurons are required to extract more information. Thus, Stacked ELM (S-ELM) was proposed, which stacks ELM-trained modules, reusing information from each module in the next one, and improves results on large datasets; however, it has limitations regarding memory consumption and is not adequate for problems that involve a single output, such as some regression tasks. Another stacked method, the Deep Stacked Network (DSN), has problems with training time and memory usage, but without the application limitation of Stacked ELM. Therefore, this work proposes to combine the DSN architecture with the ELM and Kernel ELM algorithms to obtain a model composed of small modules, with fast training and reduced memory usage, yet capable of performance similar to that of more complex models. We also propose a variation of this model that handles data arriving gradually, a setting called incremental learning (or online learning, in the ELM context). Extensive experiments were conducted to evaluate the proposed methods on regression and classification tasks; for the online approach, only regression tasks were considered. The results show that the methods can train stacked architectures with performance statistically equivalent to that of an SLFN with a large number of neurons (or of other online methods) when compared on error or accuracy metrics. Regarding training time, the proposed methods were faster in many cases. Regarding memory usage, some of the proposed methods were statistically better, which favors their use in environments with restricted hardware. |
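The abstract's central technical claim is that ELM training reduces to a single closed-form solve: the hidden-layer weights are drawn at random and frozen, and only the output weights are fitted by (regularized) least squares. Below is a minimal Python/NumPy sketch of that recipe; the function names, the tanh activation, and the ridge term `reg` are illustrative assumptions, not the exact formulation used in the thesis.

```python
import numpy as np

def train_elm(X, y, n_hidden=100, reg=1e-3, seed=None):
    """ELM recipe: random, fixed hidden layer; closed-form output weights."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are sampled once and never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # hidden activations, shape (n_samples, n_hidden)
    # Output weights: ridge-regularized least squares, solved in closed form.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

The stacked architectures the abstract discusses (S-ELM, DSN, and the proposed combination) reuse information from one module in the next. The following hedged sketch, built on the `train_elm`/`predict_elm` helpers above, shows only the generic DSN-style stacking pattern, appending each module's prediction to the original features before training the next module; the thesis's Fast DSN additionally shares weights between modules, which this sketch does not attempt to reproduce.

```python
def train_stack(X, y, n_modules=3, **elm_kw):
    """DSN-style stack: each module sees the raw features plus the
    previous module's prediction (assumes a single regression target)."""
    modules, Z = [], X
    for _ in range(n_modules):
        W, b, beta = train_elm(Z, y, **elm_kw)
        modules.append((W, b, beta))
        pred = predict_elm(Z, W, b, beta)
        # Next module's input: original features plus current prediction.
        Z = np.hstack([X, pred.reshape(X.shape[0], -1)])
    return modules

def predict_stack(X, modules):
    Z, pred = X, None
    for W, b, beta in modules:
        pred = predict_elm(Z, W, b, beta)
        Z = np.hstack([X, pred.reshape(X.shape[0], -1)])
    return pred  # output of the last module in the stack
```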
id |
UFES_41f0161911190787d97af02b31f03dbc |
oai_identifier_str |
oai:repositorio.ufes.br:10/15980 |
network_acronym_str |
UFES |
network_name_str |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) |
repository_id_str |
2108 |
spelling |
Advisor: Ciarelli, Patrick Marques. Author: Silva, Bruno Legora Souza da. Examining committee: Bastos Filho, Carmelo Jose Albanez; Pinto, Luiz Alberto; Cavalieri, Daniel Cruz; Rauber, Thomas Walter. Issued: 2022-03-18; deposited: 2024-05-30. Funding: Fundação Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). Publisher: Universidade Federal do Espírito Santo, Programa de Pós-Graduação em Engenharia Elétrica, Centro Tecnológico, UFES, BR. Subjects: Electrical Engineering; Deep Stacked Network; Extreme Learning Machine; Classification; Regression; Stacked Models; Incremental Learning. File: BrunoLegoraSouzadaSilva-2022-tese.pdf (application/pdf, 2482790 bytes, MD5 38dc15e2ee6884f6212315f164fa1c32), http://repositorio.ufes.br/bitstreams/1d913fae-a8e5-4c17-81b4-f5f228de4e6f/download |
dc.title.none.fl_str_mv |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
title |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
spellingShingle |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados; Silva, Bruno Legora Souza da; Electrical Engineering; Deep Stacked Network; Extreme Learning Machine; Classification; Regression; Stacked Models; Incremental Learning |
title_short |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
title_full |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
title_fullStr |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
title_full_unstemmed |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
title_sort |
Fast Deep Stacked Network: Um algoritmo baseado em Extreme Learning Machine para treinamento rápido de uma arquitetura empilhada com pesos compartilhados |
author |
Silva, Bruno Legora Souza da |
author_facet |
Silva, Bruno Legora Souza da |
author_role |
author |
dc.contributor.authorID.none.fl_str_mv |
https://orcid.org/0000-0003-1732-977X |
dc.contributor.authorLattes.none.fl_str_mv |
http://lattes.cnpq.br/8885770833300316 |
dc.contributor.advisor1.fl_str_mv |
Ciarelli, Patrick Marques |
dc.contributor.advisor1ID.fl_str_mv |
https://orcid.org/0000-0003-3177-4028 |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/1267950518719423 |
dc.contributor.author.fl_str_mv |
Silva, Bruno Legora Souza da |
dc.contributor.referee1.fl_str_mv |
Bastos Filho, Carmelo Jose Albanez |
dc.contributor.referee1ID.fl_str_mv |
https://orcid.org/0000-0002-0924-5341 |
dc.contributor.referee1Lattes.fl_str_mv |
http://lattes.cnpq.br/9745937989094036 |
dc.contributor.referee2.fl_str_mv |
Pinto, Luiz Alberto |
dc.contributor.referee2ID.fl_str_mv |
https://orcid.org/ |
dc.contributor.referee2Lattes.fl_str_mv |
http://lattes.cnpq.br/3550111932609658 |
dc.contributor.referee3.fl_str_mv |
Cavalieri, Daniel Cruz |
dc.contributor.referee3ID.fl_str_mv |
https://orcid.org/0000-0002-4916-1863 |
dc.contributor.referee3Lattes.fl_str_mv |
http://lattes.cnpq.br/9583314331960942 |
dc.contributor.referee4.fl_str_mv |
Rauber, Thomas Walter |
dc.contributor.referee4ID.fl_str_mv |
https://orcid.org/0000-0002-6380-6584 |
dc.contributor.referee4Lattes.fl_str_mv |
http://lattes.cnpq.br/0462549482032704 |
contributor_str_mv |
Ciarelli, Patrick Marques; Bastos Filho, Carmelo Jose Albanez; Pinto, Luiz Alberto; Cavalieri, Daniel Cruz; Rauber, Thomas Walter |
dc.subject.cnpq.fl_str_mv |
Electrical Engineering |
topic |
Electrical Engineering; Deep Stacked Network; Extreme Learning Machine; Classification; Regression; Stacked Models; Incremental Learning |
dc.subject.por.fl_str_mv |
Deep Stacked Network; Extreme Learning Machine; Classification; Regression; Stacked Models; Incremental Learning |
description |
Artificial Neural Networks have been applied to solve classification and regression problems, and their popularity has grown, especially since the proposal of the backpropagation algorithm for training them on datasets. In recent years, the volume of generated data and the increased processing power of computers and Graphics Processing Units (GPUs) have enabled the training of large (or deep) architectures, capable of extracting and predicting information from complex problems, although usually at a high computational cost. In contrast, fast algorithms were proposed to train small (shallow) architectures, such as the single hidden layer feedforward network (SLFN), which is nevertheless capable of approximating any continuous function. One of them is the Extreme Learning Machine (ELM), which has a fast, closed-form solution and has been applied in a wide range of applications, obtaining better performance than other methods, such as backpropagation-trained neural networks and Support Vector Machines (SVMs). Variants of ELM were proposed to address problems such as underfitting, overfitting and outliers, but they still struggle with large datasets and/or when more neurons are required to extract more information. Thus, Stacked ELM (S-ELM) was proposed, which stacks ELM-trained modules, reusing information from each module in the next one, and improves results on large datasets; however, it has limitations regarding memory consumption and is not adequate for problems that involve a single output, such as some regression tasks. Another stacked method, the Deep Stacked Network (DSN), has problems with training time and memory usage, but without the application limitation of Stacked ELM. Therefore, this work proposes to combine the DSN architecture with the ELM and Kernel ELM algorithms to obtain a model composed of small modules, with fast training and reduced memory usage, yet capable of performance similar to that of more complex models. We also propose a variation of this model that handles data arriving gradually, a setting called incremental learning (or online learning, in the ELM context). Extensive experiments were conducted to evaluate the proposed methods on regression and classification tasks; for the online approach, only regression tasks were considered. The results show that the methods can train stacked architectures with performance statistically equivalent to that of an SLFN with a large number of neurons (or of other online methods) when compared on error or accuracy metrics. Regarding training time, the proposed methods were faster in many cases. Regarding memory usage, some of the proposed methods were statistically better, which favors their use in environments with restricted hardware. |
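For the incremental (online, in the ELM sense) variant the abstract mentions, the standard device in the ELM literature is a recursive least-squares update, as in OS-ELM: an inverse covariance matrix of the hidden activations is carried along and folded together with each new chunk of data, so earlier samples never need to be revisited. A minimal sketch under that assumption follows; the function names and the ridge term are illustrative, not the thesis's exact method.

```python
import numpy as np

def os_elm_init(H0, y0, reg=1e-3):
    """Initial batch: the usual closed-form ELM solve, but P (the inverse
    of the regularized activation covariance) is kept for later updates."""
    P = np.linalg.inv(H0.T @ H0 + reg * np.eye(H0.shape[1]))
    return P, P @ H0.T @ y0

def os_elm_update(P, beta, H, y):
    """Fold one new chunk of activations H and targets y into (P, beta)
    via recursive least squares, without touching any earlier data."""
    K = P @ H.T @ np.linalg.inv(np.eye(H.shape[0]) + H @ P @ H.T)
    P = P - K @ H @ P
    beta = beta + P @ H.T @ (y - H @ beta)
    return P, beta
```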
publishDate |
2022 |
dc.date.issued.fl_str_mv |
2022-03-18 |
dc.date.accessioned.fl_str_mv |
2024-05-30T00:53:25Z |
dc.date.available.fl_str_mv |
2024-05-30T00:53:25Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://repositorio.ufes.br/handle/10/15980 |
url |
http://repositorio.ufes.br/handle/10/15980 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
Text |
dc.publisher.none.fl_str_mv |
Universidade Federal do Espírito Santo; Doutorado em Engenharia Elétrica |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Engenharia Elétrica |
dc.publisher.initials.fl_str_mv |
UFES |
dc.publisher.country.fl_str_mv |
BR |
dc.publisher.department.fl_str_mv |
Centro Tecnológico |
publisher.none.fl_str_mv |
Universidade Federal do Espírito Santo; Doutorado em Engenharia Elétrica |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) instname:Universidade Federal do Espírito Santo (UFES) instacron:UFES |
instname_str |
Universidade Federal do Espírito Santo (UFES) |
instacron_str |
UFES |
institution |
UFES |
reponame_str |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) |
collection |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) |
bitstream.url.fl_str_mv |
http://repositorio.ufes.br/bitstreams/1d913fae-a8e5-4c17-81b4-f5f228de4e6f/download |
bitstream.checksum.fl_str_mv |
38dc15e2ee6884f6212315f164fa1c32 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 |
repository.name.fl_str_mv |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) - Universidade Federal do Espírito Santo (UFES) |
repository.mail.fl_str_mv |
|
_version_ |
1813022507541725184 |