Resilient training of neural network classifiers with approximate computing techniques for a hardware-optimized implementation

Vitor Angelo Maria Ferreira Torres

Bibliographic details
Main author:	Vitor Angelo Maria Ferreira Torres
Publication date:	2019
Document type:	Thesis
Language:	eng
Source title:	Repositório Institucional da UFMG
Full text:	http://hdl.handle.net/1843/32399
Abstract:	As Machine Learning applications drastically increase their demand for optimized implementations, both in embedded environments and in high-end parallel processing platforms, the industry and research community have been responding with different approaches to provide the required solutions. This work presents approximations to arithmetic operations and mathematical functions that, associated with adaptive Artificial Neural Networks training methods and an automatic precision adjustment mechanism, provide reliable and efficient implementations of classifiers, without depending on mixed operations with higher precision or complex rounding methods that are commonly proposed only with highly redundant datasets and large networks. This research investigates Approximate Computing concepts that simplify the design of classifier training accelerators based on hardware with Application Specific Integrated Circuits or Field Programmable Gate Arrays (FPGAs). The goal was not to find the optimal simplifications for each problem but to build a method, based on currently available technology, that can be used as reliably as one implemented with precise operations and standard training algorithms. Reducing the number of bits in the Floating Point (FP) format from 32 to 16 immediately halves the memory requirements and is a commonly used technique. By not using mixed precision and performing further simplifications to the smaller format, this thesis reduces the implementation complexity of the FP software emulation by 53%. Exponentiation and division by square root operations are also simplified, without requiring Look-Up Tables and with implicit interpolation. A preliminary migration of the design to an FPGA has confirmed that the area optimizations are also relevant in this environment, even when compared to other optimized implementations that lack the mechanism to adapt the FP representation range. A logical resource reduction of 64% is achieved when compared to mixed-precision approaches.

Item metadata

id	UFMG_1ac6dd9edad3aaf2102c68bce99bb16a
oai_identifier_str	oai:repositorio.ufmg.br:1843/32399
network_acronym_str	UFMG
network_name_str	Repositório Institucional da UFMG
repository_id_str
spelling	CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico | ORIGINAL: thesis-200109-pdfa.pdf (Doctoral thesis: PPGEE UFMG; application/pdf; 32073050) | CC-LICENSE: license_rdf (application/rdf+xml; charset=utf-8; 811) | LICENSE: license.txt (text/plain; charset=utf-8; 2119) | TEXT: thesis-200109-pdfa.pdf.txt (Extracted text; text/plain; 225382) | 2020-02-08 03:42:32.417 | Repositório de Publicações | https://repositorio.ufmg.br/oai | opendoar:2020-02-08T06:42:32
dc.title.pt_BR.fl_str_mv	Resilient training of neural network classifiers with approximate computing techniques for a hardware-optimized implementation
title	Resilient training of neural network classifiers with approximate computing techniques for a hardware-optimized implementation
spellingShingle	Resilient training of neural network classifiers with approximate computing techniques for a hardware-optimized implementation Vitor Angelo Maria Ferreira Torres Approximate computing Artificial neural networks Hardware implementation Engenharia elétrica Redes neurais (Computação) Computação aproximativa
title_short	Resilient training of neural network classifiers with approximate computing techniques for a hardware-optimized implementation
title_full	Resilient training of neural network classifiers with approximate computing techniques for a hardware-optimized implementation
title_fullStr	Resilient training of neural network classifiers with approximate computing techniques for a hardware-optimized implementation
title_full_unstemmed	Resilient training of neural network classifiers with approximate computing techniques for a hardware-optimized implementation
title_sort	Resilient training of neural network classifiers with approximate computing techniques for a hardware-optimized implementation
author	Vitor Angelo Maria Ferreira Torres
author_facet	Vitor Angelo Maria Ferreira Torres
author_role	author
dc.contributor.advisor1.fl_str_mv	Frank Sill Torres
dc.contributor.advisor1Lattes.fl_str_mv	http://lattes.cnpq.br/6435692548198017
dc.contributor.referee1.fl_str_mv	Antônio de Pádua Braga
dc.contributor.referee2.fl_str_mv	Cristiano Leite de Castro
dc.contributor.referee3.fl_str_mv	José Augusto Miranda Nacif
dc.contributor.referee4.fl_str_mv	Luiz Carlos Bambirra Torres
dc.contributor.authorLattes.fl_str_mv	http://lattes.cnpq.br/1801633347473015
dc.contributor.author.fl_str_mv	Vitor Angelo Maria Ferreira Torres
contributor_str_mv	Frank Sill Torres Antônio de Pádua Braga Cristiano Leite de Castro José Augusto Miranda Nacif Luiz Carlos Bambirra Torres
dc.subject.por.fl_str_mv	Approximate computing Artificial neural networks Hardware implementation
topic	Approximate computing Artificial neural networks Hardware implementation Engenharia elétrica Redes neurais (Computação) Computação aproximativa
dc.subject.other.pt_BR.fl_str_mv	Engenharia elétrica Redes neurais (Computação) Computação aproximativa
description	As Machine Learning applications drastically increase their demand for optimized implementations, both in embedded environments and in high-end parallel processing platforms, the industry and research community have been responding with different approaches to provide the required solutions. This work presents approximations to arithmetic operations and mathematical functions that, associated with adaptive Artificial Neural Networks training methods and an automatic precision adjustment mechanism, provide reliable and efficient implementations of classifiers, without depending on mixed operations with higher precision or complex rounding methods that are commonly proposed only with highly redundant datasets and large networks. This research investigates Approximate Computing concepts that simplify the design of classifier training accelerators based on hardware with Application Specific Integrated Circuits or Field Programmable Gate Arrays (FPGAs). The goal was not to find the optimal simplifications for each problem but to build a method, based on currently available technology, that can be used as reliably as one implemented with precise operations and standard training algorithms. Reducing the number of bits in the Floating Point (FP) format from 32 to 16 immediately halves the memory requirements and is a commonly used technique. By not using mixed precision and performing further simplifications to the smaller format, this thesis reduces the implementation complexity of the FP software emulation by 53%. Exponentiation and division by square root operations are also simplified, without requiring Look-Up Tables and with implicit interpolation. A preliminary migration of the design to an FPGA has confirmed that the area optimizations are also relevant in this environment, even when compared to other optimized implementations that lack the mechanism to adapt the FP representation range. A logical resource reduction of 64% is achieved when compared to mixed-precision approaches.
publishDate	2019
dc.date.issued.fl_str_mv	2019-12-16
dc.date.accessioned.fl_str_mv	2020-02-07T17:38:26Z
dc.date.available.fl_str_mv	2020-02-07T17:38:26Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/doctoralThesis
format	doctoralThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/1843/32399
url	http://hdl.handle.net/1843/32399
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	http://creativecommons.org/licenses/by-nc-nd/3.0/pt/ info:eu-repo/semantics/openAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by-nc-nd/3.0/pt/
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	Universidade Federal de Minas Gerais
dc.publisher.program.fl_str_mv	Programa de Pós-Graduação em Engenharia Elétrica
dc.publisher.initials.fl_str_mv	UFMG
dc.publisher.country.fl_str_mv	Brasil
dc.publisher.department.fl_str_mv	ENG - DEPARTAMENTO DE ENGENHARIA ELETRÔNICA
publisher.none.fl_str_mv	Universidade Federal de Minas Gerais
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFMG instname:Universidade Federal de Minas Gerais (UFMG) instacron:UFMG
instname_str	Universidade Federal de Minas Gerais (UFMG)
instacron_str	UFMG
institution	UFMG
reponame_str	Repositório Institucional da UFMG
collection	Repositório Institucional da UFMG
bitstream.url.fl_str_mv	https://repositorio.ufmg.br/bitstream/1843/32399/1/thesis-200109-pdfa.pdf https://repositorio.ufmg.br/bitstream/1843/32399/2/license_rdf https://repositorio.ufmg.br/bitstream/1843/32399/3/license.txt https://repositorio.ufmg.br/bitstream/1843/32399/4/thesis-200109-pdfa.pdf.txt
bitstream.checksum.fl_str_mv	4df2fad576f528ea573aa8aedf70caec cfd6801dba008cb6adbd9838b81582ab 34badce4be7e31e3adb4575ae96af679 dc29d612fbf98dad6e7bc43f80422cf1
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
repository.mail.fl_str_mv
_version_	1803589279468748800
