ArchLearn : Implementação de acelerador em hardware baseado em FPGA para redes neurais artificiais

Santana, Gabriel Dias de

Bibliographic details
Main author:	Santana, Gabriel Dias de
Publication date:	2020
Document type:	Bachelor's thesis
Language:	por
Source title:	Repositório Institucional da UFS
Full text:	http://ri.ufs.br/jspui/handle/riufs/13507
Abstract:	The need to run real-time, latency-sensitive systems has led to edge computing, a paradigm that contrasts with the dominant cloud computing model. This data distribution model moves data processing close to the edge of the network, which motivates the development of platforms with low-power processors located near sensors and mobile applications. In this context, the growing use of Machine Learning algorithms and the possibility of running part of these models locally raise a range of questions about how to execute them efficiently. This work implemented the convolution operation on an FPGA platform in order to reduce the execution time of a CNN (Convolutional Neural Network) on a microcontroller. The proposed design is specialized to execute three convolutional layers of a CNN trained to classify images from the CIFAR-10 dataset into its 10 classes. For comparison, two approaches were tested on an ARM Cortex-M4 processor: one without any software optimization and one optimized with SIMD (Single Instruction, Multiple Data) instructions through the CMSIS-NN library. As a result, the execution time of the convolutional layers achieved in this work was up to 25% lower than the fastest time obtained using only the processor.
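
For readers unfamiliar with the operation named in the abstract, the sketch below shows what a convolutional layer computes. It is a generic, unoptimized reference in Python, not the thesis's FPGA design and not the CMSIS-NN code. The function name `conv2d`, the nested-list data layout, the absence of padding and bias, and the use of plain integer arithmetic are all assumptions made for illustration.

```python
from typing import List

# Assumed layouts (illustrative only, not taken from the thesis):
#   image:   [in_channel][row][col]
#   kernels: [out_channel][in_channel][k_row][k_col], square kernels
Plane = List[List[int]]


def conv2d(image: List[Plane], kernels: List[List[Plane]], stride: int = 1) -> List[Plane]:
    """Direct convolution as used in CNNs (no kernel flip, no padding, no bias)."""
    in_channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    k = len(kernels[0][0])  # kernel side length
    out_h = (height - k) // stride + 1
    out_w = (width - k) // stride + 1

    output: List[Plane] = []
    for kernel in kernels:  # one output plane per kernel
        plane = [[0] * out_w for _ in range(out_h)]
        for y in range(out_h):
            for x in range(out_w):
                acc = 0
                # Multiply-accumulate over every input channel and kernel tap.
                for c in range(in_channels):
                    for i in range(k):
                        for j in range(k):
                            acc += image[c][y * stride + i][x * stride + j] * kernel[c][i][j]
                plane[y][x] = acc
        output.append(plane)
    return output


if __name__ == "__main__":
    image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]  # 1 channel, 3x3
    kernels = [[[[1, 0], [0, 1]]]]               # 1 kernel, 1 channel, 2x2
    print(conv2d(image, kernels))                # [[[6, 8], [12, 14]]]
```

The inner multiply-accumulate loops are independent across output positions, which is the kind of work a hardware accelerator can run in parallel. How ArchLearn organizes that computation is not stated in this record.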

Item metadata

id	UFS-2_30968f2df1d992cd56b1491ae839dd32
oai_identifier_str	oai:ufs.br:riufs/13507
network_acronym_str	UFS-2
network_name_str	Repositório Institucional da UFS
repository_id_str
dc.title.pt_BR.fl_str_mv	ArchLearn : Implementação de acelerador em hardware baseado em FPGA para redes neurais artificiais
title	ArchLearn : Implementação de acelerador em hardware baseado em FPGA para redes neurais artificiais
title_short	ArchLearn : Implementação de acelerador em hardware baseado em FPGA para redes neurais artificiais
title_full	ArchLearn : Implementação de acelerador em hardware baseado em FPGA para redes neurais artificiais
title_fullStr	ArchLearn : Implementação de acelerador em hardware baseado em FPGA para redes neurais artificiais
title_full_unstemmed	ArchLearn : Implementação de acelerador em hardware baseado em FPGA para redes neurais artificiais
title_sort	ArchLearn : Implementação de acelerador em hardware baseado em FPGA para redes neurais artificiais
author	Santana, Gabriel Dias de
author_facet	Santana, Gabriel Dias de
author_role	author
dc.contributor.author.fl_str_mv	Santana, Gabriel Dias de
dc.contributor.advisor1.fl_str_mv	Prado, Bruno Otávio Piedade
contributor_str_mv	Prado, Bruno Otávio Piedade
dc.subject.por.fl_str_mv	Ciência da computação; Ensino de engenharia de computação; Engenharia da computação; FPGA; Aceleração de hardware; Redes neurais artificiais; Hardware
topic	Ciência da computação; Ensino de engenharia de computação; Engenharia da computação; FPGA; Aceleração de hardware; Redes neurais artificiais; Hardware; Hardware acceleration; Artificial neural networks; CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO
dc.subject.eng.fl_str_mv	Hardware acceleration; Artificial neural networks
dc.subject.cnpq.fl_str_mv	CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO
description	The need to run real-time, latency-sensitive systems has led to edge computing, a paradigm that contrasts with the dominant cloud computing model. This data distribution model moves data processing close to the edge of the network, which motivates the development of platforms with low-power processors located near sensors and mobile applications. In this context, the growing use of Machine Learning algorithms and the possibility of running part of these models locally raise a range of questions about how to execute them efficiently. This work implemented the convolution operation on an FPGA platform in order to reduce the execution time of a CNN (Convolutional Neural Network) on a microcontroller. The proposed design is specialized to execute three convolutional layers of a CNN trained to classify images from the CIFAR-10 dataset into its 10 classes. For comparison, two approaches were tested on an ARM Cortex-M4 processor: one without any software optimization and one optimized with SIMD (Single Instruction, Multiple Data) instructions through the CMSIS-NN library. As a result, the execution time of the convolutional layers achieved in this work was up to 25% lower than the fastest time obtained using only the processor.
publishDate	2020
dc.date.accessioned.fl_str_mv	2020-06-27T01:37:24Z
dc.date.available.fl_str_mv	2020-06-27T01:37:24Z
dc.date.issued.fl_str_mv	2020-05-29
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/bachelorThesis
format	bachelorThesis
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	Santana, Gabriel Dias de. ArchLearn : implementação de acelerador em hardware baseado em FPGA para redes neurais artificiais. São Cristóvão, 2020. Monografia (graduação em Engenharia da Computação) – Departamento de Computação, Centro de Ciências Exatas e Tecnologia, Universidade Federal de Sergipe, São Cristóvão, SE, 2020
dc.identifier.uri.fl_str_mv	http://ri.ufs.br/jspui/handle/riufs/13507
identifier_str_mv	Santana, Gabriel Dias de. ArchLearn : implementação de acelerador em hardware baseado em FPGA para redes neurais artificiais. São Cristóvão, 2020. Monografia (graduação em Engenharia da Computação) – Departamento de Computação, Centro de Ciências Exatas e Tecnologia, Universidade Federal de Sergipe, São Cristóvão, SE, 2020
url	http://ri.ufs.br/jspui/handle/riufs/13507
dc.language.iso.fl_str_mv	por
language	por
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.initials.fl_str_mv	Universidade Federal de Sergipe
dc.publisher.department.fl_str_mv	DCOMP - Departamento de Computação – Engenharia de Computação – São Cristóvão - Presencial
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFS instname:Universidade Federal de Sergipe (UFS) instacron:UFS
instname_str	Universidade Federal de Sergipe (UFS)
instacron_str	UFS
institution	UFS
reponame_str	Repositório Institucional da UFS
collection	Repositório Institucional da UFS
bitstream.url.fl_str_mv	https://ri.ufs.br/jspui/bitstream/riufs/13507/1/license.txt https://ri.ufs.br/jspui/bitstream/riufs/13507/2/Gabriel_Dias_Santana.pdf https://ri.ufs.br/jspui/bitstream/riufs/13507/3/Gabriel_Dias_Santana.pdf.txt https://ri.ufs.br/jspui/bitstream/riufs/13507/4/Gabriel_Dias_Santana.pdf.jpg
bitstream.checksum.fl_str_mv	098cbbf65c2c15e1fb2e49c5d306a44c d04bd0935ad5f7298749c25013628be7 45c4956afabda0f25015bb25cdc769eb c2c66c36e28d4d60e7d3b2ca37561839
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Repositório Institucional da UFS - Universidade Federal de Sergipe (UFS)
repository.mail.fl_str_mv	repositorio@academico.ufs.br
_version_	1802110659106701312
