OpenNPU: an open source platform for automatic neural network synthesis for FPGAs.

Bibliographic details
Main author: Muyal, Thomas Araújo
Publication date: 2023
Document type: Master's thesis
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da USP
Full text: https://www.teses.usp.br/teses/disponiveis/3/3142/tde-26042023-153703/
Abstract: Artificial neural networks are a family of artificial intelligence algorithms widely used in contemporary industry and research to solve a great variety of problems. Their recent popularity is partly due to advances in hardware technology, which provide the computational resources for the intensive processing their implementation requires. Historically, neural network software has been executed on Central Processing Units (CPUs), which are general-purpose devices. However, metrics such as inference speed, energy efficiency and circuit area improve with the use of specialized hardware accelerators. In the context of edge computing, the on-location processing of data gathered by Internet of Things (IoT) devices, these metrics are subject to significant restrictions. Among the devices frequently used for this purpose are Field-Programmable Gate Arrays (FPGAs): integrated circuits whose functionality can be programmed to synthesize hardware specialized for specific algorithms, offering high performance and execution efficiency, along with other benefits such as post-manufacturing reconfigurability, shorter design cycles, faster time-to-market and lower cost than the alternatives. One obstacle to the use of FPGAs is that programming them is difficult and requires specialist knowledge to fully exploit these devices' characteristics, which not every artificial intelligence development team possesses. Consequently, there is interest in systems that automate the synthesis of neural network hardware accelerators on FPGAs, bringing the benefits of this technique to a wider audience. Another approach to the highly constrained environment of edge computing is neural network quantization: reducing the precision with which parameters are represented so that they consume less memory and operations on them are faster.
This work surveys the state of the art of this class of software, diagnosing existing gaps. The main objective of this research is the creation and validation of a proof of concept for the automatic generation of neural network hardware accelerators that use parameter quantization techniques to enable their synthesis on small FPGAs aimed at edge computing. To validate the system, an accelerator was generated and its behavior was measured in terms of latency, throughput, energy efficiency and circuit area, and compared with the execution of the same neural network on a high-end personal computer CPU. The results indicate that the generated hardware accelerator is synthesizable, significantly faster, and consumes considerably less energy than the CPU implementation.
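The quantization idea in the abstract, reducing the precision of parameter representation so that weights take less memory and arithmetic is cheaper, can be sketched as symmetric int8 post-training quantization. This is an illustrative assumption for a single weight tensor; the thesis's actual OpenNPU quantization scheme may differ:

```python
import numpy as np

# Illustrative sketch only: symmetric per-tensor int8 quantization.
# float32 weights are mapped to int8 plus one scale factor,
# shrinking storage by 4x, as in the edge-computing motivation above.
def quantize_int8(weights: np.ndarray):
    scale = np.max(np.abs(weights)) / 127.0   # largest magnitude maps to 127
    q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)  # hypothetical layer weights
q, s = quantize_int8(w)

print(w.nbytes, q.nbytes)   # 16384 vs 4096 bytes: 4x smaller
err = np.max(np.abs(w - dequantize(q, s)))
print(err < s)              # reconstruction error stays below one quantization step
```

On an FPGA the int8 values would feed small integer multipliers instead of floating-point units, which is what makes synthesis feasible on the small edge-oriented devices the thesis targets.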
id USP_7485324ee2129f25e65c37108b30abde
oai_identifier_str oai:teses.usp.br:tde-26042023-153703
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str 2721
dc.title.none.fl_str_mv OpenNPU: an open source platform for automatic neural network synthesis for FPGAs.
OpenNPU: uma plataforma open source para síntese automática de redes neurais para FPGAs.
title OpenNPU: an open source platform for automatic neural network synthesis for FPGAs.
spellingShingle OpenNPU: an open source platform for automatic neural network synthesis for FPGAs.
Muyal, Thomas Araújo
Artificial intelligence
Circuitos integrados VLSI
Deep learning
Eletrônica digital
FPGA
Hardware accelerator
Inteligência artificial
Internet das coisas
Internet of things
Machine learning
Neural networks
Redes neurais
title_short OpenNPU: an open source platform for automatic neural network synthesis for FPGAs.
title_full OpenNPU: an open source platform for automatic neural network synthesis for FPGAs.
title_fullStr OpenNPU: an open source platform for automatic neural network synthesis for FPGAs.
title_full_unstemmed OpenNPU: an open source platform for automatic neural network synthesis for FPGAs.
title_sort OpenNPU: an open source platform for automatic neural network synthesis for FPGAs.
author Muyal, Thomas Araújo
author_facet Muyal, Thomas Araújo
author_role author
dc.contributor.none.fl_str_mv Zuffo, Marcelo Knorich
dc.contributor.author.fl_str_mv Muyal, Thomas Araújo
dc.subject.por.fl_str_mv Artificial intelligence
Circuitos integrados VLSI
Deep learning
Eletrônica digital
FPGA
Hardware accelerator
Inteligência artificial
Internet das coisas
Internet of things
Machine learning
Neural networks
Redes neurais
topic Artificial intelligence
Circuitos integrados VLSI
Deep learning
Eletrônica digital
FPGA
Hardware accelerator
Inteligência artificial
Internet das coisas
Internet of things
Machine learning
Neural networks
Redes neurais
description Artificial neural networks are a family of artificial intelligence algorithms widely used in contemporary industry and research to solve a great variety of problems. Their recent popularity is partly due to advances in hardware technology, which provide the computational resources for the intensive processing their implementation requires. Historically, neural network software has been executed on Central Processing Units (CPUs), which are general-purpose devices. However, metrics such as inference speed, energy efficiency and circuit area improve with the use of specialized hardware accelerators. In the context of edge computing, the on-location processing of data gathered by Internet of Things (IoT) devices, these metrics are subject to significant restrictions. Among the devices frequently used for this purpose are Field-Programmable Gate Arrays (FPGAs): integrated circuits whose functionality can be programmed to synthesize hardware specialized for specific algorithms, offering high performance and execution efficiency, along with other benefits such as post-manufacturing reconfigurability, shorter design cycles, faster time-to-market and lower cost than the alternatives. One obstacle to the use of FPGAs is that programming them is difficult and requires specialist knowledge to fully exploit these devices' characteristics, which not every artificial intelligence development team possesses. Consequently, there is interest in systems that automate the synthesis of neural network hardware accelerators on FPGAs, bringing the benefits of this technique to a wider audience. Another approach to the highly constrained environment of edge computing is neural network quantization: reducing the precision with which parameters are represented so that they consume less memory and operations on them are faster.
This work surveys the state of the art of this class of software, diagnosing existing gaps. The main objective of this research is the creation and validation of a proof of concept for the automatic generation of neural network hardware accelerators that use parameter quantization techniques to enable their synthesis on small FPGAs aimed at edge computing. To validate the system, an accelerator was generated and its behavior was measured in terms of latency, throughput, energy efficiency and circuit area, and compared with the execution of the same neural network on a high-end personal computer CPU. The results indicate that the generated hardware accelerator is synthesizable, significantly faster, and consumes considerably less energy than the CPU implementation.
publishDate 2023
dc.date.none.fl_str_mv 2023-02-16
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://www.teses.usp.br/teses/disponiveis/3/3142/tde-26042023-153703/
url https://www.teses.usp.br/teses/disponiveis/3/3142/tde-26042023-153703/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Release the content for public access.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1809091093111767040