A configurable architecture for running hybrid convolutional neural networks in low-density FPGAs

Bibliographic details
Main author: Véstias, Mário
Publication date: 2020
Other authors: Duarte, Rui; De Sousa, Jose; Cláudio de Campos Neto, Horácio
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10400.21/12726
Abstract: Convolutional neural networks (CNNs) have become the state of the art in machine learning for a vast set of applications, especially image classification and object detection. Running inference on these models at the edge has several advantages, including real-time performance and data privacy. However, the high computing and memory requirements of CNNs have been major obstacles to their broader deployment on edge devices. Data quantization is an optimization method that reduces the number of bits used to represent the weights and activations of a network model, reducing storage requirements and computing complexity. Quantization can be applied at the layer level, using different bit widths in different layers: this is called hybrid quantization. This article proposes a new efficient and configurable architecture for running CNNs with hybrid quantization in low-density Field-Programmable Gate Arrays (FPGAs) targeting edge devices. The architecture has been implemented on the Xilinx ZYNQ7020 and ZYNQ7045 devices, running the AlexNet and VGG16 networks. Running AlexNet, the architecture achieves a throughput of up to 508 images per second on the ZYNQ7020 device and 1639 images per second on the ZYNQ7045 device. For VGG16, the architecture delivers up to 43 images per second on the ZYNQ7020 device and 81 images per second on the ZYNQ7045 device. The proposed hybrid architecture achieves up to a 13.7x improvement in performance compared to state-of-the-art solutions, with small accuracy degradation.
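Illustrative note: the idea of hybrid quantization described in the abstract (a different bit width per layer) can be sketched as below. This is a minimal software sketch under stated assumptions, not the article's hardware architecture: the symmetric uniform quantizer, the layer names (`conv1`, `fc1`), the weight values, and the chosen bit widths are all hypothetical and picked only for illustration.

```python
# Illustrative only: symmetric uniform quantization with a per-layer bit width.
# Layer names, weights, and bit widths are made-up values, not from the article.

def quantize(values, bits):
    """Map floats to signed integer codes of the given bit width (one scale per layer)."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed representation")
    qmax = 2 ** (bits - 1) - 1              # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp into the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


def storage_bits(layers, bit_widths):
    """Total number of bits needed to store all weights at the given per-layer widths."""
    return sum(len(weights) * bit_widths[name] for name, weights in layers.items())


# Hypothetical two-layer model with a handful of weights per layer.
layers = {
    "conv1": [0.50, -0.25, 0.125, -1.0],
    "fc1": [0.9, -0.3, 0.6, 0.0, -0.9, 0.3],
}

# Hybrid configuration: a wider representation for one layer, a narrower one for the other.
hybrid_bits = {"conv1": 8, "fc1": 4}
uniform_bits = {"conv1": 8, "fc1": 8}

quantized = {name: quantize(w, hybrid_bits[name]) for name, w in layers.items()}

for name, (codes, scale) in quantized.items():
    print(name, hybrid_bits[name], "bits:", codes)

print("storage, hybrid :", storage_bits(layers, hybrid_bits), "bits")
print("storage, uniform:", storage_bits(layers, uniform_bits), "bits")
```

In this toy setting the narrower layer trades precision for storage; which layers tolerate fewer bits in a real network has to be determined experimentally, as the abstract's mention of accuracy degradation suggests.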
id RCAP_73f711169aca9c32a09c04786e1519a6
oai_identifier_str oai:repositorio.ipl.pt:10400.21/12726
network_acronym_str RCAP
repository_id_str 7160
dc.contributor.none.fl_str_mv RCIPL
Keywords: Convolutional neural network; Deep learning; Embedded computing; Field-programmable gate array; Hybrid quantization
publishDate 2020
dc.date.none.fl_str_mv 2020-06-08
2020-06-08T00:00:00Z
2021-01-28T16:48:41Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
Citation: VÉSTIAS, Mário P.; [et al] – A configurable architecture for running hybrid convolutional neural networks in low-density FPGAs. IEEE Access. ISSN 2169-3536. Vol. 8 (2020), pp. 107229-107243
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3000444
Rights: info:eu-repo/semantics/openAccess
Format: application/pdf
Publisher: IEEE
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP