A full featured configurable accelerator for object detection with YOLO

Bibliographic details
Main author: Pestana, Daniel
Publication date: 2021
Other authors: Miranda, Pedro R.; Lopes, João D.; Duarte, Rui; Véstias, Mário; Neto, Horácio C; De Sousa, Jose
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: http://hdl.handle.net/10400.21/13689
Abstract: Object detection and classification is an essential task of computer vision. A very efficient algorithm for detection and classification is YOLO (You Only Look Once). We consider hardware architectures to run YOLO in real time on embedded platforms. Designing a new dedicated accelerator for each new version of YOLO is not feasible given how quickly new versions are released. This work's primary goal is to design a configurable and scalable core for creating specific object detection and classification systems based on YOLO, targeting embedded platforms. The core accelerates the execution of all the algorithm steps, including pre-processing, model inference and post-processing. It considers a fixed-point format, linearised activation functions, batch-normalisation, folding, and a hardware structure that exploits most of the available parallelism in CNN processing. The proposed core is configured for real-time execution of YOLOv3-Tiny and YOLOv4-Tiny, integrated into a RISC-V-based system-on-chip architecture and prototyped on an UltraScale XCKU040 FPGA (Field Programmable Gate Array). The solution achieves a performance of 32 and 31 frames per second for YOLOv3-Tiny and YOLOv4-Tiny, respectively, with a 16-bit fixed-point format. Compared to previous proposals, it improves the frame rate at a higher performance efficiency. The performance, area efficiency and configurability of the proposed core enable the fast development of real-time YOLO-based object detectors on embedded systems.
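The abstract names a 16-bit fixed-point format and batch-normalisation folding but does not give the arithmetic. The following is a minimal sketch of how these two steps are commonly done, not the authors' implementation. It assumes the standard identity for folding per-channel batch-norm parameters into the preceding convolution, and a hypothetical split of 8 fractional bits within the 16-bit word (the record does not state the split). The function names `fold_batch_norm`, `to_fixed` and `from_fixed` are illustrative only.

```python
import math


def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-output-channel batch-norm parameters into conv weights and bias.

    weights is a list with one flat list of weights per output channel;
    the remaining arguments hold one value per output channel.
    After folding, conv(x) with the new parameters equals bn(conv(x)) with the old ones.
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)          # per-channel batch-norm scale
        folded_w.append([w * scale for w in w_c])
        folded_b.append((b_c - m) * scale + bt)
    return folded_w, folded_b


def to_fixed(x, frac_bits=8, total_bits=16):
    """Quantise a real value to a signed fixed-point integer, saturating on overflow.

    frac_bits=8 is an assumption made for illustration.
    """
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = int(round(x * (1 << frac_bits)))        # Python rounds halves to even
    return max(lo, min(hi, q))


def from_fixed(q, frac_bits=8):
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)
```

Folding is done once, offline, so the hardware only needs a multiply-accumulate with the folded weights and bias; the folded values would then be passed through `to_fixed` before being loaded into the accelerator.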
id RCAP_052dbc0f962f513820d106275121653f
oai_identifier_str oai:repositorio.ipl.pt:10400.21/13689
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str 7160
dc.title.none.fl_str_mv A full featured configurable accelerator for object detection with YOLO
author Pestana, Daniel
dc.contributor.none.fl_str_mv RCIPL
dc.contributor.author.fl_str_mv Pestana, Daniel
Miranda, Pedro R.
Lopes, João D.
Duarte, Rui
Véstias, Mário
Neto, Horácio C
De Sousa, Jose
dc.subject.por.fl_str_mv Object detection
Convolutional neural network
FPGA
Lightweight YOLO
publishDate 2021
dc.date.none.fl_str_mv 2021-09-07T09:41:09Z
2021-05-19
2021-05-19T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.21/13689
url http://hdl.handle.net/10400.21/13689
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv PESTANA, Daniel; [et al] – A full featured configurable accelerator for object detection with YOLO. IEEE Access. ISSN 2169-3536. Vol. 9 (2021), pp. 75864-75877
2169-3536
10.1109/ACCESS.2021.3081818
dc.rights.driver.fl_str_mv metadata only access
info:eu-repo/semantics/openAccess
rights_invalid_str_mv metadata only access
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv IEEE
publisher.none.fl_str_mv IEEE
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP