Object Detection in Omnidirectional Images

Detalhes bibliográficos
Autor(a) principal: Henriques, Francisco António Agostinho
Data de Publicação: 2021
Tipo de documento: Dissertação
Idioma: eng
Título da fonte: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Texto Completo: http://hdl.handle.net/10400.8/5591
Resumo: Nowadays, computer vision (CV) is widely used to solve real-world problems, which pose increasingly higher challenges. In this context, the use of omnidirectional video in a growing number of applications, along with the fast development of Deep Learning (DL) algorithms for object detection, drives the need for further research to improve existing methods originally developed for conventional 2D planar images. However, the geometric distortion that common sphere-to-plane projections produce, mostly visible in objects near the poles, in addition to the lack of omnidirectional open-source labeled image datasets has made an accurate spherical image-based object detection algorithm a hard goal to achieve. This work is a contribution to develop datasets and machine learning models particularly suited for omnidirectional images, represented in planar format through the well-known Equirectangular Projection (ERP). To this aim, DL methods are explored to improve the detection of visual objects in omnidirectional images, by considering the inherent distortions of ERP. An experimental study was, firstly, carried out to find out whether the error rate and type of detection errors were related to the characteristics of ERP images. Such study revealed that the error rate of object detection using existing DL models with ERP images, actually, depends on the object spherical location in the image. Then, based on such findings, a new object detection framework is proposed to obtain a uniform error rate across the whole spherical image regions. The results show that the pre and post-processing stages of the implemented framework effectively contribute to reducing the performance dependency on the image region, evaluated by the above-mentioned metric.
id RCAP_0a8508e833376817ab58431f08c0448f
oai_identifier_str oai:iconline.ipleiria.pt:10400.8/5591
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Object Detection in Omnidirectional ImagesComputer VisionDeep LearningObject DetectionEquirectangular ProjectionOmnidirectional imagesDomínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e InformáticaNowadays, computer vision (CV) is widely used to solve real-world problems, which pose increasingly higher challenges. In this context, the use of omnidirectional video in a growing number of applications, along with the fast development of Deep Learning (DL) algorithms for object detection, drives the need for further research to improve existing methods originally developed for conventional 2D planar images. However, the geometric distortion that common sphere-to-plane projections produce, mostly visible in objects near the poles, in addition to the lack of omnidirectional open-source labeled image datasets has made an accurate spherical image-based object detection algorithm a hard goal to achieve. This work is a contribution to develop datasets and machine learning models particularly suited for omnidirectional images, represented in planar format through the well-known Equirectangular Projection (ERP). To this aim, DL methods are explored to improve the detection of visual objects in omnidirectional images, by considering the inherent distortions of ERP. An experimental study was, firstly, carried out to find out whether the error rate and type of detection errors were related to the characteristics of ERP images. Such study revealed that the error rate of object detection using existing DL models with ERP images, actually, depends on the object spherical location in the image. Then, based on such findings, a new object detection framework is proposed to obtain a uniform error rate across the whole spherical image regions. The results show that the pre and post-processing stages of the implemented framework effectively contribute to reducing the performance dependency on the image region, evaluated by the above-mentioned metric.Costa, Joana Madeira MartinsAssunção, Pedro António Amado deSilva, Catarina Helena Branco Simões daIC-OnlineHenriques, Francisco António Agostinho2021-03-31T09:40:50Z2021-01-262021-01-26T00:00:00Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/masterThesisapplication/pdfhttp://hdl.handle.net/10400.8/5591TID:202689514enginfo:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2024-01-17T15:51:26Zoai:iconline.ipleiria.pt:10400.8/5591Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-20T01:49:03.658213Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse
dc.title.none.fl_str_mv Object Detection in Omnidirectional Images
title Object Detection in Omnidirectional Images
spellingShingle Object Detection in Omnidirectional Images
Henriques, Francisco António Agostinho
Computer Vision
Deep Learning
Object Detection
Equirectangular Projection
Omnidirectional images
Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
title_short Object Detection in Omnidirectional Images
title_full Object Detection in Omnidirectional Images
title_fullStr Object Detection in Omnidirectional Images
title_full_unstemmed Object Detection in Omnidirectional Images
title_sort Object Detection in Omnidirectional Images
author Henriques, Francisco António Agostinho
author_facet Henriques, Francisco António Agostinho
author_role author
dc.contributor.none.fl_str_mv Costa, Joana Madeira Martins
Assunção, Pedro António Amado de
Silva, Catarina Helena Branco Simões da
IC-Online
dc.contributor.author.fl_str_mv Henriques, Francisco António Agostinho
dc.subject.por.fl_str_mv Computer Vision
Deep Learning
Object Detection
Equirectangular Projection
Omnidirectional images
Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
topic Computer Vision
Deep Learning
Object Detection
Equirectangular Projection
Omnidirectional images
Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
description Nowadays, computer vision (CV) is widely used to solve real-world problems, which pose increasingly higher challenges. In this context, the use of omnidirectional video in a growing number of applications, along with the fast development of Deep Learning (DL) algorithms for object detection, drives the need for further research to improve existing methods originally developed for conventional 2D planar images. However, the geometric distortion that common sphere-to-plane projections produce, mostly visible in objects near the poles, in addition to the lack of omnidirectional open-source labeled image datasets has made an accurate spherical image-based object detection algorithm a hard goal to achieve. This work is a contribution to develop datasets and machine learning models particularly suited for omnidirectional images, represented in planar format through the well-known Equirectangular Projection (ERP). To this aim, DL methods are explored to improve the detection of visual objects in omnidirectional images, by considering the inherent distortions of ERP. An experimental study was, firstly, carried out to find out whether the error rate and type of detection errors were related to the characteristics of ERP images. Such study revealed that the error rate of object detection using existing DL models with ERP images, actually, depends on the object spherical location in the image. Then, based on such findings, a new object detection framework is proposed to obtain a uniform error rate across the whole spherical image regions. The results show that the pre and post-processing stages of the implemented framework effectively contribute to reducing the performance dependency on the image region, evaluated by the above-mentioned metric.
publishDate 2021
dc.date.none.fl_str_mv 2021-03-31T09:40:50Z
2021-01-26
2021-01-26T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.8/5591
TID:202689514
url http://hdl.handle.net/10400.8/5591
identifier_str_mv TID:202689514
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799136983600070656