Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations

Detalhes bibliográficos
Autor(a) principal: Diogo, João Paulo Cabete Gonçalves
Data de Publicação: 2022
Tipo de documento: Dissertação
Idioma: eng
Título da fonte: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Texto Completo: http://hdl.handle.net/10362/151152
Resumo: With the recent technological advancements, using video has become a focal point on many ubiquitous activities, from presenting ideas to our peers to studying specific events or even simply storing relevant video clips. As a result, taking or making notes can become an invaluable tool in this process by helping us to retain knowledge, document information, or simply reason about recorded contents. This thesis introduces new features for a pre-existing Web-Based multimodal anno- tation tool, namely the integration of 3D components in the current system and pose estimation algorithms aimed at the moving elements in the multimedia content. There- fore, the 3D developments will allow the user to experience a more immersive interaction with the tool by being able to visualize 3D objects either in a neutral or 360º background to then use them as traditional annotations. Afterwards, mechanisms for successfully integrating these 3D models on the currently loaded video will be explored, along with a detailed overview of the use of keypoints (pose estimation) to highlight details in this same setting. The goal of this thesis will thus be the development and evaluation of these features seeking the construction of a virtual environment in which a user can successfully work on a video by combining different types of annotations.
id RCAP_1adea7bfbb52de6fa7a0f2983b90c9ff
oai_identifier_str oai:run.unl.pt:10362/151152
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Integrating 3D Objects and Pose Estimation for Multimodal Video AnnotationsVideo AnnotationNote-makingVirtual 3D ModelsPose EstimationMultimodal InterfacesHCIDomínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e InformáticaWith the recent technological advancements, using video has become a focal point on many ubiquitous activities, from presenting ideas to our peers to studying specific events or even simply storing relevant video clips. As a result, taking or making notes can become an invaluable tool in this process by helping us to retain knowledge, document information, or simply reason about recorded contents. This thesis introduces new features for a pre-existing Web-Based multimodal anno- tation tool, namely the integration of 3D components in the current system and pose estimation algorithms aimed at the moving elements in the multimedia content. There- fore, the 3D developments will allow the user to experience a more immersive interaction with the tool by being able to visualize 3D objects either in a neutral or 360º background to then use them as traditional annotations. Afterwards, mechanisms for successfully integrating these 3D models on the currently loaded video will be explored, along with a detailed overview of the use of keypoints (pose estimation) to highlight details in this same setting. The goal of this thesis will thus be the development and evaluation of these features seeking the construction of a virtual environment in which a user can successfully work on a video by combining different types of annotations.Ao longo dos anos, a utilização de video tornou-se um aspecto fundamental em várias das atividades realizadas no quotidiano como seja em demonstrações e apresentações profissionais, para a análise minuciosa de detalhes visuais ou até simplesmente para preservar videos considerados relevantes. Deste modo, o uso de anotações no decorrer destes processos e semelhantes, constitui um fator de elevada importância ao melhorar potencialmente a nossa compreensão relativa aos conteúdos em causa e também a ajudar a reter características importantes ou a documentar informação pertinente. Efetivamente, nesta tese pretende-se introduzir novas funcionalidades para uma fer- ramenta de anotação multimodal, nomeadamente, a integração de componentes 3D no sistema atual e algorítmos de Pose Estimation com vista à deteção de elementos em mo- vimento em video. Assim, com estas features procura-se proporcionar um experiência mais imersiva ao utilizador ao permitir, por exemplo, a visualização preliminar de objec- tos num plano tridimensional em fundos neutros ou até 360º antes de os utilizar como elementos de anotação tradicionais. Com efeito, serão explorados mecanismos para a integração eficiente destes modelos 3D em video juntamente com o uso de keypoints (pose estimation) permitindo acentuar pormenores neste ambiente de visualização. O objetivo desta tese será, assim, o desenvol- vimento e avaliação continuada destas funcionalidades de modo a potenciar o seu uso em ambientes virtuais em simultaneo com as diferentes tipos de anotações já existentes.Correia, NunoRUNDiogo, João Paulo Cabete Gonçalves2023-03-24T11:39:05Z2022-122022-12-01T00:00:00Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/masterThesisapplication/pdfhttp://hdl.handle.net/10362/151152enginfo:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2024-03-11T05:33:37Zoai:run.unl.pt:10362/151152Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-20T03:54:29.340266Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse
dc.title.none.fl_str_mv Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
title Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
spellingShingle Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
Diogo, João Paulo Cabete Gonçalves
Video Annotation
Note-making
Virtual 3D Models
Pose Estimation
Multimodal Interfaces
HCI
Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
title_short Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
title_full Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
title_fullStr Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
title_full_unstemmed Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
title_sort Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
author Diogo, João Paulo Cabete Gonçalves
author_facet Diogo, João Paulo Cabete Gonçalves
author_role author
dc.contributor.none.fl_str_mv Correia, Nuno
RUN
dc.contributor.author.fl_str_mv Diogo, João Paulo Cabete Gonçalves
dc.subject.por.fl_str_mv Video Annotation
Note-making
Virtual 3D Models
Pose Estimation
Multimodal Interfaces
HCI
Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
topic Video Annotation
Note-making
Virtual 3D Models
Pose Estimation
Multimodal Interfaces
HCI
Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
description With the recent technological advancements, using video has become a focal point on many ubiquitous activities, from presenting ideas to our peers to studying specific events or even simply storing relevant video clips. As a result, taking or making notes can become an invaluable tool in this process by helping us to retain knowledge, document information, or simply reason about recorded contents. This thesis introduces new features for a pre-existing Web-Based multimodal anno- tation tool, namely the integration of 3D components in the current system and pose estimation algorithms aimed at the moving elements in the multimedia content. There- fore, the 3D developments will allow the user to experience a more immersive interaction with the tool by being able to visualize 3D objects either in a neutral or 360º background to then use them as traditional annotations. Afterwards, mechanisms for successfully integrating these 3D models on the currently loaded video will be explored, along with a detailed overview of the use of keypoints (pose estimation) to highlight details in this same setting. The goal of this thesis will thus be the development and evaluation of these features seeking the construction of a virtual environment in which a user can successfully work on a video by combining different types of annotations.
publishDate 2022
dc.date.none.fl_str_mv 2022-12
2022-12-01T00:00:00Z
2023-03-24T11:39:05Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/151152
url http://hdl.handle.net/10362/151152
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799138133726461952