Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations

Diogo, João Paulo Cabete Gonçalves

Bibliographic details
Main author:	Diogo, João Paulo Cabete Gonçalves
Publication date:	2022
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10362/151152
Abstract:	With recent technological advancements, video has become a focal point of many everyday activities, from presenting ideas to our peers to studying specific events or simply storing relevant video clips. As a result, taking or making notes can become an invaluable tool in this process by helping us retain knowledge, document information, or simply reason about recorded content. This thesis introduces new features for a pre-existing web-based multimodal annotation tool, namely the integration of 3D components into the current system and pose estimation algorithms aimed at the moving elements in the multimedia content. The 3D developments will allow the user to experience a more immersive interaction with the tool by visualizing 3D objects against either a neutral or a 360° background and then using them as traditional annotations. Afterwards, mechanisms for integrating these 3D models into the currently loaded video will be explored, along with a detailed overview of the use of keypoints (pose estimation) to highlight details in this same setting. The goal of this thesis is thus the development and evaluation of these features, seeking to build a virtual environment in which a user can work on a video by combining different types of annotations.

Item metadata

id	RCAP_1adea7bfbb52de6fa7a0f2983b90c9ff
oai_identifier_str	oai:run.unl.pt:10362/151152
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
spelling	Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations | Video Annotation; Note-making; Virtual 3D Models; Pose Estimation; Multimodal Interfaces; HCI; Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática | Abstract (translated from Portuguese): Over the years, the use of video has become a fundamental part of many everyday activities, such as professional demonstrations and presentations, the close analysis of visual details, or simply preserving videos considered relevant. The use of annotations during these and similar processes is therefore highly important, as it can improve our understanding of the content in question and help us retain important characteristics or document pertinent information. This thesis aims to introduce new features for a multimodal annotation tool, namely the integration of 3D components into the current system and pose estimation algorithms for detecting moving elements in video. These features seek to give the user a more immersive experience by allowing, for example, objects to be previewed in a three-dimensional plane against neutral or even 360° backgrounds before they are used as traditional annotation elements. Mechanisms for efficiently integrating these 3D models into video will be explored, together with the use of keypoints (pose estimation) to accentuate details in this viewing environment. The goal of this thesis is thus the development and continued evaluation of these features, so as to enable their use in virtual environments alongside the different types of annotations that already exist. | Correia, Nuno | RUN | Diogo, João Paulo Cabete Gonçalves | 2023-03-24T11:39:05Z | 2022-12 | 2022-12-01T00:00:00Z | info:eu-repo/semantics/publishedVersion | info:eu-repo/semantics/masterThesis | application/pdf | http://hdl.handle.net/10362/151152 | eng | info:eu-repo/semantics/openAccess | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) | instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação | instacron:RCAAP | 2024-03-11T05:33:37Z | oai:run.unl.pt:10362/151152 | Portal Agregador | ONG | https://www.rcaap.pt/oai/openaire | opendoar:7160 | 2024-03-20T03:54:29.340266 | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação | false
dc.title.none.fl_str_mv	Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
title	Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
spellingShingle	Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations Diogo, João Paulo Cabete Gonçalves Video Annotation Note-making Virtual 3D Models Pose Estimation Multimodal Interfaces HCI Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
title_short	Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
title_full	Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
title_fullStr	Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
title_full_unstemmed	Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
title_sort	Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations
author	Diogo, João Paulo Cabete Gonçalves
author_facet	Diogo, João Paulo Cabete Gonçalves
author_role	author
dc.contributor.none.fl_str_mv	Correia, Nuno; RUN
dc.contributor.author.fl_str_mv	Diogo, João Paulo Cabete Gonçalves
dc.subject.por.fl_str_mv	Video Annotation; Note-making; Virtual 3D Models; Pose Estimation; Multimodal Interfaces; HCI; Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
topic	Video Annotation; Note-making; Virtual 3D Models; Pose Estimation; Multimodal Interfaces; HCI; Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
description	With recent technological advancements, video has become a focal point of many everyday activities, from presenting ideas to our peers to studying specific events or simply storing relevant video clips. As a result, taking or making notes can become an invaluable tool in this process by helping us retain knowledge, document information, or simply reason about recorded content. This thesis introduces new features for a pre-existing web-based multimodal annotation tool, namely the integration of 3D components into the current system and pose estimation algorithms aimed at the moving elements in the multimedia content. The 3D developments will allow the user to experience a more immersive interaction with the tool by visualizing 3D objects against either a neutral or a 360° background and then using them as traditional annotations. Afterwards, mechanisms for integrating these 3D models into the currently loaded video will be explored, along with a detailed overview of the use of keypoints (pose estimation) to highlight details in this same setting. The goal of this thesis is thus the development and evaluation of these features, seeking to build a virtual environment in which a user can work on a video by combining different types of annotations.
publishDate	2022
dc.date.none.fl_str_mv	2022-12 2022-12-01T00:00:00Z 2023-03-24T11:39:05Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10362/151152
url	http://hdl.handle.net/10362/151152
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799138133726461952
