GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL
Main author: | de Almeida, Lucas Barbosa [UNESP] |
---|---|
Publication date: | 2022 |
Other authors: | Valem, Lucas Pascotti [UNESP] , Pedronette, Daniel Carlos Guimarães [UNESP] |
Document type: | Conference paper |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1109/ICIP46576.2022.9897911 http://hdl.handle.net/11449/248246 |
Abstract: | Despite the impressive advances obtained by supervised deep learning approaches on retrieval and classification tasks, how to acquire labeled data for training remains a challenging bottleneck. In this scenario, the need for developing more effective content-based retrieval approaches capable of taking advantage of multimodal information and advances in unsupervised learning becomes imperative. Based on such observations, we propose two novel approaches that combine Graph Convolutional Networks (GCNs) with rank-based manifold learning methods. The GCN models were trained in an unsupervised way, using the Deep Graph Infomax algorithm, and the proposed approaches employ recent rank-based manifold learning methods. Multimodal information is exploited through pre-trained CNNs via transfer learning for extracting audio, image, and video features. The proposed approaches were evaluated on three public action recognition datasets. Highly effective results were obtained, reaching relative gains of up to +29.44% in MAP compared to baseline approaches without GCNs. The experimental evaluation also considered classical and recent baselines in the literature. |
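
The abstract reports its headline result as a relative gain in MAP (mean average precision). As a minimal sketch of how such a figure is typically derived, the Python below computes average precision per query, averages it into MAP, and expresses the difference between two MAP values as a percentage of the baseline. This is not the authors' evaluation code: the function names are hypothetical, the input values in the usage note are made up for illustration, and the sketch assumes every relevant item for a query appears somewhere in its ranked list.

```python
from typing import Sequence


def average_precision(ranked_labels: Sequence[int]) -> float:
    """AP for one query; ranked_labels[i] is 1 if the item at rank i+1 is relevant.

    Assumption: all relevant items appear in the ranked list, so the
    number of hits equals the total number of relevant items.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_ranked: Sequence[Sequence[int]]) -> float:
    """MAP: the mean of the per-query AP values."""
    return sum(average_precision(r) for r in all_ranked) / len(all_ranked)


def relative_gain(baseline_map: float, proposed_map: float) -> float:
    """Relative gain of the proposed MAP over the baseline MAP, in percent."""
    return 100.0 * (proposed_map - baseline_map) / baseline_map
```

With illustrative values only, `relative_gain(0.5, 0.6)` is approximately 20, meaning a rise from 0.5 to 0.6 in MAP is a relative gain of about +20%. The same arithmetic applies to the comparison against baselines without GCNs described in the abstract.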
id | UNSP_11abca217d7dd95523d8baae420ee56e |
---|---|
oai_identifier_str | oai:repositorio.unesp.br:11449/248246 |
network_acronym_str | UNSP |
network_name_str | Repositório Institucional da UNESP |
repository_id_str | 2946 |
spelling | GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL; graph convolutional networks; manifold learning; rank aggregation; video multimodal retrieval; Despite the impressive advances obtained by supervised deep learning approaches on retrieval and classification tasks, how to acquire labeled data for training remains a challenging bottleneck. In this scenario, the need for developing more effective content-based retrieval approaches capable of taking advantage of multimodal information and advances in unsupervised learning becomes imperative. Based on such observations, we propose two novel approaches that combine Graph Convolutional Networks (GCNs) with rank-based manifold learning methods. The GCN models were trained in an unsupervised way, using the Deep Graph Infomax algorithm, and the proposed approaches employ recent rank-based manifold learning methods. Multimodal information is exploited through pre-trained CNNs via transfer learning for extracting audio, image, and video features. The proposed approaches were evaluated on three public action recognition datasets. Highly effective results were obtained, reaching relative gains of up to +29.44% in MAP compared to baseline approaches without GCNs. The experimental evaluation also considered classical and recent baselines in the literature.; Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP); Department of Statistics Applied Mathematics and Computing (DEMAC) São Paulo State University (UNESP); Department of Statistics Applied Mathematics and Computing (DEMAC) São Paulo State University (UNESP); FAPESP: #2018/15597-6; FAPESP: #2020/03311-0; FAPESP: #2020/11366-0; Universidade Estadual Paulista (UNESP); de Almeida, Lucas Barbosa [UNESP]; Valem, Lucas Pascotti [UNESP]; Pedronette, Daniel Carlos Guimarães [UNESP]; 2023-07-29T13:38:35Z; 2023-07-29T13:38:35Z; 2022-01-01; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/conferenceObject; 2811-2815; http://dx.doi.org/10.1109/ICIP46576.2022.9897911; Proceedings - International Conference on Image Processing, ICIP, p. 2811-2815.; 1522-4880; http://hdl.handle.net/11449/248246; 10.1109/ICIP46576.2022.9897911; 2-s2.0-85146715017; Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; Proceedings - International Conference on Image Processing, ICIP; info:eu-repo/semantics/openAccess; 2023-07-29T13:38:35Z; oai:repositorio.unesp.br:11449/248246; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-08-05T23:11:50.553686; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
dc.title.none.fl_str_mv | GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL |
title | GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL |
spellingShingle | GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL; de Almeida, Lucas Barbosa [UNESP]; graph convolutional networks; manifold learning; rank aggregation; video multimodal retrieval |
title_short | GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL |
title_full | GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL |
title_fullStr | GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL |
title_full_unstemmed | GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL |
title_sort | GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL |
author | de Almeida, Lucas Barbosa [UNESP] |
author_facet | de Almeida, Lucas Barbosa [UNESP]; Valem, Lucas Pascotti [UNESP]; Pedronette, Daniel Carlos Guimarães [UNESP] |
author_role | author |
author2 | Valem, Lucas Pascotti [UNESP]; Pedronette, Daniel Carlos Guimarães [UNESP] |
author2_role | author; author |
dc.contributor.none.fl_str_mv | Universidade Estadual Paulista (UNESP) |
dc.contributor.author.fl_str_mv | de Almeida, Lucas Barbosa [UNESP]; Valem, Lucas Pascotti [UNESP]; Pedronette, Daniel Carlos Guimarães [UNESP] |
dc.subject.por.fl_str_mv | graph convolutional networks; manifold learning; rank aggregation; video multimodal retrieval |
topic | graph convolutional networks; manifold learning; rank aggregation; video multimodal retrieval |
description | Despite the impressive advances obtained by supervised deep learning approaches on retrieval and classification tasks, how to acquire labeled data for training remains a challenging bottleneck. In this scenario, the need for developing more effective content-based retrieval approaches capable of taking advantage of multimodal information and advances in unsupervised learning becomes imperative. Based on such observations, we propose two novel approaches that combine Graph Convolutional Networks (GCNs) with rank-based manifold learning methods. The GCN models were trained in an unsupervised way, using the Deep Graph Infomax algorithm, and the proposed approaches employ recent rank-based manifold learning methods. Multimodal information is exploited through pre-trained CNNs via transfer learning for extracting audio, image, and video features. The proposed approaches were evaluated on three public action recognition datasets. Highly effective results were obtained, reaching relative gains of up to +29.44% in MAP compared to baseline approaches without GCNs. The experimental evaluation also considered classical and recent baselines in the literature. |
publishDate | 2022 |
dc.date.none.fl_str_mv | 2022-01-01; 2023-07-29T13:38:35Z; 2023-07-29T13:38:35Z |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/conferenceObject |
format | conferenceObject |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://dx.doi.org/10.1109/ICIP46576.2022.9897911; Proceedings - International Conference on Image Processing, ICIP, p. 2811-2815.; 1522-4880; http://hdl.handle.net/11449/248246; 10.1109/ICIP46576.2022.9897911; 2-s2.0-85146715017 |
url | http://dx.doi.org/10.1109/ICIP46576.2022.9897911; http://hdl.handle.net/11449/248246 |
identifier_str_mv | Proceedings - International Conference on Image Processing, ICIP, p. 2811-2815.; 1522-4880; 10.1109/ICIP46576.2022.9897911; 2-s2.0-85146715017 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | Proceedings - International Conference on Image Processing, ICIP |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | 2811-2815 |
dc.source.none.fl_str_mv | Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
instname_str | Universidade Estadual Paulista (UNESP) |
instacron_str | UNESP |
institution | UNESP |
reponame_str | Repositório Institucional da UNESP |
collection | Repositório Institucional da UNESP |
repository.name.fl_str_mv | Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv | |
_version_ | 1808129498505281536 |