On the role of multimodal learning in the recognition of sign language

Bibliographic details
Main author: Ferreira, Pedro M.
Publication date: 2018
Other authors: Cardoso, Jaime S., Rebelo, Ana
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: http://hdl.handle.net/11328/2500
Abstract: Sign Language Recognition (SLR) has become one of the most important research areas in the field of human-computer interaction. SLR systems are meant to automatically translate sign language into text or speech, in order to reduce the communication gap between deaf and hearing people. The aim of this paper is to exploit multimodal learning techniques for accurate SLR, making use of data provided by Kinect and Leap Motion. In this regard, single-modality approaches as well as different multimodal methods, mainly based on convolutional neural networks, are proposed. Our main contribution is a novel multimodal end-to-end neural network that explicitly models private feature representations that are specific to each modality and shared feature representations that are similar between modalities. By imposing such regularization in the learning process, the underlying idea is to increase the discriminative ability of the learned features and, hence, improve the generalization capability of the model. Experimental results demonstrate that multimodal learning yields an overall improvement in the sign recognition performance. In particular, the novel neural network architecture outperforms the current state-of-the-art methods for the SLR task.
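The private/shared decomposition described in the abstract can be illustrated with a small sketch. The function below is hypothetical and uses plain Python, not the paper's actual implementation; it shows two regularizers of the kind the abstract alludes to: a similarity term that pushes the shared codes of the two modalities (e.g. Kinect and Leap Motion) to agree, and an orthogonality term that pushes each modality's private code away from its shared code.

```python
def multimodal_regularizers(shared_a, shared_b, private_a, private_b):
    """Illustrative regularizers for a two-modality network.

    Each argument is a batch of feature vectors (list of lists of floats).
    Returns (similarity_loss, orthogonality_loss).
    """
    n = len(shared_a)
    d = len(shared_a[0])
    # Similarity: mean squared difference between the shared codes of the
    # two modalities, so both modalities map to a common representation.
    similarity = sum(
        (xa - xb) ** 2
        for va, vb in zip(shared_a, shared_b)
        for xa, xb in zip(va, vb)
    ) / (n * d)

    # Orthogonality: mean squared per-sample dot product between shared and
    # private codes, so each part of the representation encodes distinct
    # information.
    def ortho(shared, private):
        return sum(
            sum(x * y for x, y in zip(s, p)) ** 2
            for s, p in zip(shared, private)
        ) / len(shared)

    return similarity, ortho(shared_a, private_a) + ortho(shared_b, private_b)
```

In a full model, terms of this kind would be added, with weighting coefficients, to the sign-classification loss during training; the exact formulation used by the authors is given in the paper itself.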
id RCAP_0083286d275bf1c3996317fbc111d588
oai_identifier_str oai:repositorio.uportu.pt:11328/2500
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str
dc.title.none.fl_str_mv On the role of multimodal learning in the recognition of sign language
title On the role of multimodal learning in the recognition of sign language
spellingShingle On the role of multimodal learning in the recognition of sign language
Ferreira, Pedro M.
Sign language recognition
Multimodal learning
Convolutional neural networks
Kinect
Leap Motion
title_short On the role of multimodal learning in the recognition of sign language
title_full On the role of multimodal learning in the recognition of sign language
title_fullStr On the role of multimodal learning in the recognition of sign language
title_full_unstemmed On the role of multimodal learning in the recognition of sign language
title_sort On the role of multimodal learning in the recognition of sign language
author Ferreira, Pedro M.
author_facet Ferreira, Pedro M.
Cardoso, Jaime S.
Rebelo, Ana
author_role author
author2 Cardoso, Jaime S.
Rebelo, Ana
author2_role author
author
dc.contributor.author.fl_str_mv Ferreira, Pedro M.
Cardoso, Jaime S.
Rebelo, Ana
dc.subject.por.fl_str_mv Sign language recognition
Multimodal learning
Convolutional neural networks
Kinect
Leap Motion
topic Sign language recognition
Multimodal learning
Convolutional neural networks
Kinect
Leap Motion
description Sign Language Recognition (SLR) has become one of the most important research areas in the field of human-computer interaction. SLR systems are meant to automatically translate sign language into text or speech, in order to reduce the communication gap between deaf and hearing people. The aim of this paper is to exploit multimodal learning techniques for accurate SLR, making use of data provided by Kinect and Leap Motion. In this regard, single-modality approaches as well as different multimodal methods, mainly based on convolutional neural networks, are proposed. Our main contribution is a novel multimodal end-to-end neural network that explicitly models private feature representations that are specific to each modality and shared feature representations that are similar between modalities. By imposing such regularization in the learning process, the underlying idea is to increase the discriminative ability of the learned features and, hence, improve the generalization capability of the model. Experimental results demonstrate that multimodal learning yields an overall improvement in the sign recognition performance. In particular, the novel neural network architecture outperforms the current state-of-the-art methods for the SLR task.
publishDate 2018
dc.date.none.fl_str_mv 2018-01-01T00:00:00Z
2018
2019-01-02T16:46:57Z
2019-12-31T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/11328/2500
url http://hdl.handle.net/11328/2500
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 1573-7721
https://doi.org/10.1007/s11042-018-6565-5
dc.rights.driver.fl_str_mv info:eu-repo/semantics/embargoedAccess
eu_rights_str_mv embargoedAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Springer
publisher.none.fl_str_mv Springer
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv
repository.mail.fl_str_mv
_version_ 1777302553030033408