On the role of multimodal learning in the recognition of sign language
Main author: | Ferreira, Pedro M. |
---|---|
Publication date: | 2018 |
Other authors: | Cardoso, Jaime S.; Rebelo, Ana |
Document type: | Article |
Language: | eng |
Source: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/11328/2500 |
Abstract: | Sign Language Recognition (SLR) has become one of the most important research areas in the field of human-computer interaction. SLR systems are meant to automatically translate sign language into text or speech, in order to reduce the communication gap between deaf and hearing people. The aim of this paper is to exploit multimodal learning techniques for accurate SLR, making use of data provided by Kinect and Leap Motion. To this end, single-modality approaches as well as several multimodal methods, mainly based on convolutional neural networks, are proposed. Our main contribution is a novel multimodal end-to-end neural network that explicitly models private feature representations, which are specific to each modality, and shared feature representations, which are similar between modalities. By imposing such regularization in the learning process, the underlying idea is to increase the discriminative ability of the learned features and, hence, improve the generalization capability of the model. Experimental results demonstrate that multimodal learning yields an overall improvement in sign recognition performance. In particular, the novel neural network architecture outperforms current state-of-the-art methods on the SLR task. |
Keywords: | Sign language recognition; Multimodal learning; Convolutional neural networks; Kinect; Leap Motion |
Publisher: | Springer |
ISSN: | 1573-7721 |
DOI: | https://doi.org/10.1007/s11042-018-6565-5 |
Format: | application/pdf |
Access rights: | embargoedAccess (until 2019-12-31) |
Institution: | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
Repository network: | RCAAP |
OAI identifier: | oai:repositorio.uportu.pt:11328/2500 |
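The abstract describes an architecture that splits each modality's features into a private representation (modality-specific) and a shared representation (regularized to agree across modalities). The sketch below illustrates that idea only in outline, under loud assumptions: plain NumPy linear maps stand in for the paper's convolutional encoders, all dimensions and names (`D_KINECT`, `D_LEAP`, `forward`, `similarity_loss`) are hypothetical, and no training loop is shown — it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature dimensions for the two input modalities
# (Kinect skeleton features, Leap Motion hand features).
D_KINECT, D_LEAP, D_FEAT, N_CLASSES = 64, 32, 16, 10

# Private encoders: one set of weights per modality, never shared.
W_priv_kinect = rng.normal(size=(D_KINECT, D_FEAT))
W_priv_leap = rng.normal(size=(D_LEAP, D_FEAT))

# Shared encoders: separate weights, but (in the paper's setup) trained so
# that their outputs become similar across modalities.
W_shared_kinect = rng.normal(size=(D_KINECT, D_FEAT))
W_shared_leap = rng.normal(size=(D_LEAP, D_FEAT))

# Classifier over the fused private + shared representations.
W_cls = rng.normal(size=(4 * D_FEAT, N_CLASSES))

def forward(x_kinect, x_leap):
    """Fuse private and shared representations of both modalities, then score classes."""
    p_k = np.tanh(x_kinect @ W_priv_kinect)   # private, Kinect-specific
    p_l = np.tanh(x_leap @ W_priv_leap)       # private, Leap-specific
    s_k = np.tanh(x_kinect @ W_shared_kinect) # shared view from Kinect
    s_l = np.tanh(x_leap @ W_shared_leap)     # shared view from Leap
    fused = np.concatenate([p_k, p_l, s_k, s_l])
    return fused @ W_cls  # class logits

def similarity_loss(x_kinect, x_leap):
    """Regularizer pushing the shared representations to agree across modalities."""
    s_k = np.tanh(x_kinect @ W_shared_kinect)
    s_l = np.tanh(x_leap @ W_shared_leap)
    return float(np.mean((s_k - s_l) ** 2))

logits = forward(rng.normal(size=D_KINECT), rng.normal(size=D_LEAP))
```

In training, a sum of the classification loss and `similarity_loss` would be minimized, so the shared encoders converge on cross-modal features while the private encoders remain free to capture modality-specific cues.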