Multimodal silent speech interface based on video, depth, surface electromyography and ultrasonic Doppler: Data collection and first recognition results
Main author: | Freitas, J. |
---|---|
Publication date: | 2013 |
Other authors: | Teixeira, A.; Dias, M. S. |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10071/29220 |
Abstract: | Silent Speech Interfaces use data from the speech production process, such as visual information of face movements. However, using a single modality limits the amount of available information. In this study we start to explore the use of multiple data input modalities in order to acquire a more complete representation of the speech production model. We have selected four non-invasive modalities – visual data from video and depth, surface electromyography and ultrasonic Doppler – and created a system that explores the synchronous combination of all four, or of a subset of them, into a multimodal Silent Speech Interface (SSI). This paper describes the system design, data collection and first word recognition results. As the first corpora acquired for this SSI are necessarily small, we use for classification an example-based recognition approach: Dynamic Time Warping followed by a weighted k-Nearest Neighbor classifier. The first classification results, on vocabularies containing digits, a small set of commands related to Ambient Assisted Living, and minimal nasal pairs, show that word recognition benefits can be obtained from a multimodal approach. |
Keywords: | Silent speech interfaces; Multimodal; Video and depth information; Surface electromyography; Ultrasonic Doppler sensing |
Document type: | Conference object (published version) |
ISSN: | 2308-457X |
Access rights: | Open access |
Format: | application/pdf |
Publisher: | International Speech Communication Association (ISCA) |
Institution: | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
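
The recognition pipeline summarised in the abstract above (synchronous combination of the modality streams, Dynamic Time Warping over per-utterance feature sequences, and a distance-weighted k-Nearest Neighbor vote over stored word templates) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes each utterance has already been reduced to one fixed-rate feature matrix per modality, and the function names (`fuse_modalities`, `dtw_distance`, `weighted_knn_classify`), parameter choices and stand-in data are purely illustrative.

```python
# Minimal sketch (not the paper's code): frame-level fusion of synchronised
# modality streams, DTW alignment, and a distance-weighted k-NN vote.
import numpy as np


def fuse_modalities(streams):
    """Concatenate per-frame features from synchronised modality streams.

    Each stream has shape (n_frames, n_features_m); streams are truncated
    to the shortest one before frame-wise concatenation.
    """
    n = min(s.shape[0] for s in streams)
    return np.concatenate([s[:n] for s in streams], axis=1)


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two sequences of feature frames."""
    n, m = a.shape[0], b.shape[0]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # frame-to-frame Euclidean distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]


def weighted_knn_classify(query, templates, labels, k=3):
    """Label a query utterance by a distance-weighted vote of its k nearest templates."""
    dists = np.array([dtw_distance(query, t) for t in templates])
    votes = {}
    for idx in np.argsort(dists)[:k]:
        weight = 1.0 / (dists[idx] + 1e-8)            # closer templates get larger votes
        votes[labels[idx]] = votes.get(labels[idx], 0.0) + weight
    return max(votes, key=votes.get)


# Toy usage with random stand-in features (shapes and labels are arbitrary).
rng = np.random.default_rng(0)

def make_utterance():
    return fuse_modalities([rng.normal(size=(30, 8)),   # e.g. video features
                            rng.normal(size=(30, 6)),   # e.g. depth features
                            rng.normal(size=(30, 4)),   # e.g. surface EMG features
                            rng.normal(size=(30, 2))])  # e.g. ultrasonic Doppler features

templates = [make_utterance() for _ in range(10)]
labels = ["one", "two"] * 5
print(weighted_knn_classify(make_utterance(), templates, labels, k=3))
```

A template-matching scheme of this kind needs no statistical model training beyond storing labelled examples, which is why it suits the small first corpora mentioned in the abstract; the distance weighting simply lets closer templates count more than farther ones in the vote.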