HOW WELL CAN ASR TECHNOLOGY UNDERSTAND FOREIGN-ACCENTED SPEECH?

Bibliographic details
Main author: Souza, Hanna Kivistö
Publication date: 2022
Other authors: Gottardi, William
Document type: preprint
Language: Portuguese (por)
Source title: SciELO Preprints
Full text: https://preprints.scielo.org/index.php/scielo/preprint/view/4913
Abstract: Following the Covid-19 pandemic, digital technology is more present in classrooms than ever. Automatic Speech Recognition (ASR) offers interesting possibilities for language learners to produce more output in a foreign language (FL). ASR is especially suited for autonomous pronunciation learning when used as a dictation tool that transcribes the learner’s speech (McCROCKLIN, 2016). However, ASR tools are trained with monolingual native speakers in mind and so do not reflect the global reality of English speakers. Consequently, the present study examined how well two ASR-based dictation tools understand foreign-accented speech, and which FL speech features cause intelligibility breakdowns. English speech samples of 15 Brazilian Portuguese and 15 Spanish speakers were obtained from an online database (WEINBERGER, 2015) and submitted to two ASR dictation tools: Microsoft Word and VoiceNotebook. The resulting transcriptions were manually inspected, coded, and categorized. The results show that overall intelligibility was high for both tools. However, many features of normal FL speech, such as vowel and consonant substitution, caused the ASR dictation tools to misinterpret the message, leading to communication breakdowns. The results are discussed from a pedagogical viewpoint.
id SCI-1_79a28ddc65556f318ab6b18bf89b205d
oai_identifier_str oai:ops.preprints.scielo.org:preprint/4913
network_acronym_str SCI-1
network_name_str SciELO Preprints
repository_id_str
dc.title.none.fl_str_mv HOW WELL CAN ASR TECHNOLOGY UNDERSTAND FOREIGN-ACCENTED SPEECH?
QUÃO BEM A TECNOLOGIA RAF PODE ENTENDER A FALA COM SOTAQUE ESTRANGEIRO?
author Souza, Hanna Kivistö
author_facet Souza, Hanna Kivistö
Gottardi, William
author_role author
author2 Gottardi, William
author2_role author
dc.contributor.author.fl_str_mv Souza, Hanna Kivistö
Gottardi, William
dc.subject.por.fl_str_mv inteligibilidade
reconhecimento automático da fala
desenvolvimento de pronúncia em LE
aprendizagem autônoma
intelligibility
automatic speech recognition
L2 pronunciation development
autonomous learning
publishDate 2022
dc.date.none.fl_str_mv 2022-10-26
dc.type.driver.fl_str_mv info:eu-repo/semantics/preprint
info:eu-repo/semantics/publishedVersion
format preprint
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://preprints.scielo.org/index.php/scielo/preprint/view/4913
10.1590/010318138668782v61n32022
url https://preprints.scielo.org/index.php/scielo/preprint/view/4913
identifier_str_mv 10.1590/010318138668782v61n32022
dc.language.iso.fl_str_mv por
language por
dc.relation.none.fl_str_mv https://preprints.scielo.org/index.php/scielo/article/view/4913/9547
dc.rights.driver.fl_str_mv Copyright (c) 2022 Hanna Kivistö Souza, William Gottardi
https://creativecommons.org/licenses/by/4.0
info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv SciELO Preprints
dc.source.none.fl_str_mv reponame:SciELO Preprints
instname:SciELO
instacron:SCI
instname_str SciELO
instacron_str SCI
institution SCI
reponame_str SciELO Preprints
collection SciELO Preprints
repository.name.fl_str_mv SciELO Preprints - SciELO
repository.mail.fl_str_mv scielo.submission@scielo.org
_version_ 1797047830470197248