Mispronunciation Detection in Children's Reading of Sentences

Proença, Jorge; Lopes, Carla Alexandra; Tjalve, Michael; Stolcke, Andreas; Candeias, Sara; Perdigão, Fernando

Bibliographic details
Main author:	Proença, Jorge
Publication date:	2018
Other authors:	Lopes, Carla Alexandra; Tjalve, Michael; Stolcke, Andreas; Candeias, Sara; Perdigão, Fernando
Document type:	Article
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10400.8/3353
Abstract:	This work proposes an approach to automatically parse children’s reading of sentences by detecting word pronunciations and extra content, and to classify words as correctly or incorrectly pronounced. This approach can be directly helpful for automatic assessment of reading level or for automatic reading tutors, where a correct reading must be identified. We propose a first segmentation stage to locate candidate word pronunciations based on allowing repetitions and false starts of a word’s syllables. A decoding grammar based solely on syllables allows silence to appear during a word pronunciation. At a second stage, word candidates are classified as mispronounced or not. The feature that best classifies mispronunciations is found to be the log-likelihood ratio between a free phone loop and a word spotting model in the very close vicinity of the candidate segmentation. Additional features are combined in multi-feature models to further improve classification, including: normalizations of the log-likelihood ratio, derivations from phone likelihoods, and Levenshtein distances between the correct pronunciation and recognized phonemes through two phoneme recognition approaches. Results show that most extra events were detected (close to 2% word error rate achieved) and that using automatic segmentation for mispronunciation classification approaches the performance of manual segmentation. Although the log-likelihood ratio from a spotting approach is already a good metric to classify word pronunciations, the combination of additional features provides a relative reduction of the miss rate of 18% (from 34.03% to 27.79% using manual segmentation and from 35.58% to 29.35% using automatic segmentation, at constant 5% false alarm rate).
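
Two quantities named in the abstract can be illustrated with a short sketch: the Levenshtein distance between a canonical pronunciation and a recognized phoneme sequence, and the relative miss-rate reduction computed from the reported figures. This is not the authors' implementation. The function names and the phoneme sequences are hypothetical and chosen only for illustration; only the four miss rates come from the abstract.

```python
from typing import Sequence


def levenshtein(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Edit distance (insertions, deletions, substitutions, unit cost)
    between a reference and a hypothesized phoneme sequence."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference phonemes
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def relative_reduction(before: float, after: float) -> float:
    """Relative reduction of an error rate, as a fraction of the starting rate."""
    return (before - after) / before


# Hypothetical phoneme sequences (not taken from the paper's data).
canonical = ["k", "a", "z", "a"]
recognized = ["k", "a", "s"]
print("Levenshtein distance:", levenshtein(canonical, recognized))

# Miss rates (in %) at 5% false alarm rate, as reported in the abstract.
manual = relative_reduction(34.03, 27.79)
automatic = relative_reduction(35.58, 29.35)
print(f"manual segmentation:    {100 * manual:.1f}% relative reduction")
print(f"automatic segmentation: {100 * automatic:.1f}% relative reduction")
```

Both segmentation conditions give a relative reduction of roughly 17.5% to 18.3%, consistent with the rounded 18% quoted in the abstract.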

Item metadata

id	RCAP_cf493c51392c28a27e3f8cf03bb3097a
oai_identifier_str	oai:iconline.ipleiria.pt:10400.8/3353
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	Mispronunciation Detection in Children's Reading of Sentences
title	Mispronunciation Detection in Children's Reading of Sentences
spellingShingle	Mispronunciation Detection in Children's Reading of Sentences; Proença, Jorge; Speech analysis; Mispronunciation detection; Children’s reading; Automatic reading annotation
title_short	Mispronunciation Detection in Children's Reading of Sentences
title_full	Mispronunciation Detection in Children's Reading of Sentences
title_fullStr	Mispronunciation Detection in Children's Reading of Sentences
title_full_unstemmed	Mispronunciation Detection in Children's Reading of Sentences
title_sort	Mispronunciation Detection in Children's Reading of Sentences
author	Proença, Jorge
author_facet	Proença, Jorge; Lopes, Carla Alexandra; Tjalve, Michael; Stolcke, Andreas; Candeias, Sara; Perdigão, Fernando
author_role	author
author2	Lopes, Carla Alexandra; Tjalve, Michael; Stolcke, Andreas; Candeias, Sara; Perdigão, Fernando
author2_role	author author author author author
dc.contributor.none.fl_str_mv	IC-Online
dc.contributor.author.fl_str_mv	Proença, Jorge; Lopes, Carla Alexandra; Tjalve, Michael; Stolcke, Andreas; Candeias, Sara; Perdigão, Fernando
dc.subject.por.fl_str_mv	Speech analysis; Mispronunciation detection; Children’s reading; Automatic reading annotation
topic	Speech analysis; Mispronunciation detection; Children’s reading; Automatic reading annotation
description	This work proposes an approach to automatically parse children’s reading of sentences by detecting word pronunciations and extra content, and to classify words as correctly or incorrectly pronounced. This approach can be directly helpful for automatic assessment of reading level or for automatic reading tutors, where a correct reading must be identified. We propose a first segmentation stage to locate candidate word pronunciations based on allowing repetitions and false starts of a word’s syllables. A decoding grammar based solely on syllables allows silence to appear during a word pronunciation. At a second stage, word candidates are classified as mispronounced or not. The feature that best classifies mispronunciations is found to be the log-likelihood ratio between a free phone loop and a word spotting model in the very close vicinity of the candidate segmentation. Additional features are combined in multi-feature models to further improve classification, including: normalizations of the log-likelihood ratio, derivations from phone likelihoods, and Levenshtein distances between the correct pronunciation and recognized phonemes through two phoneme recognition approaches. Results show that most extra events were detected (close to 2% word error rate achieved) and that using automatic segmentation for mispronunciation classification approaches the performance of manual segmentation. Although the log-likelihood ratio from a spotting approach is already a good metric to classify word pronunciations, the combination of additional features provides a relative reduction of the miss rate of 18% (from 34.03% to 27.79% using manual segmentation and from 35.58% to 29.35% using automatic segmentation, at constant 5% false alarm rate).
publishDate	2018
dc.date.none.fl_str_mv	2018-07-23T16:49:29Z 2018-03-28 2018-03-28T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10400.8/3353
url	http://hdl.handle.net/10400.8/3353
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	10.1109/TASLP.2018.2820429
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Institute of Electrical and Electronics Engineers
publisher.none.fl_str_mv	Institute of Electrical and Electronics Engineers
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799136968886452224
