Machine learning of European Portuguese grapheme-to-phone conversion using a richer feature set

Bibliographic details
Main author: Teixeira, António
Publication date: 2006
Other authors: Oliveira, Catarina; Moutinho, Lurdes Castro
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: https://proa.ua.pt/index.php/revdeti/article/view/17286
Abstract: This study evaluates two self-learning methods (MBL and TBL) on European Portuguese grapheme-to-phone conversion. Parallel and cascade combinations of the two systems were also tested, and the usefulness of syllable-related information in machine learning approaches was investigated. Systems with good performance were obtained both with a single self-learning method and with combinations. The best performance was obtained with MBL and with the parallel combination. Using syllable information improved performance in all systems tested, and the effect was statistically significant. Our best machine-learning-based systems show Word Error Rate and Mean Normalized Levenshtein Distance similar to those recently obtained for German using similar features.
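The two evaluation measures named in the abstract can be illustrated with a minimal sketch. The record does not give the exact definitions used in the article, so the following are assumptions: Word Error Rate is taken as the fraction of words whose predicted phone sequence differs from the reference, and the Levenshtein distance of each word is normalised by the length of its reference transcription. The function names and the example transcriptions are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute, each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(predicted, reference):
    """Fraction of words with at least one wrong phone (assumed definition)."""
    wrong = sum(p != r for p, r in zip(predicted, reference))
    return wrong / len(reference)


def mean_normalized_levenshtein(predicted, reference):
    """Mean per-word edit distance, normalised by reference length (assumption)."""
    dists = [levenshtein(p, r) / len(r) for p, r in zip(predicted, reference)]
    return sum(dists) / len(dists)


# Hypothetical phone transcriptions, one list of phone symbols per word.
reference = [["k", "a", "z", "6"], ["p", "a", "j"]]
predicted = [["k", "a", "s", "6"], ["p", "a", "j"]]

print(word_error_rate(predicted, reference))              # 0.5
print(mean_normalized_levenshtein(predicted, reference))  # 0.125
```

Here one of the two words contains a single substituted phone, so half the words count as errors while the normalised distance stays small. This is why the two measures are usually reported together.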
id RCAP_ff95df1ee76b770037ffc947299444e7
oai_identifier_str oai:proa.ua.pt:article/17286
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
description (translated from the Portuguese abstract) This work tests two machine learning methods (MBL and TBL), as well as parallel and cascade combinations of them, on the task of European Portuguese grapheme-to-phone conversion. It also investigates the value of using syllabic information in this kind of automatic approach. The best results are achieved with MBL and with a parallel combination of the two methods. In all systems tested, including syllable-related information improves performance, and the difference is statistically significant. The best-performing systems show an error rate and a Levenshtein distance similar to those recently obtained for German using the same training models.
publishDate 2006
dc.date.none.fl_str_mv 2006-01-01T00:00:00Z
dc.type.driver.fl_str_mv journal article
info:eu-repo/semantics/article
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://proa.ua.pt/index.php/revdeti/article/view/17286
oai:proa.ua.pt:article/17286
url https://proa.ua.pt/index.php/revdeti/article/view/17286
identifier_str_mv oai:proa.ua.pt:article/17286
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://proa.ua.pt/index.php/revdeti/article/view/17286
https://proa.ua.pt/index.php/revdeti/article/view/17286/12318
dc.rights.driver.fl_str_mv https://creativecommons.org/licenses/by/4.0/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by/4.0/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv UA Editora
publisher.none.fl_str_mv UA Editora
dc.source.none.fl_str_mv Eletrónica e Telecomunicações; Vol 4 No 6 (2006); 746-751
2182-9772
1645-0493
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
_version_ 1799130539100209152