Machine learning of European Portuguese grapheme-to-phone conversion using a richer feature set
Main author: | Teixeira, António |
---|---|
Publication date: | 2006 |
Other authors: | Oliveira, Catarina; Moutinho, Lurdes Castro |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | https://proa.ua.pt/index.php/revdeti/article/view/17286 |
Abstract: | This study evaluates two self-learning methods (MBL and TBL) on European Portuguese grapheme-to-phone conversion. Combinations of the two systems (parallel and cascade) were also tested, and the usefulness of syllable-related information in machine learning approaches was investigated. Systems with good performance were obtained both with a single self-learning method and with combinations. The best performance was obtained with MBL and with the parallel combination. Syllable information improved performance in all systems tested, and the effect was statistically significant. Our best machine learning systems achieve a Word Error Rate and a Mean Normalized Levenshtein Distance similar to those recently obtained for German using similar features. |
id |
RCAP_ff95df1ee76b770037ffc947299444e7 |
---|---|
oai_identifier_str |
oai:proa.ua.pt:article/17286 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Machine learning of European Portuguese grapheme-to-phone conversion using a richer feature set |
title |
Machine learning of European Portuguese grapheme-to-phone conversion using a richer feature set |
author |
Teixeira, António |
author_facet |
Teixeira, António; Oliveira, Catarina; Moutinho, Lurdes Castro |
author_role |
author |
author2 |
Oliveira, Catarina; Moutinho, Lurdes Castro |
author2_role |
author; author |
dc.contributor.author.fl_str_mv |
Teixeira, António; Oliveira, Catarina; Moutinho, Lurdes Castro |
description |
This study evaluates two self-learning methods (MBL and TBL) on European Portuguese grapheme-to-phone conversion. Combinations of the two systems (parallel and cascade) were also tested, and the usefulness of syllable-related information in machine learning approaches was investigated. Systems with good performance were obtained both with a single self-learning method and with combinations. The best performance was obtained with MBL and with the parallel combination. Syllable information improved performance in all systems tested, and the effect was statistically significant. Our best machine learning systems achieve a Word Error Rate and a Mean Normalized Levenshtein Distance similar to those recently obtained for German using similar features. |
publishDate |
2006 |
dc.date.none.fl_str_mv |
2006-01-01T00:00:00Z |
dc.type.driver.fl_str_mv |
journal article; info:eu-repo/semantics/article |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://proa.ua.pt/index.php/revdeti/article/view/17286; oai:proa.ua.pt:article/17286 |
url |
https://proa.ua.pt/index.php/revdeti/article/view/17286 |
identifier_str_mv |
oai:proa.ua.pt:article/17286 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
https://proa.ua.pt/index.php/revdeti/article/view/17286; https://proa.ua.pt/index.php/revdeti/article/view/17286/12318 |
dc.rights.driver.fl_str_mv |
https://creativecommons.org/licenses/by/4.0/; info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by/4.0/ |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
UA Editora |
publisher.none.fl_str_mv |
UA Editora |
dc.source.none.fl_str_mv |
Eletrónica e Telecomunicações; Vol. 4 No. 6 (2006); 746-751; 2182-9772; 1645-0493; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799130539100209152 |