Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems
Main author: | Carriço, Bruna |
---|---|
Publication date: | 2023 |
Other authors: | Shulby, Christopher; Moniz, Helena |
Document type: | Article |
Language: | por |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | https://doi.org/10.26334/2183-9077/rapln10ano2023a6 |
Abstract: | This paper describes the linguistic preprocessing methods used in hybrid systems provided by Defined.ai, an international Artificial Intelligence (AI) company. The startup focuses on providing high-quality data, models, and AI tools. The main goal of this work is to improve the quality of preprocessing models by applying linguistic knowledge. We therefore focus on two introductory linguistic models in a speech pipeline: the Normalizer and the Grapheme-to-Phone (G2P). To this end, two initiatives were conducted in collaboration with the Defined.ai Machine Learning team. The first project focuses on expanding and improving a European Portuguese Normalizer model. The second project covers the creation of G2P models for two languages, Swedish and Russian. Results show that a rule-based approach to the Normalizer and the G2P increases their accuracy and performance, a significant advantage for improving Defined.ai tools and speech pipelines. In addition, based on the results of the first project, we made the normalizer easier to use by augmenting each rule with linguistic knowledge. Accordingly, our research demonstrates the added value of linguistic knowledge in preprocessing models. |
id | RCAP_f8bcc15a53eba9338ab47824ef507077 |
---|---|
oai_identifier_str | oai:ojs3.ojs.apl.pt:article/167 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str | 7160 |
dc.title.none.fl_str_mv | Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems; Modelos de pré-processamento para tecnologias da fala: O impacto do normalizador e do grafema-fone nos sistemas híbridos |
title | Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems |
spellingShingle | Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems; Carriço, Bruna; tecnologias de fala; normalizador; grafema-fone; conhecimento linguístico; modelos; speech technologies; normalizer; grapheme-to-phone; linguistic knowledge; models |
title_short | Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems |
title_full | Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems |
title_fullStr | Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems |
title_full_unstemmed | Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems |
title_sort | Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems |
author | Carriço, Bruna |
author_facet | Carriço, Bruna; Shulby, Christopher; Moniz, Helena |
author_role | author |
author2 | Shulby, Christopher; Moniz, Helena |
author2_role | author; author |
dc.contributor.author.fl_str_mv | Carriço, Bruna; Shulby, Christopher; Moniz, Helena |
dc.subject.por.fl_str_mv | tecnologias de fala; normalizador; grafema-fone; conhecimento linguístico; modelos; speech technologies; normalizer; grapheme-to-phone; linguistic knowledge; models |
topic | tecnologias de fala; normalizador; grafema-fone; conhecimento linguístico; modelos; speech technologies; normalizer; grapheme-to-phone; linguistic knowledge; models |
description | This paper describes the linguistic preprocessing methods used in hybrid systems provided by Defined.ai, an international Artificial Intelligence (AI) company. The startup focuses on providing high-quality data, models, and AI tools. The main goal of this work is to improve the quality of preprocessing models by applying linguistic knowledge. We therefore focus on two introductory linguistic models in a speech pipeline: the Normalizer and the Grapheme-to-Phone (G2P). To this end, two initiatives were conducted in collaboration with the Defined.ai Machine Learning team. The first project focuses on expanding and improving a European Portuguese Normalizer model. The second project covers the creation of G2P models for two languages, Swedish and Russian. Results show that a rule-based approach to the Normalizer and the G2P increases their accuracy and performance, a significant advantage for improving Defined.ai tools and speech pipelines. In addition, based on the results of the first project, we made the normalizer easier to use by augmenting each rule with linguistic knowledge. Accordingly, our research demonstrates the added value of linguistic knowledge in preprocessing models. |
publishDate | 2023 |
dc.date.none.fl_str_mv | 2023-10-22 |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | https://doi.org/10.26334/2183-9077/rapln10ano2023a6 |
url | https://doi.org/10.26334/2183-9077/rapln10ano2023a6 |
dc.language.iso.fl_str_mv | por |
language | por |
dc.relation.none.fl_str_mv | https://ojs.apl.pt/index.php/rapl/article/view/167 https://ojs.apl.pt/index.php/rapl/article/view/167/221 |
dc.rights.driver.fl_str_mv | Copyright (c) 2023 Bruna Carriço, Christopher Shulby, Helena Moniz; info:eu-repo/semantics/openAccess |
rights_invalid_str_mv | Copyright (c) 2023 Bruna Carriço, Christopher Shulby, Helena Moniz |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.publisher.none.fl_str_mv | Associação Portuguesa de Linguística |
publisher.none.fl_str_mv | Associação Portuguesa de Linguística |
dc.source.none.fl_str_mv | Revista da Associação Portuguesa de Linguística; No. 10 (2023): Journal of the Portuguese Linguistics Association; 92–114 Revista da Associação Portuguesa de Linguística; N.º 10 (2023): Revista da Associação Portuguesa de Linguística; 92–114 2183-9077 10.26334/2183-9077/rapln10ano2023td reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
institution | RCAAP |
reponame_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv | |
_version_ | 1799134142503321600 |