Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems

Bibliographic details
Main author: Carriço, Bruna
Publication date: 2023
Other authors: Shulby, Christopher, Moniz, Helena
Document type: Article
Language: por
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: https://doi.org/10.26334/2183-9077/rapln10ano2023a6
Abstract: This paper describes the linguistic preprocessing methods used in hybrid systems provided by Defined.ai, an international Artificial Intelligence (AI) company. The startup focuses on providing high-quality data, models, and AI tools. The main goal of this work is to improve the quality of preprocessing models by applying linguistic knowledge. We focus on two introductory linguistic models in a speech pipeline: the Normalizer and the Grapheme-to-Phone (G2P) model. Two initiatives were conducted in collaboration with the Defined.ai Machine Learning team. The first project expands and improves a European Portuguese Normalizer model. The second project creates G2P models for two languages, Swedish and Russian. Results show that a rule-based approach to the Normalizer and G2P increases their accuracy and performance, a significant advantage for improving Defined.ai tools and speech pipelines. In addition, based on the results of the first project, we made the normalizer easier to use by augmenting each rule with linguistic knowledge. Accordingly, our research demonstrates the added value of linguistic knowledge in preprocessing models.
id RCAP_f8bcc15a53eba9338ab47824ef507077
oai_identifier_str oai:ojs3.ojs.apl.pt:article/167
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems
Modelos de pré-processamento para tecnologias da fala : O impacto do normalizador e do grafema-fone nos sistemas híbridos
title Preprocessing models for speech technologies: The impact of the Normalizer and the Grapheme-to-Phone on hybrid systems
author Carriço, Bruna
author_facet Carriço, Bruna
Shulby, Christopher
Moniz, Helena
author_role author
author2 Shulby, Christopher
Moniz, Helena
author2_role author
author
dc.contributor.author.fl_str_mv Carriço, Bruna
Shulby, Christopher
Moniz, Helena
dc.subject.por.fl_str_mv tecnologias de fala
normalizador
grafema-fone
conhecimento linguístico
modelos
speech technologies
normalizer
grapheme-to-phone
linguistic knowledge
models
publishDate 2023
dc.date.none.fl_str_mv 2023-10-22
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://doi.org/10.26334/2183-9077/rapln10ano2023a6
url https://doi.org/10.26334/2183-9077/rapln10ano2023a6
dc.language.iso.fl_str_mv por
language por
dc.relation.none.fl_str_mv https://ojs.apl.pt/index.php/rapl/article/view/167
https://ojs.apl.pt/index.php/rapl/article/view/167/221
dc.rights.driver.fl_str_mv Copyright (c) 2023 Bruna Carriço, Christopher Shulby, Helena Moniz
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Copyright (c) 2023 Bruna Carriço, Christopher Shulby, Helena Moniz
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Associação Portuguesa de Linguística
publisher.none.fl_str_mv Associação Portuguesa de Linguística
dc.source.none.fl_str_mv Revista da Associação Portuguesa de Linguística; No. 10 (2023): Journal of the Portuguese Linguistics Association; 92–114
2183-9077
10.26334/2183-9077/rapln10ano2023td
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação