GRAPHEME-TO-PHONE TRANSCRIPTION ALGORITHM FOR TEXT-TO-SPEECH SYSTEMS IN EUROPEAN PORTUGUESE
Main author: | Braga, Daniela |
---|---|
Publication date: | 2019 |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | https://doi.org/10.34630/polissema.vi6.3320 |
Abstract: | This paper describes a linguistically motivated, rule-based grapheme-to-phone (G2P) transcription algorithm for European Portuguese (EP). Together with stress determination and syllable division, G2P conversion is an essential component in the general architecture of a Text-to-Speech (TTS) system. The G2P is part of the text pre-processing module of the TTS system, and its purpose is to convert text into a phonetic transcription that is interpreted by the synthesis engine. A complete set of phonological and phonetic transcription rules for the standard variety of European Portuguese is presented. The algorithm was implemented in C++ and tested on online newspaper articles. The experiments yielded an accuracy rate of 98.80%, and future developments to increase this value are foreseen. The purpose of this work is to develop a module/tool that can improve the naturalness of synthetic speech in European Portuguese. Other applications, such as language teaching and learning, can also be expected. These results, together with our perspectives for future improvement, demonstrate the great importance of linguistic knowledge in the development of TTS. The paper is organized as follows: section 1 reviews the state of the art on this subject and justifies our approach; section 2 describes the annotation conventions, presents the G2P algorithm and gives some implementation details; section 3 discusses the results; and section 4 presents conclusions and future work. |
id |
RCAP_2bd5ebd07c03ebcfbd422c90f5a9b209 |
---|---|
oai_identifier_str |
oai:oai.parc.ipp.pt:article/3320 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
author |
Braga, Daniela |
dc.subject.por.fl_str_mv |
Conversão Grafema-fone; Regras Fonológicas; Processamento da Fala; Sistemas de Conversão Texto-fala; Grapheme-to-phone Conversion; Phonological Rules; Speech Processing; Text-to-speech Systems |
publishDate |
2019 |
dc.date.none.fl_str_mv |
2019-08-21 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://doi.org/10.34630/polissema.vi6.3320 |
dc.language.iso.fl_str_mv |
eng |
dc.relation.none.fl_str_mv |
https://parc.ipp.pt/index.php/Polissema/article/view/3320 https://parc.ipp.pt/index.php/Polissema/article/view/3320/1304 |
dc.rights.driver.fl_str_mv |
Copyright (c) 2006 POLISSEMA – Revista de Letras do ISCAP; info:eu-repo/semantics/openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Instituto Superior de Contabilidade e Administração do Porto |
dc.source.none.fl_str_mv |
POLISSEMA – ISCAP Journal of Letters; No. 6 (2006); ISSN 2184-710X, 1645-1937 |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |