GRAPHEME-TO-PHONE TRANSCRIPTION ALGORITHM FOR TEXT-TO-SPEECH SYSTEMS IN EUROPEAN PORTUGUESE

Bibliographic details
Main author: Braga, Daniela
Publication date: 2019
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: https://doi.org/10.34630/polissema.vi6.3320
Abstract: This paper describes a linguistically motivated, rule-based grapheme-to-phone (G2P) transcription algorithm for European Portuguese (EP). Together with stress determination and syllable division, G2P conversion is an essential component in the general architecture of a Text-to-Speech (TTS) system. The G2P is part of the text pre-processing module of the TTS system, and its purpose is to convert text into a phonetic transcription that is interpreted by the synthesis engine. A complete set of phonological and phonetic transcription rules for the standard variety of European Portuguese is presented. The algorithm was implemented in C++ and tested on online newspaper articles. The experiments yielded an accuracy rate of 98.80%, and future developments to increase this value are foreseen. The purpose of this work is to develop a module/tool that can improve the naturalness of synthetic speech in European Portuguese; other applications, such as language teaching and learning, can also be expected. These results, together with the prospects for future improvement, demonstrate the importance of linguistic knowledge in the development of TTS. The paper is organized as follows: section 1 reviews the state of the art on this subject and justifies the approach; section 2 describes the annotation conventions, presents the G2P algorithm, and gives some implementation details; section 3 discusses the results; and section 4 presents conclusions and future work.
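As a rough illustration of what a rule-based G2P step can look like, the sketch below applies an ordered list of context-sensitive rewrite rules to a single word. It is not the algorithm from the paper: the record does not list the paper's rules, and the original implementation was in C++. The `Rule` type, the `RULES` table and the `transcribe` function are hypothetical names introduced here, and the handful of rules shown cover only a few well-known consonant correspondences of Portuguese orthography. Vowel quality, stress, syllable division and lexical exceptions are deliberately ignored, so letters with no matching rule are passed through unchanged.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    left: str      # regex that must match right before the grapheme ("" = any context)
    grapheme: str  # literal letter sequence to rewrite
    right: str     # regex that must match right after the grapheme ("" = any context)
    phone: str     # IPA symbol to emit


V = "[aeiouáàâãéêíóôõú]"  # vowel letters, used only as rule context

# Illustrative, incomplete rule set (an assumption, not the paper's rules).
# Order matters: the first rule that applies at a position wins.
RULES = [
    Rule("", "ch", "", "ʃ"),
    Rule("", "nh", "", "ɲ"),
    Rule("", "lh", "", "ʎ"),
    Rule("", "rr", "", "ʀ"),
    Rule("", "ss", "", "s"),
    Rule("", "ç", "", "s"),
    Rule("", "c", "[eiéêí]", "s"),  # <c> before a front vowel
    Rule("", "c", "", "k"),         # <c> elsewhere
    Rule(V, "s", V, "z"),           # single <s> between vowels
    Rule("^", "r", "", "ʀ"),        # word-initial <r>
]


def rule_applies(rule: Rule, word: str, i: int) -> bool:
    """Check the grapheme and both contexts of `rule` at position i."""
    if not word.startswith(rule.grapheme, i):
        return False
    end = i + len(rule.grapheme)
    left_ok = re.search(f"(?:{rule.left})$", word[:i]) is not None
    right_ok = re.match(rule.right, word[end:]) is not None
    return left_ok and right_ok


def transcribe(word: str) -> str:
    """Scan left to right, emitting the phone of the first applicable rule."""
    word = word.lower()
    out, i = [], 0
    while i < len(word):
        for rule in RULES:
            if rule_applies(rule, word, i):
                out.append(rule.phone)
                i += len(rule.grapheme)
                break
        else:
            out.append(word[i])  # no rule matched: keep the letter as-is
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for w in ["chave", "casa", "cedo", "carro", "filho"]:
        print(w, "->", transcribe(w))
```

Because unhandled letters are copied through, the output mixes IPA symbols with plain letters (for example, `casa` becomes `kaza`). A full system along the lines the abstract describes would also need vowel rules that depend on stress and syllable structure.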
id RCAP_2bd5ebd07c03ebcfbd422c90f5a9b209
oai_identifier_str oai:oai.parc.ipp.pt:article/3320
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.subject.por.fl_str_mv Conversão Grafema-fone
Regras Fonológicas
Processamento da Fala
Sistemas de Conversão Texto-fala
Grapheme-to-phone Conversion
Phonological Rules
Speech Processing
Text-to-speech Systems
publishDate 2019
dc.date.none.fl_str_mv 2019-08-21
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://doi.org/10.34630/polissema.vi6.3320
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://parc.ipp.pt/index.php/Polissema/article/view/3320
https://parc.ipp.pt/index.php/Polissema/article/view/3320/1304
dc.rights.driver.fl_str_mv Copyright (c) 2006 POLISSEMA – Revista de Letras do ISCAP
info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Instituto Superior de Contabilidade e Administração do Porto
dc.source.none.fl_str_mv POLISSEMA – ISCAP Journal of Letters; No. 6 (2006)
POLISSEMA – Revista de Letras do ISCAP; Núm. 6 (2006)
POLISSEMA; No 6 (2006)
POLISSEMA – Revista de Letras do ISCAP; N.º 6 (2006)
2184-710X
1645-1937
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799130474712399872