Tone Mark Restoration in Standard Yorùbá Text: A Proposal

Bibliographic details
Main author: Asahiah, Franklin Oladiipo
Publication date: 2017
Other authors: Ọdẹ́jọbí, Ọdẹtúnjí Àjàdì; Adagunodo, Emmanuel Rotimi; Olubode-Sawe, Funmi F.
Document type: Article
Language: eng
Source title: INFOCOMP: Jornal de Ciência da Computação
Full text: https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/529
Abstract: Diacritic restoration has for the most part relied on either the letter (grapheme) or the space-delimited linguistic block, often referred to as the word, as the lexical focus item. The use of the letter for Yorùbá text has usually been attributed to resource scarcity and to the language independence of the underlying model. On the other hand, the lack of sufficient contextual information when restoring tone marks from letters has been cited as the reason for the limited performance of letter-based models. Another study therefore proposed the word as the lexical token for restoring tone marks in Yorùbá text. The results of that word-based approach did not indicate any improvement over the letter-based approach, despite a larger training set; this may be due to the resource-scarcity problem. In this paper, we therefore propose an alternative approach that is expected to address the twin challenges of resource scarcity and contextual insufficiency for tone mark restoration in Yorùbá text in particular and in resource-scarce tone languages in general. The approach is also expected to be linguistically sensible: it relates the tone mark restoration task and the orthographic function of tone marks in the text to the positioning of tone within the linguistic units of the language. We propose tone mark restoration for Yorùbá text that uses the syllable as the lexical focus, or simply syllable-based tone mark restoration for Yorùbá text.
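The abstract does not describe a restoration model, so the following is only a minimal Python sketch of what treating the syllable as the lexical focus could look like at the data-preparation stage. It assumes (this is not taken from the record) that Yorùbá syllables are V, CV, or a syllabic nasal, that "gb" is a single consonant, and that a plain "n" after a vowel and before a consonant or word end marks nasalisation. The function names `strip_tone_marks`, `graphemes`, and `syllabify` are hypothetical.

```python
import unicodedata

# Assumed tone diacritics: grave (low), acute (high), macron (mid).
# The dot below (U+0323) is part of the letter and is deliberately kept.
TONE_MARKS = {"\u0300", "\u0301", "\u0304"}
VOWELS = set("aeiou")


def strip_tone_marks(text):
    """Remove tone marks but keep under-dots (simulates unmarked input)."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def graphemes(word):
    """Group each base character with the combining marks that follow it."""
    units = []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) and units:
            units[-1] += ch
        else:
            units.append(ch)
    return units


def syllabify(word):
    """Heuristic split of one word into V, CV, and syllabic-nasal units."""
    g = graphemes(word)
    base = [u[0].lower() for u in g]
    syllables, i, n = [], 0, len(g)
    while i < n:
        start = i
        if base[i] not in VOWELS:
            # Consume the onset; "gb" is treated as one consonant.
            if g[i].lower() == "g" and i + 1 < n and base[i + 1] == "b":
                i += 2
            else:
                i += 1
            if i >= n or base[i] not in VOWELS:
                # No vowel follows: emit the unit alone (e.g. syllabic nasal).
                syllables.append("".join(g[start:i]))
                continue
        i += 1  # consume the vowel nucleus
        # A plain "n" not followed by a vowel is read as nasalisation.
        if i < n and g[i].lower() == "n" and (i + 1 == n or base[i + 1] not in VOWELS):
            i += 1
        syllables.append("".join(g[start:i]))
    return [unicodedata.normalize("NFC", s) for s in syllables]


if __name__ == "__main__":
    # Each pair is (unmarked syllable, tone-marked syllable), i.e. the kind
    # of input/target pair a syllable-based restorer might be trained on.
    for syllable in syllabify("Yorùbá"):
        print(strip_tone_marks(syllable), syllable)
```

Working on decomposed (NFD) text keeps the tone marks separable from the under-dotted letters, so the unmarked form can be derived from any tone-marked corpus. The syllabifier is a heuristic for illustration only; loanwords and ambiguous nasals would need more careful handling.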
id UFLA-5_ad9cd21c032fc184c6b3985c1cc1a4d4
oai_identifier_str oai:infocomp.dcc.ufla.br:article/529
network_acronym_str UFLA-5
network_name_str INFOCOMP: Jornal de Ciência da Computação
dc.subject.por.fl_str_mv syllable
tone mark
restore
publishDate 2017
dc.date.none.fl_str_mv 2017-12-04
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
dc.relation.none.fl_str_mv https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/529/492
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Editora da UFLA
dc.source.none.fl_str_mv INFOCOMP Journal of Computer Science; Vol. 16 No. 1-2 (2017): June-December 2017; 8-19
1982-3363
1807-4545
reponame:INFOCOMP: Jornal de Ciência da Computação
instname:Universidade Federal de Lavras (UFLA)
instacron:UFLA
repository.name.fl_str_mv INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA)
repository.mail.fl_str_mv infocomp@dcc.ufla.br||apfreire@dcc.ufla.br
_version_ 1799874742158622720