Tone Mark Restoration in Standard Yorùbá Text: A Proposal
Main author: | Asahiah, Franklin Oladiipo |
---|---|
Publication date: | 2017 |
Other authors: | Ọdẹ́jọbí, Ọdẹtúnjí Àjàdì; Adagunodo, Emmanuel Rotimi; Olubode-Sawe, Funmi F. |
Document type: | Article |
Language: | eng |
Source title: | INFOCOMP: Jornal de Ciência da Computação |
Full text: | https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/529 |
Abstract: | Diacritic restoration has for the most part relied on either the letter (grapheme) or the space-delimited linguistic block often referred to as the word as the lexical focus item. The use of the letter for Yorùbá text has often been attributed to resource scarcity and to the underlying model being language independent. On the other hand, the lack of sufficient contextual information when restoring tone marks from letters has been cited as the reason for the limited performance of letter-based models. Another study therefore proposed the word as the lexical token for restoring tone marks in Yorùbá text. The results of this existing word-based approach did not indicate any improvement over the letter-based approach despite a larger training data set. This may be due to the resource-scarcity problem. In this paper, we therefore propose an alternative approach that is expected to address the twin challenges of resource scarcity and contextual insufficiency for tone mark restoration in Yorùbá text in particular and in resource-scarce tone languages in general. This approach is also expected to be linguistically sensible: it relates the tone mark restoration task to the orthographic function of tone marks in the text and to the positioning of tone within the linguistic units of the language. We propose tone mark restoration for Yorùbá text that uses syllables as the lexical focus, or simply syllable-based tone mark restoration for Yorùbá text. |
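To make the three candidate units in the abstract concrete (letter, word, syllable), the sketch below strips tone marks from a Yorùbá word and splits it into syllables, then pairs each unmarked syllable with its marked form. It is an illustration only, not the authors' method: the abstract does not describe a model or a syllabification procedure. The function names `strip_tone_marks` and `syllabify` are hypothetical, and the syllabifier assumes a simplified view of Yorùbá orthography (syllables of the form V, CV, or a syllabic nasal; `gb` as a single consonant; a trailing `n` not followed by a vowel read as nasalisation of the preceding vowel).

```python
import unicodedata

# Combining marks treated as tone marks in this sketch: grave (low tone),
# acute (high tone) and macron (mid tone, only occasionally written).
# The dot below (U+0323) belongs to the letter (ẹ, ọ, ṣ) and is kept.
TONE_MARKS = {"\u0300", "\u0301", "\u0304"}
VOWELS = set("aeiou")


def strip_tone_marks(text: str) -> str:
    """Remove tone marks but keep under-dots, returning NFC text."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def _clusters(word: str) -> list[str]:
    """Group each base letter with the combining marks that follow it."""
    out: list[str] = []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) and out:
            out[-1] += ch
        else:
            out.append(ch)
    return out


def _base(cluster: str) -> str:
    return cluster[0].lower()


def _is_vowel(cluster: str) -> bool:
    return _base(cluster) in VOWELS


def syllabify(word: str) -> list[str]:
    """Split one word into V, CV or syllabic-nasal units (simplified)."""
    cl = _clusters(word)
    n = len(cl)
    syllables: list[str] = []
    i = 0
    while i < n:
        start = i
        if _is_vowel(cl[i]):
            i += 1  # V syllable
        else:
            # Treat the digraph "gb" as a single onset consonant.
            if _base(cl[i]) == "g" and i + 1 < n and _base(cl[i + 1]) == "b":
                i += 1
            i += 1  # consume the onset (or a syllabic nasal standing alone)
            if i < n and _is_vowel(cl[i]):
                i += 1  # CV syllable
        # Nasalised vowel: attach "n" when no vowel follows it.
        if (_is_vowel(cl[i - 1]) and i < n and _base(cl[i]) == "n"
                and (i + 1 == n or not _is_vowel(cl[i + 1]))):
            i += 1
        syllables.append(unicodedata.normalize("NFC", "".join(cl[start:i])))
    return syllables


# Each pair is (unmarked syllable, tone-marked syllable).
pairs = [(strip_tone_marks(s), s) for s in syllabify("Yorùbá")]
print(pairs)  # [('Yo', 'Yo'), ('ru', 'rù'), ('ba', 'bá')]
```

In a syllable-based setting, pairs like these would be the items a restoration model maps between; a real syllabifier would also need to handle punctuation, hyphenated words and loanwords, which this sketch ignores.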
id |
UFLA-5_ad9cd21c032fc184c6b3985c1cc1a4d4 |
---|---|
oai_identifier_str |
oai:infocomp.dcc.ufla.br:article/529 |
network_acronym_str |
UFLA-5 |
network_name_str |
INFOCOMP: Jornal de Ciência da Computação |
repository_id_str |
|
dc.title.none.fl_str_mv |
Tone Mark Restoration in Standard Yorùbá Text: A Proposal |
title |
Tone Mark Restoration in Standard Yorùbá Text: A Proposal |
spellingShingle |
Tone Mark Restoration in Standard Yorùbá Text: A Proposal Asahiah, Franklin Oladiipo syllable tone mark restore |
title_short |
Tone Mark Restoration in Standard Yorùbá Text: A Proposal |
title_full |
Tone Mark Restoration in Standard Yorùbá Text: A Proposal |
title_fullStr |
Tone Mark Restoration in Standard Yorùbá Text: A Proposal |
title_full_unstemmed |
Tone Mark Restoration in Standard Yorùbá Text: A Proposal |
title_sort |
Tone Mark Restoration in Standard Yorùbá Text: A Proposal |
author |
Asahiah, Franklin Oladiipo |
author_facet |
Asahiah, Franklin Oladiipo; Ọdẹ́jọbí, Ọdẹtúnjí Àjàdì; Adagunodo, Emmanuel Rotimi; Olubode-Sawe, Funmi F. |
author_role |
author |
author2 |
Ọdẹ́jọbí, Ọdẹtúnjí Àjàdì; Adagunodo, Emmanuel Rotimi; Olubode-Sawe, Funmi F. |
author2_role |
author author author |
dc.contributor.author.fl_str_mv |
Asahiah, Franklin Oladiipo; Ọdẹ́jọbí, Ọdẹtúnjí Àjàdì; Adagunodo, Emmanuel Rotimi; Olubode-Sawe, Funmi F. |
dc.subject.por.fl_str_mv |
syllable; tone mark; restore |
topic |
syllable; tone mark; restore |
publishDate |
2017 |
dc.date.none.fl_str_mv |
2017-12-04 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/529 |
url |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/529 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/529/492 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Editora da UFLA |
publisher.none.fl_str_mv |
Editora da UFLA |
dc.source.none.fl_str_mv |
INFOCOMP Journal of Computer Science; Vol. 16 No. 1-2 (2017): June-December 2017; 8-19; 1982-3363; 1807-4545; reponame:INFOCOMP: Jornal de Ciência da Computação; instname:Universidade Federal de Lavras (UFLA); instacron:UFLA |
instname_str |
Universidade Federal de Lavras (UFLA) |
instacron_str |
UFLA |
institution |
UFLA |
reponame_str |
INFOCOMP: Jornal de Ciência da Computação |
collection |
INFOCOMP: Jornal de Ciência da Computação |
repository.name.fl_str_mv |
INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA) |
repository.mail.fl_str_mv |
infocomp@dcc.ufla.br||apfreire@dcc.ufla.br |
_version_ |
1799874742158622720 |