An automatic model and Gold Standard for translation alignment of Ancient Greek
Main author: | Yousef, Tariq |
---|---|
Publication date: | 2022 |
Other authors: | Palladino, Chiara; Shamsian, Farnoosh; D'Orange Ferreira, Anise [UNESP]; dos Reis, Michel Ferreira [UNESP] |
Document type: | Conference paper |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://hdl.handle.net/11449/246506 |
Abstract: | This paper illustrates a workflow for developing and evaluating automatic translation alignment models for Ancient Greek. We designed an annotation Style Guide and a gold standard for the alignment of Ancient Greek-English and Ancient Greek-Portuguese, measured inter-annotator agreement and used the resulting dataset to evaluate the performance of various translation alignment models. We proposed a fine-tuning strategy that employs unsupervised training with mono- and bilingual texts and supervised training using manually aligned sentences. The results indicate that the fine-tuned model based on XLM-Roberta is superior in performance, and it achieved good results on language pairs that were not part of the training data. |
id |
UNSP_f6273dd8913c07d310d4c5a79caaabf9 |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/246506 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
An automatic model and Gold Standard for translation alignment of Ancient Greek; Alignment Guidelines; Ancient Greek; Gold Standard; Translation Alignment; Alignment guideline; Ancient Greeks; Automatic modeling; Automatic translation; Fine tuning; Gold standards; Performance; Style guides; Translation alignment; Work-flows; This paper illustrates a workflow for developing and evaluating automatic translation alignment models for Ancient Greek. We designed an annotation Style Guide and a gold standard for the alignment of Ancient Greek-English and Ancient Greek-Portuguese, measured inter-annotator agreement and used the resulting dataset to evaluate the performance of various translation alignment models. We proposed a fine-tuning strategy that employs unsupervised training with mono- and bilingual texts and supervised training using manually aligned sentences. The results indicate that the fine-tuned model based on XLM-Roberta is superior in performance, and it achieved good results on language pairs that were not part of the training data.; Higher Education Discipline Innovation Project; University of Leipzig, Augustusplatz 10; Furman University, 3300 Poinsett Highway; Universidade Estadual Paulista (UNESP), Rod. Araraquara-Jaú Km 1 - Bairro dos Machados, SP, Machados; Universidade Estadual Paulista (UNESP), Rod. Araraquara-Jaú Km 1 - Bairro dos Machados, SP, Machados; Universidade de São Paulo (USP); Furman University; Universidade Estadual Paulista (UNESP); Yousef, Tariq; Palladino, Chiara; Shamsian, Farnoosh; D'Orange Ferreira, Anise [UNESP]; dos Reis, Michel Ferreira [UNESP]; 2023-07-29T12:42:49Z; 2023-07-29T12:42:49Z; 2022-01-01; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/conferenceObject; 5894-5905; 2022 Language Resources and Evaluation Conference, LREC 2022, p. 5894-5905; http://hdl.handle.net/11449/246506; 2-s2.0-85144450963; Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; 2022 Language Resources and Evaluation Conference, LREC 2022; info:eu-repo/semantics/openAccess; 2023-07-29T12:42:49Z; oai:repositorio.unesp.br:11449/246506; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-08-05T22:07:20.673453; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
dc.title.none.fl_str_mv |
An automatic model and Gold Standard for translation alignment of Ancient Greek |
title |
An automatic model and Gold Standard for translation alignment of Ancient Greek |
spellingShingle |
An automatic model and Gold Standard for translation alignment of Ancient Greek; Yousef, Tariq; Alignment Guidelines; Ancient Greek; Gold Standard; Translation Alignment; Alignment guideline; Ancient Greeks; Automatic modeling; Automatic translation; Fine tuning; Gold standards; Performance; Style guides; Translation alignment; Work-flows |
title_short |
An automatic model and Gold Standard for translation alignment of Ancient Greek |
title_full |
An automatic model and Gold Standard for translation alignment of Ancient Greek |
title_fullStr |
An automatic model and Gold Standard for translation alignment of Ancient Greek |
title_full_unstemmed |
An automatic model and Gold Standard for translation alignment of Ancient Greek |
title_sort |
An automatic model and Gold Standard for translation alignment of Ancient Greek |
author |
Yousef, Tariq |
author_facet |
Yousef, Tariq; Palladino, Chiara; Shamsian, Farnoosh; D'Orange Ferreira, Anise [UNESP]; dos Reis, Michel Ferreira [UNESP] |
author_role |
author |
author2 |
Palladino, Chiara; Shamsian, Farnoosh; D'Orange Ferreira, Anise [UNESP]; dos Reis, Michel Ferreira [UNESP] |
author2_role |
author author author author |
dc.contributor.none.fl_str_mv |
Universidade de São Paulo (USP); Furman University; Universidade Estadual Paulista (UNESP) |
dc.contributor.author.fl_str_mv |
Yousef, Tariq; Palladino, Chiara; Shamsian, Farnoosh; D'Orange Ferreira, Anise [UNESP]; dos Reis, Michel Ferreira [UNESP] |
dc.subject.por.fl_str_mv |
Alignment Guidelines; Ancient Greek; Gold Standard; Translation Alignment; Alignment guideline; Ancient Greeks; Automatic modeling; Automatic translation; Fine tuning; Gold standards; Performance; Style guides; Translation alignment; Work-flows |
topic |
Alignment Guidelines; Ancient Greek; Gold Standard; Translation Alignment; Alignment guideline; Ancient Greeks; Automatic modeling; Automatic translation; Fine tuning; Gold standards; Performance; Style guides; Translation alignment; Work-flows |
description |
This paper illustrates a workflow for developing and evaluating automatic translation alignment models for Ancient Greek. We designed an annotation Style Guide and a gold standard for the alignment of Ancient Greek-English and Ancient Greek-Portuguese, measured inter-annotator agreement and used the resulting dataset to evaluate the performance of various translation alignment models. We proposed a fine-tuning strategy that employs unsupervised training with mono- and bilingual texts and supervised training using manually aligned sentences. The results indicate that the fine-tuned model based on XLM-Roberta is superior in performance, and it achieved good results on language pairs that were not part of the training data. |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-01-01; 2023-07-29T12:42:49Z; 2023-07-29T12:42:49Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/conferenceObject |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
2022 Language Resources and Evaluation Conference, LREC 2022, p. 5894-5905; http://hdl.handle.net/11449/246506; 2-s2.0-85144450963 |
identifier_str_mv |
2022 Language Resources and Evaluation Conference, LREC 2022, p. 5894-5905; 2-s2.0-85144450963 |
url |
http://hdl.handle.net/11449/246506 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
2022 Language Resources and Evaluation Conference, LREC 2022 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
5894-5905 |
dc.source.none.fl_str_mv |
Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv |
|
_version_ |
1808129394072354816 |