Multidimensional strategy for the selection of machine translation candidates for post-editing

Bibliographic details
Main author: Aranberri, Nora
Publication date: 2020
Other authors: de Gibert, Ona
Document type: Article
Language: por
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: https://doi.org/10.21814/lm.11.2.277
Abstract: An efficient integration of a machine translation (MT) system within a translation flow entails the need to distinguish between sentences that benefit from MT and those that do not before they are presented to the translator. In this work, we question the separate use of Krings' (2001) post-editing effort dimensions to classify sentences as suitable for translation or for post-editing when training prediction models, and propose a multidimensional strategy instead. We collect measurements of three effort parameters, namely time, number of post-edited words and perception of effort, as representative of the three dimensions (temporal, technical and cognitive) in a real post-editing task. The results show that, although there are correlations between the measurements, the effort parameters differ in the classification of a considerable number of sentences. We conclude that the multidimensional strategy is necessary to estimate the overall post-editing effort.
id RCAP_398ac37291a886a7d37f101b359a206b
oai_identifier_str oai:linguamatica.com:article/277
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Multidimensional strategy for the selection of machine translation candidates for post-editing
Estrategia multidimensional para la selección de candidatos de traducción automática para posedición
title Multidimensional strategy for the selection of machine translation candidates for post-editing
spellingShingle Multidimensional strategy for the selection of machine translation candidates for post-editing
Aranberri, Nora
title_short Multidimensional strategy for the selection of machine translation candidates for post-editing
title_full Multidimensional strategy for the selection of machine translation candidates for post-editing
title_fullStr Multidimensional strategy for the selection of machine translation candidates for post-editing
title_full_unstemmed Multidimensional strategy for the selection of machine translation candidates for post-editing
title_sort Multidimensional strategy for the selection of machine translation candidates for post-editing
author Aranberri, Nora
author_facet Aranberri, Nora
de Gibert, Ona
author_role author
author2 de Gibert, Ona
author2_role author
dc.contributor.author.fl_str_mv Aranberri, Nora
de Gibert, Ona
description An efficient integration of a machine translation (MT) system within a translation flow entails the need to distinguish between sentences that benefit from MT and those that do not before they are presented to the translator. In this work, we question the separate use of Krings' (2001) post-editing effort dimensions to classify sentences as suitable for translation or for post-editing when training prediction models, and propose a multidimensional strategy instead. We collect measurements of three effort parameters, namely time, number of post-edited words and perception of effort, as representative of the three dimensions (temporal, technical and cognitive) in a real post-editing task. The results show that, although there are correlations between the measurements, the effort parameters differ in the classification of a considerable number of sentences. We conclude that the multidimensional strategy is necessary to estimate the overall post-editing effort.
publishDate 2020
dc.date.none.fl_str_mv 2020-01-04
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://doi.org/10.21814/lm.11.2.277
url https://doi.org/10.21814/lm.11.2.277
dc.language.iso.fl_str_mv por
language por
dc.relation.none.fl_str_mv https://linguamatica.com/index.php/linguamatica/article/view/277
https://linguamatica.com/index.php/linguamatica/article/view/277/456
dc.rights.driver.fl_str_mv Copyright (c) 2019 Ona de Gibert, Nora Aranberri
http://creativecommons.org/licenses/by/4.0
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Copyright (c) 2019 Ona de Gibert, Nora Aranberri
http://creativecommons.org/licenses/by/4.0
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade do Minho e Universidade de Vigo
publisher.none.fl_str_mv Universidade do Minho e Universidade de Vigo
dc.source.none.fl_str_mv Linguamática; Vol. 11 No. 2; 3-16
Linguamática; Vol. 11 Núm. 2; 3-16
Linguamática; v. 11 n. 2; 3-16
1647-0818
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799133553998430208