Assessing complexity and difficulty levels of machine-translated texts

Bibliographic details
Main author: Norma Barbosa de Lima Fonseca
Publication date: 2016
Other authors: Fabio Alves da Silva Junior
Document type: Article
Language: eng
Source title: Repositório Institucional da UFMG
Full text: https://doi.org/10.14393/LL63-v32n1a2016-16
http://hdl.handle.net/1843/49146
https://orcid.org/0000-0002-0207-4789
https://orcid.org/0000-0003-1089-4864
Abstract: This paper proposes a way to assess the complexity and difficulty levels of texts machine-translated into Portuguese that are to be post-edited without access to the source text (monolingual post-editing) in an experimental setting. Using two standard objective parameters, namely readability indexes and word frequency, and proposing post-editors’ perception of the difficulty of comprehending and post-editing machine-translated texts as a new parameter, we sought to select texts with similar levels of textual complexity or difficulty. This selection was necessary to carry out an experiment with four monolingual post-editing tasks in Portuguese involving texts machine-translated from three different source languages (English, Spanish, and Chinese). Applying readability indexes in conjunction with corpus-based word frequency to analyze texts machine-translated into Portuguese for use in experiments proved to be consistent and adequate. This method can also be applied to select texts for use in Portuguese language classrooms and for inclusion in Portuguese language textbooks. The findings can also be applied to the translation classroom, where teachers can use the same methodology to select texts to be translated or post-edited, or encourage students to analyze the texts themselves before performing a task, so that students become aware of the potential effort to be invested in a task or of the real effort invested after performing it. Finally, post-editors’ perception proved to be a sound parameter for validating text selection.
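The two objective parameters named in the abstract can be illustrated with a minimal sketch. The abstract does not say which readability indexes or which reference corpus were used, so everything below is an assumption made for illustration: the score is the standard Flesch Reading Ease formula (calibrated for English, so on Portuguese text it is only meaningful for comparing candidate texts with each other), syllables are approximated by counting runs of vowels, and `frequent_words`, `comparable`, and the two gap thresholds are hypothetical names and values, not the authors' procedure.

```python
import re

# Rough syllable proxy: runs of vowels, including accented Portuguese vowels.
# This is an approximation, not a real syllabifier.
VOWEL_RUN = re.compile(r"[aeiouáéíóúâêôãõàü]+")


def tokenize(text):
    """Return lowercase word tokens made of letters only (accents preserved)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def vowel_groups(word):
    """Approximate the syllable count of a word as its number of vowel runs."""
    return max(1, len(VOWEL_RUN.findall(word.lower())))


def flesch_reading_ease(text):
    """Standard Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = tokenize(text)
    if not sentences or not words:
        raise ValueError("text has no sentences or words")
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(vowel_groups(w) for w in words) / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def rare_word_ratio(text, frequent_words):
    """Share of tokens absent from a set of frequent words.

    `frequent_words` is assumed to come from a reference corpus,
    for example its N most frequent word forms.
    """
    words = tokenize(text)
    if not words:
        raise ValueError("text has no words")
    return sum(w not in frequent_words for w in words) / len(words)


def comparable(text_a, text_b, frequent_words,
               max_score_gap=10.0, max_ratio_gap=0.05):
    """Check whether two texts fall within illustrative similarity thresholds."""
    score_gap = abs(flesch_reading_ease(text_a) - flesch_reading_ease(text_b))
    ratio_gap = abs(rare_word_ratio(text_a, frequent_words)
                    - rare_word_ratio(text_b, frequent_words))
    return score_gap <= max_score_gap and ratio_gap <= max_ratio_gap
```

Calling `comparable(text_a, text_b, frequent_words)` on each pair of candidate machine-translated texts would flag pairs whose scores diverge. The third parameter from the abstract, post-editors’ perceived difficulty, is a human judgment and is not modeled here.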
id UFMG_8edf36445e7ab51cdd05ed57516bbd61
oai_identifier_str oai:repositorio.ufmg.br:1843/49146
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
repository_id_str
dc.title.pt_BR.fl_str_mv Assessing complexity and difficulty levels of machine-translated texts
dc.title.alternative.pt_BR.fl_str_mv Verificando níveis de complexidade e dificuldade de textos traduzidos automaticamente
title Assessing complexity and difficulty levels of machine-translated texts
author Norma Barbosa de Lima Fonseca
author_facet Norma Barbosa de Lima Fonseca
Fabio Alves da Silva Junior
author_role author
author2 Fabio Alves da Silva Junior
author2_role author
dc.contributor.author.fl_str_mv Norma Barbosa de Lima Fonseca
Fabio Alves da Silva Junior
dc.subject.por.fl_str_mv Text complexity and difficulty
Readability indexes
Experimental texts
Machine translation
Monolingual postediting
topic Text complexity and difficulty
Readability indexes
Experimental texts
Machine translation
Monolingual postediting
Tradução mecânica
Interpretação e tradução
dc.subject.other.pt_BR.fl_str_mv Tradução mecânica
Interpretação e tradução
publishDate 2016
dc.date.issued.fl_str_mv 2016-08-21
dc.date.accessioned.fl_str_mv 2023-01-25T18:44:46Z
dc.date.available.fl_str_mv 2023-01-25T18:44:46Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1843/49146
dc.identifier.doi.pt_BR.fl_str_mv https://doi.org/10.14393/LL63-v32n1a2016-16
dc.identifier.issn.pt_BR.fl_str_mv 1981-5239
dc.identifier.orcid.pt_BR.fl_str_mv https://orcid.org/0000-0002-0207-4789
https://orcid.org/0000-0003-1089-4864
url https://doi.org/10.14393/LL63-v32n1a2016-16
http://hdl.handle.net/1843/49146
https://orcid.org/0000-0002-0207-4789
https://orcid.org/0000-0003-1089-4864
identifier_str_mv 1981-5239
dc.language.iso.fl_str_mv eng
language eng
dc.relation.ispartof.pt_BR.fl_str_mv Letras & Letras
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.publisher.initials.fl_str_mv UFMG
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv FALE - FACULDADE DE LETRAS
publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
instname_str Universidade Federal de Minas Gerais (UFMG)
instacron_str UFMG
institution UFMG
reponame_str Repositório Institucional da UFMG
collection Repositório Institucional da UFMG
bitstream.url.fl_str_mv https://repositorio.ufmg.br/bitstream/1843/49146/1/License.txt
https://repositorio.ufmg.br/bitstream/1843/49146/2/Assessing%20complexity%20and%20difficulty%20levels%20of%20machine-translated%20texts.pdf
bitstream.checksum.fl_str_mv fa505098d172de0bc8864fc1287ffe22
a134b04502b9f909deb15288b8656fb3
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
repository.mail.fl_str_mv
_version_ 1803589193768632320