A corpus-based translation study on english-persian verb phrase ellipsis

Bibliographic details
Main author: Shahabi, Mitra
Publication date: 2011
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10400.1/1523
Abstract: Master's dissertation, Natural Language Processing & Human Language Technology, Faculdade de Ciências Humanas e Sociais, Univ. do Algarve, 2011
id RCAP_34bd7e3e8676e2d7896da3429982b7a2
oai_identifier_str oai:sapientia.ualg.pt:10400.1/1523
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling A corpus-based translation study on english-persian verb phrase ellipsis
Verb phrase ellipsis
English-persian
Descriptive translation studies
Natural language processing
Master's dissertation, Natural Language Processing & Human Language Technology, Faculdade de Ciências Humanas e Sociais, Univ. do Algarve, 2011
The present research adopted a descriptive corpus-based translation approach and focused on the patterns of translation of English Verb Phrase Ellipsis (VPE) into Persian. The goal was to find out how the observed translation behavior could be used to improve the performance of English-Persian Machine Translation (MT) systems. For this purpose, a bilingual English-Persian parallel corpus was used. It consisted of the subtitles of 1,600 movies, made up of informal conversations, with about 4 million words for each language. Unitex finite-state tools were applied to detect the intended English VPE by defining certain search patterns. The extracted cases of VPE were then compared with their Persian counterparts. Analysis of the Persian translations, and of the strategies the translators used in dealing with English VPE, indicated that where Persian and English show similar VPE constructions, the human translator keeps the translation quite close to the original text, especially by retaining the ellipsis. However, in many cases, the elliptical forms are language-specific. In such cases, it is not possible to keep the ellipsis, and the translator has to render the text in a non-elliptical form in order to comply with Persian grammatical norms; that is, the gap resulting from VPE in English is usually recovered by the antecedent verb or replaced by a pro-verb. When the two languages present similar constructions, Google Translate (GT) also produces a quite reasonable translation.
However, in cases where Persian does not allow ellipsis, GT fails to recover the gap left by zeroed material in the source text. Auxiliary verbs also pose some specific problems, as GT translates them into light or lexical verbs. The data were analyzed in the following order: VPE after auxiliary verbs; VPE after the complementizer 'to'; and VPE in the presence of pro-forms. The results indicate that the human translator dealing with English VPE predominantly adopts the strategy of recovering the zeroed verb from its previous occurrence in the discourse. Naturally, in some cases, a pro-verb is used instead of a verb. For light verb constructions in Persian, the tendency is towards retaining the light verb and zeroing the nominal component. For a residual number of cases, the strategies were non-literal. This general behavior, however, depends on the auxiliary verb used in the source language. Differences in the kind of auxiliary verb in the English VPE thus have a relevant bearing on the choice of strategy the human translator adopts. For instance, English VPEs occurring after the auxiliaries 'do', 'be', and 'have' cannot be translated into Persian by keeping the ellipsis; therefore, the gap is usually filled by the antecedent verb or a pro-verb. However, if the English sentence carries a VPE after the auxiliary 'be' and the sentence is translated into Persian using the passive voice, then the VPE can be kept. Persian allows keeping the gap produced by the English VPE when this involves the modal verbs 'can', 'may', or 'must/have to' (if they are translated as مجبور بودن (majboor boodan) [OBLIGED+BE/GR]). In the case of English VPEs occurring after the infinitival complementizer 'to', the translation mostly fills the gaps with the antecedent verb. For English VPEs with pro-form structures with 'so/too/as well/neither/either', the translation, for the most part, uses pro-forms and thus keeps the ellipsis.
GT produces distorted translations when dealing with English VPEs occurring after tense operators, since it translates these auxiliaries literally, and also because it does not recover the gap resulting from the English elliptical sentence. For VPEs after modal verbs, GT performs quite acceptably, but only after the modal 'can'; it fails in dealing with other modal verbs. GT, in all cases, retains the VPE after the complementizer 'to'; thus, the output is often unnatural. Finally, in dealing with VPE in the presence of pro-forms, GT mostly produces inadequate translations. As a statistical MT system, GT does not take into consideration the discourse preceding the sentence under processing. Therefore, it seems incapable of recovering the gap induced by English VPE, which results in incorrect translation output in many cases. The comparison between the human translation (HT) and the GT output in Persian indicates that a stronger effort should be invested in an anaphora resolution module, particularly for certain English VPE patterns: those after the auxiliary verbs 'do', 'be', 'have', and 'will', and those after the complementizer 'to'.
The findings of this study may help devise better-performing strategies for English-Persian MT systems, namely by highlighting the relevance of an anaphora resolution module prior to the MT of this language pair.
Baptista, Jorge
Evans, Richard J.
Sapientia
Shahabi, Mitra
2012-07-24T17:29:40Z
2011
2011-01-01T00:00:00Z
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/masterThesis
application/pdf
application/pdf
application/pdf
http://hdl.handle.net/10400.1/1523
eng
82'255 SHA*Cor Cave
info:eu-repo/semantics/openAccess
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
2023-07-24T10:12:37Z
oai:sapientia.ualg.pt:10400.1/1523
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
opendoar:7160
2024-03-19T19:55:42.132409
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
false
dc.title.none.fl_str_mv A corpus-based translation study on english-persian verb phrase ellipsis
title A corpus-based translation study on english-persian verb phrase ellipsis
spellingShingle A corpus-based translation study on english-persian verb phrase ellipsis
Shahabi, Mitra
Verb phrase ellipsis
English-persian
Descriptive translation studies
Natural language processing
title_short A corpus-based translation study on english-persian verb phrase ellipsis
title_full A corpus-based translation study on english-persian verb phrase ellipsis
title_fullStr A corpus-based translation study on english-persian verb phrase ellipsis
title_full_unstemmed A corpus-based translation study on english-persian verb phrase ellipsis
title_sort A corpus-based translation study on english-persian verb phrase ellipsis
author Shahabi, Mitra
author_facet Shahabi, Mitra
author_role author
dc.contributor.none.fl_str_mv Baptista, Jorge
Evans, Richard J.
Sapientia
dc.contributor.author.fl_str_mv Shahabi, Mitra
dc.subject.por.fl_str_mv Verb phrase ellipsis
English-persian
Descriptive translation studies
Natural language processing
topic Verb phrase ellipsis
English-persian
Descriptive translation studies
Natural language processing
description Master's dissertation, Natural Language Processing & Human Language Technology, Faculdade de Ciências Humanas e Sociais, Univ. do Algarve, 2011
publishDate 2011
dc.date.none.fl_str_mv 2011
2011-01-01T00:00:00Z
2012-07-24T17:29:40Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.1/1523
url http://hdl.handle.net/10400.1/1523
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 82'255 SHA*Cor Cave
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
application/pdf
application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799133160338882560