A corpus-based translation study on english-persian verb phrase ellipsis

Shahabi, Mitra

Bibliographic details
Main author:	Shahabi, Mitra
Publication date:	2011
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text:	http://hdl.handle.net/10400.1/1523
Abstract:	Master's dissertation, Natural Language Processing & Human Language Technology, Faculdade de Ciências Humanas e Sociais, Univ. do Algarve, 2011

Item metadata

id	RCAP_34bd7e3e8676e2d7896da3429982b7a2
oai_identifier_str	oai:sapientia.ualg.pt:10400.1/1523
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str	7160
spelling	A corpus-based translation study on english-persian verb phrase ellipsis. Verb phrase ellipsis; English-persian; Descriptive translation studies; Natural language processing. Master's dissertation, Natural Language Processing & Human Language Technology, Faculdade de Ciências Humanas e Sociais, Univ. do Algarve, 2011. The present research adopted a descriptive corpus-based translation approach and focused on the patterns of translation of English Verb Phrase Ellipsis (VPE) into Persian. The goal was to find out how the observed translation behavior can inform improvements to the performance of English-Persian Machine Translation (MT) systems. For this purpose, a bilingual English-Persian parallel corpus was used. It consisted of the subtitles of 1,600 movies, made up of informal conversations, with about 4 million words for each language. Unitex finite-state tools were applied to detect the intended English VPE by defining certain search patterns. The extracted cases of VPE were then compared with their Persian counterparts. Analysis of the Persian translations, and of the strategies the translators used in dealing with English VPE, indicated that where Persian and English show similar VPE constructions, the human translator keeps the translation quite close to the original text, especially by retaining the ellipsis. However, in many cases, the elliptical forms are language-specific. In such cases, it is not possible to keep the ellipsis, and the translator has to render the text in a non-elliptical form to comply with Persian grammatical norms; that is, the gap resulting from VPE in English is usually recovered by the antecedent verb or replaced by a pro-verb. When the two languages present similar constructions, Google Translate (GT) also produces a quite reasonable translation.
However, in cases where Persian does not allow ellipsis, GT fails to recover the gap left by zeroed material in the source text. Auxiliary verbs also pose some specific problems, as GT translates them into light or lexical verbs. The data were analyzed in the following order: VPE after auxiliary verbs; VPE after complementizer 'to'; and VPE in the presence of pro-forms. The results indicate that the human translator dealing with English VPE predominantly adopts the strategy of recovering the zeroed verb from its previous occurrence in discourse. Naturally, in some cases, a pro-verb is used instead of a verb. For light verb constructions in Persian, the tendency is towards retaining the light verb and zeroing the nominal component. For a residual number of cases, the strategies were non-literal. This general behavior, however, depends on the auxiliary verb used in the source language. Differences in the kind of auxiliary verb in English VPE thus have a relevant bearing on the choice of strategies the human translator adopted. For instance, English VPEs occurring after the auxiliaries 'do', 'be', and 'have' cannot be translated into Persian by keeping the ellipsis; therefore, the gap is usually filled by the antecedent verb or a pro-verb. However, if the English sentence carries a VPE after auxiliary 'be' and the sentence is translated into Persian using passive voice, then the VPE can be kept. Persian allows keeping the gap produced by the English VPE when this involves the modal verbs 'can', 'may', or 'must/have to', if they are translated as مجبور بودن (majboor boodan) [OBLIGED+BE/GR]. For English VPEs occurring after infinitival complementizer 'to', the translation mostly fills the gaps with the antecedent verb. For English VPEs with pro-form structures with 'so/too/as well/neither/either', the translation, for the most part, uses pro-forms and thus keeps the ellipsis.
GT produces distorted translations when dealing with English VPEs occurring after tense operators, since it translates these auxiliaries literally, and also because it does not recover the gap resulting from the English elliptical sentence. For VPEs after modal verbs, GT performs quite acceptably, but only after modal 'can'; it fails in dealing with other modal verbs. GT, in all cases, retains the VPE after complementizer 'to'; thus, the output is often unnatural. Finally, in dealing with VPE in the presence of pro-forms, GT mostly produces inadequate translations. As a statistical MT system, GT does not take into consideration the discourse preceding the sentence under processing. Therefore, it seems incapable of recovering the gap induced by English VPE, which results in incorrect translation output in many cases. The comparison between the human translation (HT) and the GT output for the Persian texts indicates that a stronger effort should be invested in an anaphora resolution module, particularly for certain English VPE patterns: those after the auxiliary verbs 'do', 'be', 'have', and 'will', and those after complementizer 'to'.
The findings of this study may help devise better performing strategies for English-Persian MT systems, namely, by highlighting the relevance of an anaphora resolution module prior to the MT of this language pair. Baptista, Jorge; Evans, Richard J.; Sapientia; Shahabi, Mitra; 2012-07-24T17:29:40Z; 2011; 2011-01-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; application/pdf; application/pdf; http://hdl.handle.net/10400.1/1523; eng; 82'255 SHA*Cor Cave; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2023-07-24T10:12:37Z; oai:sapientia.ualg.pt:10400.1/1523; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T19:55:42.132409; Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false
dc.title.none.fl_str_mv	A corpus-based translation study on english-persian verb phrase ellipsis
title	A corpus-based translation study on english-persian verb phrase ellipsis
spellingShingle	A corpus-based translation study on english-persian verb phrase ellipsis; Shahabi, Mitra; Verb phrase ellipsis; English-persian; Descriptive translation studies; Natural language processing
title_short	A corpus-based translation study on english-persian verb phrase ellipsis
title_full	A corpus-based translation study on english-persian verb phrase ellipsis
title_fullStr	A corpus-based translation study on english-persian verb phrase ellipsis
title_full_unstemmed	A corpus-based translation study on english-persian verb phrase ellipsis
title_sort	A corpus-based translation study on english-persian verb phrase ellipsis
author	Shahabi, Mitra
author_facet	Shahabi, Mitra
author_role	author
dc.contributor.none.fl_str_mv	Baptista, Jorge; Evans, Richard J.; Sapientia
dc.contributor.author.fl_str_mv	Shahabi, Mitra
dc.subject.por.fl_str_mv	Verb phrase ellipsis; English-persian; Descriptive translation studies; Natural language processing
topic	Verb phrase ellipsis; English-persian; Descriptive translation studies; Natural language processing
description	Master's dissertation, Natural Language Processing & Human Language Technology, Faculdade de Ciências Humanas e Sociais, Univ. do Algarve, 2011
publishDate	2011
dc.date.none.fl_str_mv	2011 2011-01-01T00:00:00Z 2012-07-24T17:29:40Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10400.1/1523
url	http://hdl.handle.net/10400.1/1523
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	82'255 SHA*Cor Cave
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf application/pdf application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799133160338882560
