Bilingual terminology extraction based on translation patterns

Bibliographic details
Main author: Simões, Alberto
Publication date: 2008
Other authors: Almeida, J. J.
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/1822/14378
Abstract: Parallel corpora are rich sources of translation resources. This document presents a methodology for extracting bilingual nominals (terminology candidates) from parallel corpora using translation patterns. The patterns proposed in this work specify the word-order changes that occur during translation and that are intrinsic to the syntax of the languages involved. These patterns are described in a domain-specific language named PDL (Pattern Description Language), and are extremely efficient for the detection of nominal phrases.
id RCAP_240454c78fa0d0669ca8417cc6e57ad8
oai_identifier_str oai:repositorium.sdum.uminho.pt:1822/14378
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
Funding: Alberto Simões has a scholarship from Fundação para a Computação Científica Nacional, and the work reported here has been partially funded by Fundação para a Ciência e Tecnologia through project POSI/PLP/43931/2001, co-financed by POSI, and by POSC project POSC/339/1.3/C/NAC.
dc.title.none.fl_str_mv Bilingual terminology extraction based on translation patterns
dc.contributor.none.fl_str_mv Universidade do Minho
dc.contributor.author.fl_str_mv Simões, Alberto
Almeida, J. J.
dc.subject.por.fl_str_mv Terminology extraction
Parallel corpora
Information extraction
Translation resources
Machine translation
Corpora paralelos
Extracción de información
Recursos de traducción
Traducción automática
Social Sciences
publishDate 2008
dc.date.none.fl_str_mv 2008-09
2008-09-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1822/14378
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv 1135-5948
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN)
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP