Bilingual terminology extraction based on translation patterns
Main author: | Simões, Alberto |
---|---|
Publication date: | 2008 |
Other authors: | Almeida, J. J. |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/1822/14378 |
Abstract: | Parallel corpora are rich sources of translation resources. This document presents a methodology for extracting bilingual nominals (terminology candidates) from parallel corpora using translation patterns. The patterns proposed in this work specify the word-order changes that occur during translation and that are intrinsic to the syntax of the languages involved. These patterns are described in a domain-specific language named PDL (Pattern Description Language) and are extremely efficient for the detection of nominal phrases. |
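
The idea of a translation pattern that encodes a word-order change can be illustrated with a minimal sketch. The code below is not PDL and is not the authors' implementation: the function `extract_candidates`, the permutation-style pattern encoding, and the toy lexicon are all hypothetical, introduced only to show how a pattern such as "A B translates as B A" could pick out bilingual nominal candidates from an aligned sentence pair.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical pattern encoding (an assumption, not PDL syntax):
# pattern[k] is the source offset whose translation appears at target offset k.
# (1, 0) therefore means "source A B corresponds to target B A".
Pattern = Tuple[int, ...]


def extract_candidates(
    source: List[str],
    target: List[str],
    lexicon: Dict[str, Set[str]],
    pattern: Pattern,
) -> List[Tuple[str, str]]:
    """Return (source phrase, target phrase) pairs whose words match the pattern."""
    n = len(pattern)
    candidates = []
    for i in range(len(source) - n + 1):
        for j in range(len(target) - n + 1):
            # Every target word in the window must be a known translation
            # of the source word that the pattern maps to that position.
            if all(
                target[j + k] in lexicon.get(source[i + pattern[k]], set())
                for k in range(n)
            ):
                candidates.append(
                    (" ".join(source[i:i + n]), " ".join(target[j:j + n]))
                )
    return candidates


# Toy data, invented for illustration only.
toy_lexicon = {
    "a": {"the"},
    "união": {"union"},
    "europeia": {"european"},
    "cresce": {"grows"},
}
pt = ["a", "união", "europeia", "cresce"]
en = ["the", "european", "union", "grows"]

# Swap pattern: Portuguese noun + adjective vs. English adjective + noun.
print(extract_candidates(pt, en, toy_lexicon, (1, 0)))
```

A plain set-based lexicon stands in here for whatever word-alignment evidence a real system would use; the sketch only shows how a declared reordering narrows the search to phrase pairs with the expected shape.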
id |
RCAP_240454c78fa0d0669ca8417cc6e57ad8 |
---|---|
oai_identifier_str |
oai:repositorium.sdum.uminho.pt:1822/14378 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Bilingual terminology extraction based on translation patterns Terminology extraction Parallel corpora Information extraction Translation resources Machine translation Corpora paralelos Extracción de información Recursos de traduccion Traducción automática Social Sciences Parallel corpora are rich sources of translation resources. This document presents a methodology for the extraction of bilingual nominals (terminology candidates) from parallel corpora, using translation patterns. The patterns proposed in this work specify the order changes that occur during translation and that are intrinsic to the involved languages syntaxes. These patterns are described in a domain specific language named PDL (Pattern Description Language), and are extremely efficient for the detection of nominal phrases. Los corpora paralelos son fuentes ricas en recursos de traducción. Este documento presenta una metodología para la extracción de sintagmas nominales bilingües (candidatos terminológicos) a partir de corpora paralelos, utilizando reglas de traducción. Los modelos propuestos en este trabajo especifican las alteraciones en el orden de las palabras que se producen durante la traducción y que son intrínsecos a la sintaxis de las lenguas implicadas. Estas reglas se describen en un lenguaje de dominio específico llamado PDL (Pattern Description Language) y son sumamente eficientes para la detección de sintagmas nominales. Alberto Simoes has a scholarship from Fundacao para a Computacao Cientifica Nacional and the work reported here has been partially funded by Fundacao para a Ciencia e Tecnologia through project POSI/PLP/43931/2001, co-financed by POSI, and by POSC project POSC/339/1.3/C/NAC. Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN) Universidade do Minho Simões, Alberto Almeida, J. J. 2008-09 2008-09-01T00:00:00Z info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/article application/pdf http://hdl.handle.net/1822/14378 eng 1135-5948 info:eu-repo/semantics/openAccess reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP 2023-07-21T12:06:17Z oai:repositorium.sdum.uminho.pt:1822/14378 Portal Agregador ONG https://www.rcaap.pt/oai/openaire opendoar:7160 2024-03-19T18:56:54.876426 Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação false |
dc.title.none.fl_str_mv |
Bilingual terminology extraction based on translation patterns |
title |
Bilingual terminology extraction based on translation patterns |
spellingShingle |
Bilingual terminology extraction based on translation patterns Simões, Alberto Terminology extraction Parallel corpora Information extraction Translation resources Machine translation Corpora paralelos Extracción de información Recursos de traduccion Traducción automática Social Sciences |
title_short |
Bilingual terminology extraction based on translation patterns |
title_full |
Bilingual terminology extraction based on translation patterns |
title_fullStr |
Bilingual terminology extraction based on translation patterns |
title_full_unstemmed |
Bilingual terminology extraction based on translation patterns |
title_sort |
Bilingual terminology extraction based on translation patterns |
author |
Simões, Alberto |
author_facet |
Simões, Alberto Almeida, J. J. |
author_role |
author |
author2 |
Almeida, J. J. |
author2_role |
author |
dc.contributor.none.fl_str_mv |
Universidade do Minho |
dc.contributor.author.fl_str_mv |
Simões, Alberto Almeida, J. J. |
dc.subject.por.fl_str_mv |
Terminology extraction Parallel corpora Information extraction Translation resources Machine translation Corpora paralelos Extracción de información Recursos de traduccion Traducción automática Social Sciences |
topic |
Terminology extraction Parallel corpora Information extraction Translation resources Machine translation Corpora paralelos Extracción de información Recursos de traduccion Traducción automática Social Sciences |
description |
Parallel corpora are rich sources of translation resources. This document presents a methodology for extracting bilingual nominals (terminology candidates) from parallel corpora using translation patterns. The patterns proposed in this work specify the word-order changes that occur during translation and that are intrinsic to the syntax of the languages involved. These patterns are described in a domain-specific language named PDL (Pattern Description Language) and are extremely efficient for the detection of nominal phrases. |
publishDate |
2008 |
dc.date.none.fl_str_mv |
2008-09 2008-09-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/1822/14378 |
url |
http://hdl.handle.net/1822/14378 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
1135-5948 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN) |
publisher.none.fl_str_mv |
Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN) |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799132357595234304 |