Translating the Language of Aviation. The Development and Detailed analysis of the English-Bengali Aviation Corpus for Machine translation

Bibliographic details
Main author: Paul, Saptarshi
Publication date: 2022
Document type: Article
Language: eng
Source title: INFOCOMP: Jornal de Ciência da Computação
Full text: https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1966
Abstract: Recent corpus-based transliteration and translation approaches, such as SMT and NMT models, depend entirely on the parallel corpus. It is the corpus that ultimately determines the Translation Accuracy (TA) of the model. With the regular, common domains largely exhausted, current corpus research covers domains ranging from medicine to aeroscience. The work becomes more interesting when Indian languages are taken up, especially in technical domains such as aeronautics and aviation. With corpora for technical domains in English-Indian language pairs such as English-Bengali now emerging, the automatic analysis of such corpora is an aspect that researchers are taking up. Such analysis also helps developers and researchers further improve the quality of the corpus and set new benchmarks for the development of future corpora. This paper deals with the need for, the development of, and a detailed analysis of a bilingual aviation corpus for the English-Bengali language pair.
OAI identifier: oai:infocomp.dcc.ufla.br:article/1966
Date issued: 2022-06-01
Type: article (info:eu-repo/semantics/article), published version
Format: application/pdf
Relation: https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1966/577
Rights: Copyright (c) 2022 Saptarshi Paul; open access
Publisher: Editora da UFLA
Source: INFOCOMP Journal of Computer Science; Vol. 21 No. 1 (2022): June 2022
ISSN: 1982-3363, 1807-4545
Institution: Universidade Federal de Lavras (UFLA)
Repository: INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA)
Repository contact: infocomp@dcc.ufla.br, apfreire@dcc.ufla.br