An Alternate Approach for Question Answering system in Bengali Language using Classification Techniques
Main author: | DAS, ARIJIT |
---|---|
Publication date: | 2020 |
Document type: | Article |
Language: | eng |
Source title: | INFOCOMP: Jornal de Ciência da Computação |
Full text: | https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/783 |
Abstract: | Question Answering (QA) systems are becoming more popular with the introduction of virtual agents and chatbots. The medium of a QA system is generally either text or audio. Search engines and QA systems differ. Searching is generally based on keyword matching; in web search, a list of URLs is ranked by location, user history, search preferences, etc., and sophisticated algorithms such as PageRank are also involved. A QA system, on the other hand, does not work primarily on keyword matching: the query and the best answer often have no terms, or only a very small number of terms, in common. QA systems in English and other popular languages resolve this issue with the help of ontologies, WordNet, machine-readable dictionaries, etc. QA systems in low-resource languages suffer from a lack of annotation, the absence of a WordNet, and immature ontologies. In this work, a QA system in Bengali is developed using supervised learning algorithms. A collection of Bengali literature, developed during the TDIL (Technology Development for Indian Languages) project funded by the Govt. of India, is used as the repository. Well-known classification techniques such as ANN, SVM, Naïve Bayes, and Decision Tree are employed. The system achieved 84.33% accuracy in returning the exact answer and 97.13% accuracy in returning a string containing the correct answer. The unavailability of a structured dataset and poor resources were the main challenges of this work. QA systems in Indian languages, especially Bengali, are useful not only for chatbots and virtual agents but also for e-governance and mobile governance in West Bengal and Bangladesh. A QA system in the mother tongue gives more citizens the opportunity to interact with the administration. Although the system is designed for Bengali, it can be tuned to work for any language with minimal modification. |
id |
UFLA-5_1e6ce947a9427efbbff826d8c6c91600 |
---|---|
oai_identifier_str |
oai:infocomp.dcc.ufla.br:article/783 |
network_acronym_str |
UFLA-5 |
network_name_str |
INFOCOMP: Jornal de Ciência da Computação |
repository_id_str |
|
dc.title.none.fl_str_mv |
An Alternate Approach for Question Answering system in Bengali Language using Classification Techniques |
author |
DAS, ARIJIT |
author_role |
author |
dc.contributor.author.fl_str_mv |
DAS, ARIJIT |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-06-18 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/783 |
url |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/783 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/783/538 |
dc.rights.driver.fl_str_mv |
Copyright (c) 2020 ARIJIT DAS info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Copyright (c) 2020 ARIJIT DAS |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Editora da UFLA |
publisher.none.fl_str_mv |
Editora da UFLA |
dc.source.none.fl_str_mv |
INFOCOMP Journal of Computer Science; Vol. 19 No. 1 (2020): June 2020 1982-3363 1807-4545 reponame:INFOCOMP: Jornal de Ciência da Computação instname:Universidade Federal de Lavras (UFLA) instacron:UFLA |
instname_str |
Universidade Federal de Lavras (UFLA) |
instacron_str |
UFLA |
institution |
UFLA |
reponame_str |
INFOCOMP: Jornal de Ciência da Computação |
collection |
INFOCOMP: Jornal de Ciência da Computação |
repository.name.fl_str_mv |
INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA) |
repository.mail.fl_str_mv |
infocomp@dcc.ufla.br||apfreire@dcc.ufla.br |
_version_ |
1799874742628384768 |