An Alternate Approach for Question Answering system in Bengali Language using Classification Techniques

Bibliographic details
Main author: DAS, ARIJIT
Publication date: 2020
Document type: Article
Language: eng
Source title: INFOCOMP: Jornal de Ciência da Computação
Full text: https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/783
Abstract: Question Answering (QA) systems are becoming more popular with the introduction of virtual agents and chatbots. The medium of a QA system is generally either text or audio. A QA system differs from a search engine. Search is generally based on keyword matching; in web search, a list of URLs is ranked based on location, user history, search preferences, etc., and sophisticated algorithms such as PageRank are also involved. A QA system, on the other hand, does not rely primarily on keyword matching: the query and the best answer often have no terms, or only a very small number of terms, in common. QA systems in English and other popular languages resolve this issue with the help of ontologies, WordNet, machine-readable dictionaries, etc. QA systems in low-resource languages suffer from a lack of annotation, the absence of a WordNet, and immature ontologies. In this work, a QA system for Bengali is developed using supervised learning algorithms. A collection of Bengali literature, developed during the TDIL (Technology Development of Indian Languages) project funded by the Govt. of India, is used as the repository. Well-known classification techniques such as ANN, SVM, Naïve Bayes and Decision Tree are employed. The system achieved 84.33% accuracy in returning the exact answer and 97.13% accuracy in returning a string containing the correct answer. The unavailability of a structured dataset and poor resources were the main challenges of this work. QA systems in Indian languages, especially Bengali, are useful not only for chatbots and virtual agents but also for e-governance and mobile governance in West Bengal and Bangladesh. A QA system in the mother tongue gives more citizens the opportunity to interact with the administration. Although the system is designed for Bengali, it can be tuned to work for any language with minimal modification.
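The abstract reports two different accuracy figures: one for returning the exact answer and one for returning a string that contains the correct answer. The record does not describe how these were computed, so the following is only a minimal sketch of one plausible reading, assuming whitespace-normalized string comparison. The function names and the sample data are hypothetical and are not taken from the paper.

```python
def normalize(text):
    """Collapse runs of whitespace so spacing differences do not affect matching."""
    return " ".join(text.split())


def _check_inputs(predictions, gold_answers):
    """Both lists must be non-empty and aligned one-to-one."""
    if not predictions or len(predictions) != len(gold_answers):
        raise ValueError("predictions and gold_answers must be non-empty and equal in length")


def exact_match_accuracy(predictions, gold_answers):
    """Fraction of predictions that equal the gold answer after normalization."""
    _check_inputs(predictions, gold_answers)
    hits = sum(
        normalize(pred) == normalize(gold)
        for pred, gold in zip(predictions, gold_answers)
    )
    return hits / len(gold_answers)


def containment_accuracy(predictions, gold_answers):
    """Fraction of predictions whose text contains the gold answer as a substring."""
    _check_inputs(predictions, gold_answers)
    hits = sum(
        normalize(gold) in normalize(pred)
        for pred, gold in zip(predictions, gold_answers)
    )
    return hits / len(gold_answers)


if __name__ == "__main__":
    # Hypothetical toy data, not from the paper's dataset.
    answer = "রবীন্দ্রনাথ ঠাকুর"
    predictions = [answer, "গীতাঞ্জলি লিখেছেন " + answer, "কলকাতা"]
    gold_answers = [answer, answer, "ঢাকা"]
    print("exact match:", exact_match_accuracy(predictions, gold_answers))
    print("containment:", containment_accuracy(predictions, gold_answers))
```

Under this reading, containment accuracy can never be lower than exact-match accuracy, because every exact match also contains the gold answer. That is consistent with the ordering of the two figures reported in the abstract.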
id UFLA-5_1e6ce947a9427efbbff826d8c6c91600
oai_identifier_str oai:infocomp.dcc.ufla.br:article/783
network_acronym_str UFLA-5
network_name_str INFOCOMP: Jornal de Ciência da Computação
repository_id_str
title An Alternate Approach for Question Answering system in Bengali Language using Classification Techniques
author DAS, ARIJIT
author_facet DAS, ARIJIT
author_role author
dc.contributor.author.fl_str_mv DAS, ARIJIT
publishDate 2020
dc.date.none.fl_str_mv 2020-06-18
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/783
url https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/783
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/783/538
dc.rights.driver.fl_str_mv Copyright (c) 2020 ARIJIT DAS
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Copyright (c) 2020 ARIJIT DAS
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Editora da UFLA
publisher.none.fl_str_mv Editora da UFLA
dc.source.none.fl_str_mv INFOCOMP Journal of Computer Science; Vol. 19 No. 1 (2020): June 2020
1982-3363
1807-4545
reponame:INFOCOMP: Jornal de Ciência da Computação
instname:Universidade Federal de Lavras (UFLA)
instacron:UFLA
instname_str Universidade Federal de Lavras (UFLA)
instacron_str UFLA
institution UFLA
reponame_str INFOCOMP: Jornal de Ciência da Computação
collection INFOCOMP: Jornal de Ciência da Computação
repository.name.fl_str_mv INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA)
repository.mail.fl_str_mv infocomp@dcc.ufla.br||apfreire@dcc.ufla.br
_version_ 1799874742628384768