Utilização de mecanismos de roteamento para seleção de sistemas de Question Answering

Bibliographic details
Main author: Tavares, Leandro Luciani
Publication date: 2018
Document type: Master's thesis (Dissertação)
Language: Portuguese (por)
Source title: Repositório Institucional da UFSCAR
Full text: https://repositorio.ufscar.br/handle/ufscar/10278
Abstract: The way humans interact with computers has evolved alongside the computers themselves. This process gave rise to a subfield of computing called Question Answering (QA), which offers a natural form of interaction between machines and humans: the question-answer interaction model. This model appears in at least two kinds of systems. Restricted domain systems are limited to specific, more complex topics. Open domain systems address general subjects and are not constrained to a particular topic, but their breadth prevents them from covering any single topic in depth. Ideally, a QA system should combine the main characteristics of both models in a practical way, uniting the variety of topics covered by open domain systems with the thoroughness of restricted domain systems. One possible solution is to combine several instances of restricted domain systems into a single open domain system. This requires a routing mechanism that, based on the user's question, selects one of the available instances, which then answers the question for the domain it represents. This work presents such a mechanism for selecting instances of QA systems, in the form of a hierarchical question domain classifier. Domains are naturally organized in a hierarchical taxonomy. By classifying a question into one of these domains, the classifier tries to select the most suitable QA system to answer it. Training the classifier, however, requires quality labeled data. To address this dependency, an automatic document-based question generation strategy was applied, resulting in a large synthetic question dataset. Results were promising when the classifier was evaluated against a dataset of real questions, suggesting that automatic question generation is a feasible way to train the classifier. In conclusion, the developed routing mechanism can be used to build a more robust and universal hybrid QA system that combines the main strengths of each kind of system.
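The routing idea described in the abstract can be illustrated with a minimal sketch. This is not the classifier developed in the thesis: the taxonomy, the keyword sets, the stub QA systems, and the names DomainNode, subtree_score, route, and answer are all hypothetical, and a simple keyword-overlap score stands in for a trained hierarchical classifier. The sketch only shows the top-down descent through a domain taxonomy until a leaf, and with it a restricted domain QA system, is selected.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class DomainNode:
    """One node of a hypothetical domain taxonomy."""
    name: str
    keywords: Set[str] = field(default_factory=set)
    children: List["DomainNode"] = field(default_factory=list)
    # Only leaves carry a QA system; a callable stands in for a real one.
    qa_system: Optional[Callable[[str], str]] = None


def make_stub(name: str) -> Callable[[str], str]:
    """Build a placeholder restricted domain QA system (assumption)."""
    return lambda question: f"[{name}] would answer: {question}"


def subtree_score(node: DomainNode, tokens: Set[str]) -> int:
    """Count question tokens matching keywords anywhere in the subtree.

    Stand-in for a trained classifier's score at this taxonomy level.
    """
    own = len(tokens & node.keywords)
    return own + sum(subtree_score(child, tokens) for child in node.children)


def route(root: DomainNode, question: str) -> DomainNode:
    """Descend the taxonomy, picking the best-scoring child at each level."""
    tokens = set(question.lower().replace("?", " ").split())
    node = root
    while node.children:
        # On ties (including all-zero scores) max() keeps the first child.
        node = max(node.children, key=lambda c: subtree_score(c, tokens))
    return node


# Toy taxonomy: root -> {science -> {physics, biology}, sports -> {football}}
taxonomy = DomainNode("root", children=[
    DomainNode("science", children=[
        DomainNode("physics", {"light", "speed", "energy"},
                   qa_system=make_stub("physics")),
        DomainNode("biology", {"cell", "gene", "protein"},
                   qa_system=make_stub("biology")),
    ]),
    DomainNode("sports", children=[
        DomainNode("football", {"goal", "league", "match"},
                   qa_system=make_stub("football")),
    ]),
])


def answer(question: str) -> str:
    """Route the question to a leaf and delegate to its QA system."""
    leaf = route(taxonomy, question)
    return leaf.qa_system(question)


print(answer("What is the speed of light in a vacuum?"))
```

In a real system each level of the descent would be decided by a classifier trained on labeled questions, which is where the synthetic question dataset mentioned in the abstract would come in.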
id SCAR_18d4e87e37b93a25ee777e52e000b3ec
oai_identifier_str oai:repositorio.ufscar.br:ufscar/10278
network_acronym_str SCAR
network_name_str Repositório Institucional da UFSCAR
repository_id_str 4322
Funding: none received ("Não recebi financiamento")
dc.title.por.fl_str_mv Utilização de mecanismos de roteamento para seleção de sistemas de Question Answering
dc.title.alternative.eng.fl_str_mv Use of routing mechanisms for selection of Question Answering Systems
author Tavares, Leandro Luciani
author_facet Tavares, Leandro Luciani
author_role author
dc.contributor.authorlattes.por.fl_str_mv http://lattes.cnpq.br/2796738470051942
dc.contributor.author.fl_str_mv Tavares, Leandro Luciani
dc.contributor.advisor1.fl_str_mv Almeida, Tiago Agostinho de
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/5368680512020633
dc.contributor.authorID.fl_str_mv 26477ff3-1a58-4fdc-ab00-27a8ba000c30
contributor_str_mv Almeida, Tiago Agostinho de
dc.subject.por.fl_str_mv Geração de perguntas
Classificação hierárquica de domínio
Sistemas de QA
Classificação de perguntas
Interação homem-máquina
dc.subject.eng.fl_str_mv Hierarchical domain classification
Question classification
Question generation
QA systems
Human-computer interaction
dc.subject.cnpq.fl_str_mv CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO
publishDate 2018
dc.date.accessioned.fl_str_mv 2018-07-11T14:55:21Z
dc.date.available.fl_str_mv 2018-07-11T14:55:21Z
dc.date.issued.fl_str_mv 2018-05-30
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv TAVARES, Leandro Luciani. Utilização de mecanismos de roteamento para seleção de sistemas de Question Answering. 2018. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, Sorocaba, 2018. Disponível em: https://repositorio.ufscar.br/handle/ufscar/10278.
dc.identifier.uri.fl_str_mv https://repositorio.ufscar.br/handle/ufscar/10278
identifier_str_mv TAVARES, Leandro Luciani. Utilização de mecanismos de roteamento para seleção de sistemas de Question Answering. 2018. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, Sorocaba, 2018. Disponível em: https://repositorio.ufscar.br/handle/ufscar/10278.
url https://repositorio.ufscar.br/handle/ufscar/10278
dc.language.iso.fl_str_mv por
language por
dc.relation.confidence.fl_str_mv 600
600
dc.relation.authority.fl_str_mv 5de967ad-743c-4f36-972b-79dd683c0e9d
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus Sorocaba
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação - PPGCC-So
dc.publisher.initials.fl_str_mv UFSCar
publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus Sorocaba
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFSCAR
instname:Universidade Federal de São Carlos (UFSCAR)
instacron:UFSCAR
instname_str Universidade Federal de São Carlos (UFSCAR)
instacron_str UFSCAR
institution UFSCAR
reponame_str Repositório Institucional da UFSCAR
collection Repositório Institucional da UFSCAR
bitstream.url.fl_str_mv https://repositorio.ufscar.br/bitstream/ufscar/10278/1/DISSERTACAO_LEANDRO_LUCIANI_TAVARES_FINAL.pdf
https://repositorio.ufscar.br/bitstream/ufscar/10278/2/Encaminhamento_Leandro.pdf
https://repositorio.ufscar.br/bitstream/ufscar/10278/3/license.txt
https://repositorio.ufscar.br/bitstream/ufscar/10278/4/DISSERTACAO_LEANDRO_LUCIANI_TAVARES_FINAL.pdf.txt
https://repositorio.ufscar.br/bitstream/ufscar/10278/5/Encaminhamento_Leandro.pdf.txt
https://repositorio.ufscar.br/bitstream/ufscar/10278/6/DISSERTACAO_LEANDRO_LUCIANI_TAVARES_FINAL.pdf.jpg
https://repositorio.ufscar.br/bitstream/ufscar/10278/7/Encaminhamento_Leandro.pdf.jpg
bitstream.checksum.fl_str_mv c17704e773930ea82adf48d76960832b
9e829ed0b606f91b5596548000a9dc15
ae0398b6f8b235e40ad82cba6c50031d
1431990d15ea97e758873fb3a7d5f2c8
68b329da9893e34099c7d8ad5cb9c940
6f514ec0c232665f2f96093ae5a91d81
093b50edd1e42467c95217279b67b2b6
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR)
repository.mail.fl_str_mv
_version_ 1802136343450484736