Intent-aware semantic query annotation
Main author: | Rafael Glater da Cruz Machado |
---|---|
Publication date: | 2017 |
Document type: | Master's dissertation |
Language: | eng |
Source title: | Repositório Institucional da UFMG |
Full text: | http://hdl.handle.net/1843/30489 |
Abstract: | Query understanding is a challenging task primarily due to the inherent ambiguity of natural language. A common strategy for improving the understanding of natural language queries is to annotate them with semantic information mined from a knowledge base. Nevertheless, queries with different intents may arguably benefit from specialized annotation strategies. For instance, some queries could be effectively annotated with a single entity or an entity attribute, others could be better represented by a list of entities of a single type or by entities of multiple distinct types, and others may be simply ambiguous. In this dissertation, we propose a framework for learning semantic query annotations suitable to the target intent of each individual query. Thorough experiments on a publicly available benchmark show that our proposed approach can significantly improve state-of-the-art intent-agnostic approaches based on Markov random fields and learning to rank. Our results further demonstrate the consistent effectiveness of our approach for queries of various target intents, lengths, and difficulty levels, as well as its robustness to noise in intent detection. |
id |
UFMG_10bc8515af6b58236862beb24dc4d3f8 |
---|---|
oai_identifier_str |
oai:repositorio.ufmg.br:1843/30489 |
network_acronym_str |
UFMG |
network_name_str |
Repositório Institucional da UFMG |
dc.title.pt_BR.fl_str_mv |
Intent-aware semantic query annotation |
dc.title.alternative.pt_BR.fl_str_mv |
Anotações semânticas em consultas baseada na intenção do usuário |
title |
Intent-aware semantic query annotation |
author |
Rafael Glater da Cruz Machado |
author_facet |
Rafael Glater da Cruz Machado |
author_role |
author |
dc.contributor.advisor1.fl_str_mv |
Rodrygo Luis Teodoro Santos |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/1162362624079364 |
dc.contributor.advisor-co1.fl_str_mv |
Nivio Ziviani |
dc.contributor.referee1.fl_str_mv |
Altigran Soares da Silva |
dc.contributor.referee2.fl_str_mv |
Marcos André Gonçalves |
dc.contributor.authorLattes.fl_str_mv |
http://lattes.cnpq.br/7329858225436491 |
dc.contributor.author.fl_str_mv |
Rafael Glater da Cruz Machado |
contributor_str_mv |
Rodrygo Luis Teodoro Santos Nivio Ziviani Altigran Soares da Silva Marcos André Gonçalves |
dc.subject.por.fl_str_mv |
Learning to rank; Information retrieval; Representation learning; Semantic search; Semantic query annotation |
dc.subject.other.pt_BR.fl_str_mv |
Learning to rank; Information retrieval |
description |
Query understanding is a challenging task primarily due to the inherent ambiguity of natural language. A common strategy for improving the understanding of natural language queries is to annotate them with semantic information mined from a knowledge base. Nevertheless, queries with different intents may arguably benefit from specialized annotation strategies. For instance, some queries could be effectively annotated with a single entity or an entity attribute, others could be better represented by a list of entities of a single type or by entities of multiple distinct types, and others may be simply ambiguous. In this dissertation, we propose a framework for learning semantic query annotations suitable to the target intent of each individual query. Thorough experiments on a publicly available benchmark show that our proposed approach can significantly improve state-of-the-art intent-agnostic approaches based on Markov random fields and learning to rank. Our results further demonstrate the consistent effectiveness of our approach for queries of various target intents, lengths, and difficulty levels, as well as its robustness to noise in intent detection. |
publishDate |
2017 |
dc.date.issued.fl_str_mv |
2017-04-07 |
dc.date.accessioned.fl_str_mv |
2019-10-17T20:20:51Z |
dc.date.available.fl_str_mv |
2019-10-17T20:20:51Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/1843/30489 |
url |
http://hdl.handle.net/1843/30489 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Ciência da Computação |
dc.publisher.initials.fl_str_mv |
UFMG |
dc.publisher.country.fl_str_mv |
Brazil |
publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFMG instname:Universidade Federal de Minas Gerais (UFMG) instacron:UFMG |
instname_str |
Universidade Federal de Minas Gerais (UFMG) |
instacron_str |
UFMG |
institution |
UFMG |
reponame_str |
Repositório Institucional da UFMG |
collection |
Repositório Institucional da UFMG |
bitstream.url.fl_str_mv |
https://repositorio.ufmg.br/bitstream/1843/30489/2/RafaelGlaterdaCruzMachado.pdf https://repositorio.ufmg.br/bitstream/1843/30489/3/license.txt https://repositorio.ufmg.br/bitstream/1843/30489/4/RafaelGlaterdaCruzMachado.pdf.txt |
bitstream.checksum.fl_str_mv |
3de9f1b5066a753f028d0dc53030b5a8 34badce4be7e31e3adb4575ae96af679 d6260160e608a9073a6f0b9f1df3d853 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG) |