Dense and hybrid models for information retrieval

Bibliographic details
Main author: Frias, José André Lopes
Publication date: 2022
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10773/38434
Abstract: In the era of Big Data, information must be found quickly and easily, so it is imperative for a search system to understand user intent efficiently. Dense Retrieval addresses this need by allowing models to capture the underlying meaning of queries and documents. Current dense models already surpass the classical BM-25 model in accuracy. However, because they use a high number of dimensions to represent queries and documents, they are not yet optimized for efficiency at search time. This work evaluates whether that high number of dimensions is needed, by analyzing different dimensionality reduction methods, trained for different purposes, and comparing the trade-offs between efficiency and accuracy.
id RCAP_d733d19a7460db80efae9b732432e925
oai_identifier_str oai:ria.ua.pt:10773/38434
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Dense and hybrid models for information retrieval
dc.contributor.author.fl_str_mv Frias, José André Lopes
dc.subject.por.fl_str_mv Information retrieval
Dense retrieval
Dimensionality reduction
Deep learning
publishDate 2022
dc.date.none.fl_str_mv 2022-12-05T00:00:00Z
2022-12-05
2023-07-07T15:53:45Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10773/38434
url http://hdl.handle.net/10773/38434
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799137738372415488