Dense and hybrid models for information retrieval
Main author: | Frias, José André Lopes |
---|---|
Publication date: | 2022 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10773/38434 |
Abstract: | In the era of Big Data, information must be found easily and quickly, so it is imperative for a search system to understand the user intent more efficiently. Dense retrieval addresses this need by allowing models to capture the underlying meaning of queries and documents. Current dense models already surpass the classical BM25 model in accuracy. However, because they use a high number of dimensions to represent queries and documents, they are not yet optimized for efficiency at search time. This work evaluates whether that high number of dimensions is needed, by analyzing different dimensionality reduction methods, trained for different purposes, and comparing the trade-offs between efficiency and accuracy. |
id |
RCAP_d733d19a7460db80efae9b732432e925 |
---|---|
oai_identifier_str |
oai:ria.ua.pt:10773/38434 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Dense and hybrid models for information retrieval; Information retrieval; Dense retrieval; Dimensionality reduction; Deep learning; 2023-07-07T15:53:45Z; 2022-12-05T00:00:00Z; 2022-12-05; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10773/38434; eng; Frias, José André Lopes; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2024-02-22T12:14:21Z; oai:ria.ua.pt:10773/38434; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-20T03:08:37.883817; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Dense and hybrid models for information retrieval |
title |
Dense and hybrid models for information retrieval |
spellingShingle |
Dense and hybrid models for information retrieval; Frias, José André Lopes; Information retrieval; Dense retrieval; Dimensionality reduction; Deep learning |
title_short |
Dense and hybrid models for information retrieval |
title_full |
Dense and hybrid models for information retrieval |
title_fullStr |
Dense and hybrid models for information retrieval |
title_full_unstemmed |
Dense and hybrid models for information retrieval |
title_sort |
Dense and hybrid models for information retrieval |
author |
Frias, José André Lopes |
author_facet |
Frias, José André Lopes |
author_role |
author |
dc.contributor.author.fl_str_mv |
Frias, José André Lopes |
dc.subject.por.fl_str_mv |
Information retrieval; Dense retrieval; Dimensionality reduction; Deep learning |
topic |
Information retrieval; Dense retrieval; Dimensionality reduction; Deep learning |
description |
In the era of Big Data, information must be found easily and quickly, so it is imperative for a search system to understand the user intent more efficiently. Dense retrieval addresses this need by allowing models to capture the underlying meaning of queries and documents. Current dense models already surpass the classical BM25 model in accuracy. However, because they use a high number of dimensions to represent queries and documents, they are not yet optimized for efficiency at search time. This work evaluates whether that high number of dimensions is needed, by analyzing different dimensionality reduction methods, trained for different purposes, and comparing the trade-offs between efficiency and accuracy. |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-12-05T00:00:00Z 2022-12-05 2023-07-07T15:53:45Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10773/38434 |
url |
http://hdl.handle.net/10773/38434 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799137738372415488 |