Using temporal evidence in blog search
Autor(a) principal: | |
---|---|
Data de Publicação: | 2009 |
Outros Autores: | , |
Tipo de documento: | Livro |
Idioma: | eng |
Título da fonte: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Texto Completo: | https://hdl.handle.net/10216/26928 |
Resumo: | In this paper we present a study on the relevance of web documents over time and the use of temporal evidence in blog search tasks. Time is an intrinsic property of social media, most notably in blogs where each post is typically attached with a timestamp representing its publish date. However, due to the challenges in obtaining document collections containing temporal information, research on this field has been scarce. We base our study on the Blog06 collection and the relevance assessments produced in the context of the TREC Blog Track, to investigate the relevance of time-based features in standard retrieval tasks. We observe small, but statistically significant improvements over a BM25 baseline when temporal information is used. Also, we find a direct connection between recency and relevance of documents for ad-hoc retrieval. |
id |
RCAP_31227be3c539f0263685ddba2f6a6639 |
---|---|
oai_identifier_str |
oai:repositorio-aberto.up.pt:10216/26928 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Using temporal evidence in blog searchTecnologia da informação, Engenharia de computadores, Ciências da computação e da informaçãoInformation technology, Computer engineering, Computer and information sciencesIn this paper we present a study on the relevance of web documents over time and the use of temporal evidence in blog search tasks. Time is an intrinsic property of social media, most notably in blogs where each post is typically attached with a timestamp representing its publish date. However, due to the challenges in obtaining document collections containing temporal information, research on this field has been scarce. We base our study on the Blog06 collection and the relevance assessments produced in the context of the TREC Blog Track, to investigate the relevance of time-based features in standard retrieval tasks. We observe small, but statistically significant improvements over a BM25 baseline when temporal information is used. Also, we find a direct connection between recency and relevance of documents for ad-hoc retrieval.20092009-01-01T00:00:00Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/bookapplication/pdfhttps://hdl.handle.net/10216/26928engSérgio NunesCristina RibeiroGabriel Davidinfo:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2023-11-29T13:58:33Zoai:repositorio-aberto.up.pt:10216/26928Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-19T23:51:20.164219Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse |
dc.title.none.fl_str_mv |
Using temporal evidence in blog search |
title |
Using temporal evidence in blog search |
spellingShingle |
Using temporal evidence in blog search Sérgio Nunes Tecnologia da informação, Engenharia de computadores, Ciências da computação e da informação Information technology, Computer engineering, Computer and information sciences |
title_short |
Using temporal evidence in blog search |
title_full |
Using temporal evidence in blog search |
title_fullStr |
Using temporal evidence in blog search |
title_full_unstemmed |
Using temporal evidence in blog search |
title_sort |
Using temporal evidence in blog search |
author |
Sérgio Nunes |
author_facet |
Sérgio Nunes Cristina Ribeiro Gabriel David |
author_role |
author |
author2 |
Cristina Ribeiro Gabriel David |
author2_role |
author author |
dc.contributor.author.fl_str_mv |
Sérgio Nunes Cristina Ribeiro Gabriel David |
dc.subject.por.fl_str_mv |
Tecnologia da informação, Engenharia de computadores, Ciências da computação e da informação Information technology, Computer engineering, Computer and information sciences |
topic |
Tecnologia da informação, Engenharia de computadores, Ciências da computação e da informação Information technology, Computer engineering, Computer and information sciences |
description |
In this paper we present a study on the relevance of web documents over time and the use of temporal evidence in blog search tasks. Time is an intrinsic property of social media, most notably in blogs where each post is typically attached with a timestamp representing its publish date. However, due to the challenges in obtaining document collections containing temporal information, research on this field has been scarce. We base our study on the Blog06 collection and the relevance assessments produced in the context of the TREC Blog Track, to investigate the relevance of time-based features in standard retrieval tasks. We observe small, but statistically significant improvements over a BM25 baseline when temporal information is used. Also, we find a direct connection between recency and relevance of documents for ad-hoc retrieval. |
publishDate |
2009 |
dc.date.none.fl_str_mv |
2009 2009-01-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/book |
format |
book |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10216/26928 |
url |
https://hdl.handle.net/10216/26928 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799135832525766656 |