WCL2R : a benchmark collection for Learning to rank research with clickthrough data.
Main author: | Alcântara, Otávio D. A. |
---|---|
Publication date: | 2010 |
Other authors: | Pereira Junior, Álvaro Rodrigues; Almeida, Humberto Mossri de; Gonçalves, Marcos André; Middleton, Christian; Yates, Ricardo Baeza |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da UFOP |
Full text: | http://www.repositorio.ufop.br/handle/123456789/1630 |
Abstract: | In this paper we present WCL2R, a benchmark collection for supporting research in learning to rank (L2R) algorithms which exploit clickthrough features. Unlike other L2R benchmark collections, such as LETOR and the recently released Yahoo! collection for an L2R competition, in WCL2R we focus on defining a significant (and new) set of features over clickthrough data extracted from the logs of a real-world search engine. We describe the WCL2R collection by providing details about how the corpora, queries and relevance judgments were obtained, how the learning features were constructed and how the process of splitting the collection into folds for representative learning was performed. We also analyze the discriminative power of the WCL2R collection using traditional feature selection algorithms and show that the most discriminative features are, in fact, those based on clickthrough data. We then compare several L2R algorithms on WCL2R, showing that all of them obtain significant gains by exploiting clickthrough information over using traditional ranking approaches. |
id |
UFOP_be585a3a185b7b26802f8ebfb7dde5d3 |
---|---|
oai_identifier_str |
oai:localhost:123456789/1630 |
network_acronym_str |
UFOP |
network_name_str |
Repositório Institucional da UFOP |
repository_id_str |
3233 |
spelling |
Rights: Permission to copy without fee all or part of the material printed in JIDM is granted provided that the copies are not made or distributed for commercial advantage, and that notice is given that copying is by permission of the Sociedade Brasileira de Computação. Source: the article itself. Files: license.txt (text/plain; charset=utf-8, 1748 bytes); ARTIGO_BenchmarkCollectionLearning.pdf (application/pdf, 516116 bytes). Record timestamp: 2019-03-11 14:24:43.676. OAI endpoint: http://www.repositorio.ufop.br/oai/request. OpenDOAR: 3233. |
dc.title.pt_BR.fl_str_mv |
WCL2R : a benchmark collection for Learning to rank research with clickthrough data. |
title |
WCL2R : a benchmark collection for Learning to rank research with clickthrough data. |
spellingShingle |
WCL2R : a benchmark collection for Learning to rank research with clickthrough data. Alcântara, Otávio D. A. Benchmark; Clickthrough; Learning to rank |
title_short |
WCL2R : a benchmark collection for Learning to rank research with clickthrough data. |
title_full |
WCL2R : a benchmark collection for Learning to rank research with clickthrough data. |
title_fullStr |
WCL2R : a benchmark collection for Learning to rank research with clickthrough data. |
title_full_unstemmed |
WCL2R : a benchmark collection for Learning to rank research with clickthrough data. |
title_sort |
WCL2R : a benchmark collection for Learning to rank research with clickthrough data. |
author |
Alcântara, Otávio D. A. |
author_facet |
Alcântara, Otávio D. A.; Pereira Junior, Álvaro Rodrigues; Almeida, Humberto Mossri de; Gonçalves, Marcos André; Middleton, Christian; Yates, Ricardo Baeza |
author_role |
author |
author2 |
Pereira Junior, Álvaro Rodrigues; Almeida, Humberto Mossri de; Gonçalves, Marcos André; Middleton, Christian; Yates, Ricardo Baeza |
author2_role |
author author author author author |
dc.contributor.author.fl_str_mv |
Alcântara, Otávio D. A.; Pereira Junior, Álvaro Rodrigues; Almeida, Humberto Mossri de; Gonçalves, Marcos André; Middleton, Christian; Yates, Ricardo Baeza |
dc.subject.por.fl_str_mv |
Benchmark; Clickthrough; Learning to rank |
topic |
Benchmark; Clickthrough; Learning to rank |
description |
In this paper we present WCL2R, a benchmark collection for supporting research in learning to rank (L2R) algorithms which exploit clickthrough features. Unlike other L2R benchmark collections, such as LETOR and the recently released Yahoo! collection for an L2R competition, in WCL2R we focus on defining a significant (and new) set of features over clickthrough data extracted from the logs of a real-world search engine. We describe the WCL2R collection by providing details about how the corpora, queries and relevance judgments were obtained, how the learning features were constructed and how the process of splitting the collection into folds for representative learning was performed. We also analyze the discriminative power of the WCL2R collection using traditional feature selection algorithms and show that the most discriminative features are, in fact, those based on clickthrough data. We then compare several L2R algorithms on WCL2R, showing that all of them obtain significant gains by exploiting clickthrough information over using traditional ranking approaches. |
publishDate |
2010 |
dc.date.issued.fl_str_mv |
2010 |
dc.date.accessioned.fl_str_mv |
2012-10-11T21:51:55Z |
dc.date.available.fl_str_mv |
2012-10-11T21:51:55Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
ALCÂNTARA, O. D. A. WCL2R : a benchmark collection for Learning to rank research with clickthrough data. Journal of Information and Data Management, v. 1, n. 3, p. 551-566, 2010. Available at: <http://seer.lcc.ufmg.br/index.php/jidm/article/viewFile/83/49>. Accessed: 11 Oct. 2012. |
dc.identifier.uri.fl_str_mv |
http://www.repositorio.ufop.br/handle/123456789/1630 |
dc.identifier.issn.none.fl_str_mv |
21666288 |
identifier_str_mv |
ALCÂNTARA, O. D. A. WCL2R : a benchmark collection for Learning to rank research with clickthrough data. Journal of Information and Data Management, v. 1, n. 3, p. 551-566, 2010. Available at: <http://seer.lcc.ufmg.br/index.php/jidm/article/viewFile/83/49>. Accessed: 11 Oct. 2012.; 21666288 |
url |
http://www.repositorio.ufop.br/handle/123456789/1630 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFOP instname:Universidade Federal de Ouro Preto (UFOP) instacron:UFOP |
instname_str |
Universidade Federal de Ouro Preto (UFOP) |
instacron_str |
UFOP |
institution |
UFOP |
reponame_str |
Repositório Institucional da UFOP |
collection |
Repositório Institucional da UFOP |
bitstream.url.fl_str_mv |
http://www.repositorio.ufop.br/bitstream/123456789/1630/5/license.txt http://www.repositorio.ufop.br/bitstream/123456789/1630/1/ARTIGO_BenchmarkCollectionLearning.pdf |
bitstream.checksum.fl_str_mv |
8a4605be74aa9ea9d79846c1fba20a33 453450dd405c4db0db5b2ad7ebd41668 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFOP - Universidade Federal de Ouro Preto (UFOP) |
repository.mail.fl_str_mv |
repositorio@ufop.edu.br |
_version_ |
1801685746621349888 |