Scalable modelling and recommendation using wiki-based crowdsourced repositories
Main author: | Leal, Fátima |
---|---|
Publication date: | 2019 |
Other authors: | Veloso, Bruno; Malheiro, Benedita; González-Veléz, Horacio; Burguillo, Juan Carlos |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10400.22/12974 |
Abstract: | Wiki-based crowdsourced repositories have increasingly become an important source of information for users in multiple domains. However, as the amount of wiki-based data increases, so does the information overload for users. Wikis, and crowdsourcing platforms in general, raise trustworthiness questions since they do not generally store user background data, making page recommendations particularly hard to rely on. In this context, this work explores scalable multi-criteria profiling using side information to model the publishers and pages of wiki-based crowdsourced platforms. Based on streams of publisher-page-review triads, we have modelled publishers and pages in terms of quality and popularity using different criteria and user-page-view events collected via a wiki platform. Our modelling approach statistically classifies both page-review (quality) and page-view (popularity) events, attributing an appropriate rating to each. The quality-related information is then merged using Multiple Linear Regression as well as a weighted average. Based on quality and popularity, the resulting page profiles are then used to recommend the most interesting wiki pages per destination to viewers. This paper also explores the parallelisation of the profiling and recommendation algorithms, treating distributed wiki-based crowdsourced data repositories as data streams that are processed via incremental updating. The proposed method has been successfully evaluated using Wikivoyage, a crowdsourced wiki-based tourism repository. |
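
The abstract mentions merging quality ratings with a weighted average, updating profiles incrementally from event streams, and ranking pages by quality and popularity. The following is only a minimal sketch of that general idea, not the authors' implementation: the names `PageProfile`, `add_review`, `add_view`, `recommend`, the per-criterion `weights`, and the blending factor `alpha` are all hypothetical, and the Multiple Linear Regression variant is not shown.

```python
from dataclasses import dataclass


@dataclass
class PageProfile:
    """Running quality and popularity summary for one wiki page (hypothetical)."""
    quality: float = 0.0   # incremental mean of review ratings
    n_reviews: int = 0
    views: int = 0

    def add_review(self, criteria, weights):
        # Merge the per-criterion ratings of one review event
        # with a weighted average (weights are assumed, not from the paper).
        rating = sum(w * c for w, c in zip(weights, criteria)) / sum(weights)
        # Incremental mean update: past events need not be stored.
        self.n_reviews += 1
        self.quality += (rating - self.quality) / self.n_reviews

    def add_view(self):
        # One page-view event increases popularity by one.
        self.views += 1


def recommend(profiles, top_n=3, alpha=0.5):
    """Rank page names by a blend of quality and normalised popularity."""
    max_views = max((p.views for p in profiles.values()), default=0) or 1

    def score(profile):
        return alpha * profile.quality + (1 - alpha) * profile.views / max_views

    ranked = sorted(profiles, key=lambda name: score(profiles[name]), reverse=True)
    return ranked[:top_n]
```

Because each event only touches a counter and a running mean, profiles of different pages can be updated independently, which is the property that makes this kind of stream processing easy to parallelise.
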
id |
RCAP_57ea6b4baaad9d3b0dc6ec38c24b6592 |
---|---|
oai_identifier_str |
oai:recipp.ipp.pt:10400.22/12974 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Scalable modelling and recommendation using wiki-based crowdsourced repositories |
title |
Scalable modelling and recommendation using wiki-based crowdsourced repositories |
spellingShingle |
Scalable modelling and recommendation using wiki-based crowdsourced repositories; Leal, Fátima; Modelling; Scalable data mining; Wiki-based crowdsourcing; Parallel processing; Reputation; User profiling; Cloud computing; Recommender systems |
title_short |
Scalable modelling and recommendation using wiki-based crowdsourced repositories |
title_full |
Scalable modelling and recommendation using wiki-based crowdsourced repositories |
title_fullStr |
Scalable modelling and recommendation using wiki-based crowdsourced repositories |
title_full_unstemmed |
Scalable modelling and recommendation using wiki-based crowdsourced repositories |
title_sort |
Scalable modelling and recommendation using wiki-based crowdsourced repositories |
author |
Leal, Fátima |
author_facet |
Leal, Fátima; Veloso, Bruno; Malheiro, Benedita; González-Veléz, Horacio; Burguillo, Juan Carlos |
author_role |
author |
author2 |
Veloso, Bruno; Malheiro, Benedita; González-Veléz, Horacio; Burguillo, Juan Carlos |
author2_role |
author author author author |
dc.contributor.none.fl_str_mv |
Repositório Científico do Instituto Politécnico do Porto |
dc.contributor.author.fl_str_mv |
Leal, Fátima; Veloso, Bruno; Malheiro, Benedita; González-Veléz, Horacio; Burguillo, Juan Carlos |
dc.subject.por.fl_str_mv |
Modelling; Scalable data mining; Wiki-based crowdsourcing; Parallel processing; Reputation; User profiling; Cloud computing; Recommender systems |
topic |
Modelling; Scalable data mining; Wiki-based crowdsourcing; Parallel processing; Reputation; User profiling; Cloud computing; Recommender systems |
description |
Wiki-based crowdsourced repositories have increasingly become an important source of information for users in multiple domains. However, as the amount of wiki-based data increases, so does the information overload for users. Wikis, and crowdsourcing platforms in general, raise trustworthiness questions since they do not generally store user background data, making page recommendations particularly hard to rely on. In this context, this work explores scalable multi-criteria profiling using side information to model the publishers and pages of wiki-based crowdsourced platforms. Based on streams of publisher-page-review triads, we have modelled publishers and pages in terms of quality and popularity using different criteria and user-page-view events collected via a wiki platform. Our modelling approach statistically classifies both page-review (quality) and page-view (popularity) events, attributing an appropriate rating to each. The quality-related information is then merged using Multiple Linear Regression as well as a weighted average. Based on quality and popularity, the resulting page profiles are then used to recommend the most interesting wiki pages per destination to viewers. This paper also explores the parallelisation of the profiling and recommendation algorithms, treating distributed wiki-based crowdsourced data repositories as data streams that are processed via incremental updating. The proposed method has been successfully evaluated using Wikivoyage, a crowdsourced wiki-based tourism repository. |
publishDate |
2019 |
dc.date.none.fl_str_mv |
2019 2019-03-08T15:50:56Z 2019-01-01T00:00:00Z 2119-01-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.22/12974 |
url |
http://hdl.handle.net/10400.22/12974 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
15674223 10.1016/j.elerap.2018.11.004 |
dc.rights.driver.fl_str_mv |
metadata only access info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
metadata only access |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Elsevier |
publisher.none.fl_str_mv |
Elsevier |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799131424831307776 |