Scalable modelling and recommendation using wiki-based crowdsourced repositories

Bibliographic details
Main author: Leal, Fátima
Publication date: 2019
Other authors: Veloso, Bruno; Malheiro, Benedita; González-Veléz, Horacio; Burguillo, Juan Carlos
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10400.22/12974
Abstract: Wiki-based crowdsourced repositories have increasingly become an important source of information for users in multiple domains. However, as the amount of wiki-based data increases, so does the information overload for users. Wikis, and crowdsourcing platforms in general, raise trustability questions since they do not generally store user background data, making the recommendation of pages particularly hard to rely on. In this context, this work explores scalable multi-criteria profiling using side information to model the publishers and pages of wiki-based crowdsourced platforms. Based on streams of publisher-page-review triads, we have modelled publishers and pages in terms of quality and popularity using different criteria and user-page-view events collected via a wiki platform. Our modelling approach statistically classifies both page-review (quality) and page-view (popularity) events, attributing an appropriate rating. The quality-related information is then merged employing Multiple Linear Regression as well as a weighted average. Based on quality and popularity, the resulting page profiles are then used to address the problem of recommending the most interesting wiki pages per destination to viewers. This paper also explores the parallelisation of profiling and recommendation algorithms using wiki-based crowdsourced distributed data repositories as data streams via incremental updating. The proposed method has been successfully evaluated using Wikivoyage, a tourism crowdsourced wiki-based repository.
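Illustrative note: the abstract mentions incremental updating of page profiles from event streams and a weighted average of ratings. The sketch below shows one generic way such an update could look; it is not the authors' implementation. The class name PageProfile, its fields, the weight w_quality and the recommend helper are all hypothetical, and the running mean stands in for whatever statistical classification the paper actually applies.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PageProfile:
    """Running quality and popularity ratings for one page (hypothetical structure)."""
    quality: float = 0.0
    popularity: float = 0.0
    n_reviews: int = 0
    n_views: int = 0

    def add_review(self, rating: float) -> None:
        # Incremental mean: new_mean = old_mean + (x - old_mean) / n
        self.n_reviews += 1
        self.quality += (rating - self.quality) / self.n_reviews

    def add_view(self, rating: float) -> None:
        # Same incremental mean, applied to popularity ratings.
        self.n_views += 1
        self.popularity += (rating - self.popularity) / self.n_views

    def score(self, w_quality: float = 0.5) -> float:
        # Weighted average of the two ratings; the weight is an assumed parameter.
        return w_quality * self.quality + (1.0 - w_quality) * self.popularity


def recommend(profiles: Dict[str, PageProfile], k: int = 3) -> List[str]:
    """Return the names of the k pages with the highest combined score."""
    ranked = sorted(profiles, key=lambda name: profiles[name].score(), reverse=True)
    return ranked[:k]
```

Because each event only touches the counters and means of a single page, profiles of this kind can be updated one event at a time without revisiting earlier data, which is the property that makes stream processing and partitioning by page straightforward.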
id RCAP_57ea6b4baaad9d3b0dc6ec38c24b6592
oai_identifier_str oai:recipp.ipp.pt:10400.22/12974
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Scalable modelling and recommendation using wiki-based crowdsourced repositories
title Scalable modelling and recommendation using wiki-based crowdsourced repositories
author Leal, Fátima
author_facet Leal, Fátima
Veloso, Bruno
Malheiro, Benedita
González-Veléz, Horacio
Burguillo, Juan Carlos
author_role author
author2 Veloso, Bruno
Malheiro, Benedita
González-Veléz, Horacio
Burguillo, Juan Carlos
author2_role author
author
author
author
dc.contributor.none.fl_str_mv Repositório Científico do Instituto Politécnico do Porto
dc.contributor.author.fl_str_mv Leal, Fátima
Veloso, Bruno
Malheiro, Benedita
González-Veléz, Horacio
Burguillo, Juan Carlos
dc.subject.por.fl_str_mv Modelling
Scalable data mining
Wiki-based crowdsourcing
Parallel processing
Reputation
User profiling
Cloud computing
Recommender systems
publishDate 2019
dc.date.none.fl_str_mv 2019
2019-03-08T15:50:56Z
2019-01-01T00:00:00Z
2119-01-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.22/12974
url http://hdl.handle.net/10400.22/12974
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 15674223
10.1016/j.elerap.2018.11.004
dc.rights.driver.fl_str_mv metadata only access
info:eu-repo/semantics/openAccess
rights_invalid_str_mv metadata only access
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Elsevier
publisher.none.fl_str_mv Elsevier
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799131424831307776