A text-mining based model to detect unethical biases in online reviews: a case-study of Amazon.com

Bibliographic details
Main author: Costa, Ana Rebello de Andrade da
Publication date: 2017
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10071/15297
Abstract: The rapid growth of social media in recent decades has led e-commerce into a new era of value co-creation between seller and consumer. Because online shoppers have no contact with the product, they must rely on the seller's description, knowing that it may be biased and not entirely true. Review systems therefore emerged to provide a more trustworthy source of information, since customer opinions may be less biased. However, once sellers realized the importance of reviews and their direct impact on sales, the need to control this key factor arose. One method was to offer customers a product in exchange for an honest review. In light of the results of some studies, however, these "honest" reviews were shown to be biased and to skew the overall rating of the product. The purpose of this work is to find patterns in these incentivized reviews and to create a model that can predict whether a new review is biased. In addition to sentiment analysis of the data, other characteristics were taken into account: the overall rating, helpfulness rate, review length, and the timestamp of when the review was written. Results show that among the most significant characteristics for predicting an incentivized review are its length and its helpfulness rate, with the overall polarity score, calculated with the VADER algorithm, being the most important sentiment-related factor.
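The characteristics listed in the abstract can be illustrated with a minimal sketch of a per-review feature row. This is not the thesis's implementation: the `Review` layout, the field names, the choice of words as the length unit, and the use of the hour of day as the timestamp feature are all assumptions made for illustration. The sentiment scorer is passed in as `polarity_fn` so the sketch does not depend on any particular library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict


@dataclass
class Review:
    # Hypothetical record layout; the abstract does not specify a schema.
    text: str
    overall_rating: float  # star rating, e.g. 1.0 to 5.0
    helpful_votes: int     # votes marking the review as helpful
    total_votes: int       # all helpfulness votes cast
    unix_time: int         # seconds since the epoch, UTC


def extract_features(review: Review, polarity_fn: Callable[[str], float]) -> Dict[str, float]:
    """Build one feature row from the characteristics named in the abstract."""
    # Helpfulness rate: share of votes marking the review helpful (0.0 if no votes).
    if review.total_votes:
        helpfulness = review.helpful_votes / review.total_votes
    else:
        helpfulness = 0.0
    written = datetime.fromtimestamp(review.unix_time, tz=timezone.utc)
    return {
        "overall_rating": review.overall_rating,
        "helpfulness_rate": helpfulness,
        "review_length": float(len(review.text.split())),  # words (assumed unit)
        "hour_written": float(written.hour),                # assumed timestamp feature
        "polarity": polarity_fn(review.text),               # e.g. a VADER compound score
    }
```

If the `vaderSentiment` package is installed, one possible scorer is `lambda t: SentimentIntensityAnalyzer().polarity_scores(t)["compound"]`, with `SentimentIntensityAnalyzer` imported from `vaderSentiment.vaderSentiment`; whether the thesis used this package or another VADER implementation is not stated in the record.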
id RCAP_5776c551f57c8878ae8b7c582981122a
oai_identifier_str oai:repositorio.iscte-iul.pt:10071/15297
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv A text-mining based model to detect unethical biases in online reviews: a case-study of Amazon.com
title A text-mining based model to detect unethical biases in online reviews: a case-study of Amazon.com
spellingShingle A text-mining based model to detect unethical biases in online reviews: a case-study of Amazon.com
Costa, Ana Rebello de Andrade da
Online reviews
Text mining
Sentiment analysis
VADER
E-commerce
Value creation
Customer satisfaction
title_short A text-mining based model to detect unethical biases in online reviews: a case-study of Amazon.com
title_full A text-mining based model to detect unethical biases in online reviews: a case-study of Amazon.com
title_fullStr A text-mining based model to detect unethical biases in online reviews: a case-study of Amazon.com
title_full_unstemmed A text-mining based model to detect unethical biases in online reviews: a case-study of Amazon.com
title_sort A text-mining based model to detect unethical biases in online reviews: a case-study of Amazon.com
author Costa, Ana Rebello de Andrade da
author_facet Costa, Ana Rebello de Andrade da
author_role author
dc.contributor.author.fl_str_mv Costa, Ana Rebello de Andrade da
dc.subject.por.fl_str_mv Online reviews
Text mining
Sentiment analysis
VADER
E-commerce
Value creation
Customer satisfaction
topic Online reviews
Text mining
Sentiment analysis
VADER
E-commerce
Value creation
Customer satisfaction
description The rapid growth of social media in recent decades has led e-commerce into a new era of value co-creation between seller and consumer. Because online shoppers have no contact with the product, they must rely on the seller's description, knowing that it may be biased and not entirely true. Review systems therefore emerged to provide a more trustworthy source of information, since customer opinions may be less biased. However, once sellers realized the importance of reviews and their direct impact on sales, the need to control this key factor arose. One method was to offer customers a product in exchange for an honest review. In light of the results of some studies, however, these "honest" reviews were shown to be biased and to skew the overall rating of the product. The purpose of this work is to find patterns in these incentivized reviews and to create a model that can predict whether a new review is biased. In addition to sentiment analysis of the data, other characteristics were taken into account: the overall rating, helpfulness rate, review length, and the timestamp of when the review was written. Results show that among the most significant characteristics for predicting an incentivized review are its length and its helpfulness rate, with the overall polarity score, calculated with the VADER algorithm, being the most important sentiment-related factor.
publishDate 2017
dc.date.none.fl_str_mv 2017-11-07T00:00:00Z
2017-11-07
2017-09
2018-02-28T14:40:20Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10071/15297
TID:201760517
url http://hdl.handle.net/10071/15297
identifier_str_mv TID:201760517
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
application/octet-stream
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799134854392053760