TrollBus, An Empirical Study Of Features For Troll Detection
Main author: | Tiago Neves Correia de Lacerda |
---|---|
Publication date: | 2020 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | https://hdl.handle.net/10216/129026 |
Abstract: | Discussing politics online has become commonplace on today's social networks. Users from all sides of the political spectrum can express their opinions freely and discuss their views on various social networks, including Twitter. From 2016 onward, a group of users whose objective is to polarize discussions and sow discord began to gain notoriety on this social network. These accounts are known as trolls, and they have been linked to several events in recent history, such as the influencing of elections and the organizing of violent protests. Since their discovery, several approaches have been developed to detect these accounts using machine learning techniques. Existing approaches have used different types of features. The goal of this work is to compare those different sets of features. To do so, an empirical study was performed, which adapts these features to the Portuguese Twitter community. The necessary data was collected through SocialBus, a tool for the collection, processing and storage of data from social networks, namely Twitter. The set of accounts used to collect the data was obtained from Portuguese political journalists, and the labelling of trolls was performed with a strict set of behavioural rules, aided by a scoring function. A new module for SocialBus, called TrollBus, was developed, which performs troll detection in real time. A public dataset was also released. The features of the best model obtained combine an account's profile metadata with the superficial aspects present in its text. The most important feature set was found to be the numerical aspects of the text, and the most important single feature proved to be the presence of political insults. |
id |
RCAP_a513d4bed21fecb785d61e7f3d013af9 |
---|---|
oai_identifier_str |
oai:repositorio-aberto.up.pt:10216/129026 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
TrollBus, An Empirical Study Of Features For Troll Detection |
author |
Tiago Neves Correia de Lacerda |
author_facet |
Tiago Neves Correia de Lacerda |
author_role |
author |
dc.contributor.author.fl_str_mv |
Tiago Neves Correia de Lacerda |
dc.subject.por.fl_str_mv |
Engenharia electrotécnica, electrónica e informática; Electrical engineering, Electronic engineering, Information engineering |
topic |
Engenharia electrotécnica, electrónica e informática; Electrical engineering, Electronic engineering, Information engineering |
description |
Discussing politics online has become commonplace on today's social networks. Users from all sides of the political spectrum can express their opinions freely and discuss their views on various social networks, including Twitter. From 2016 onward, a group of users whose objective is to polarize discussions and sow discord began to gain notoriety on this social network. These accounts are known as trolls, and they have been linked to several events in recent history, such as the influencing of elections and the organizing of violent protests. Since their discovery, several approaches have been developed to detect these accounts using machine learning techniques. Existing approaches have used different types of features. The goal of this work is to compare those different sets of features. To do so, an empirical study was performed, which adapts these features to the Portuguese Twitter community. The necessary data was collected through SocialBus, a tool for the collection, processing and storage of data from social networks, namely Twitter. The set of accounts used to collect the data was obtained from Portuguese political journalists, and the labelling of trolls was performed with a strict set of behavioural rules, aided by a scoring function. A new module for SocialBus, called TrollBus, was developed, which performs troll detection in real time. A public dataset was also released. The features of the best model obtained combine an account's profile metadata with the superficial aspects present in its text. The most important feature set was found to be the numerical aspects of the text, and the most important single feature proved to be the presence of political insults. |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-07-15 2020-07-15T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10216/129026 TID:202598918 |
url |
https://hdl.handle.net/10216/129026 |
identifier_str_mv |
TID:202598918 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799135615923519488 |