GOTCHA BOT DETECTION: CONTEXT, TIME AND PLACE MATTERS

Santini, Rose Marie; Salles, Débora; Ferreira, Fernando; Grael, Felipe

Bibliographic details
Main author:	Santini, Rose Marie
Publication date:	2023
Other authors:	Salles, Débora; Ferreira, Fernando; Grael, Felipe
Document type:	preprint
Language:	eng
Source title:	SciELO Preprints
Full text:	https://preprints.scielo.org/index.php/scielo/preprint/view/5974
Abstract:	Bot detection is increasingly relevant considering that automated accounts play a disproportionate role in spreading disinformation, controlling social interactions, influencing social media algorithms and manufacturing public opinion online for different purposes. Definition, description and detection of automated manipulation techniques have proved a challenge as technology quickly advances in reach and sophistication. Considering the high contextual character of social science research, the employment of off-the-shelf detection tools raises questions regarding the applicability of machine learning systems in different cases, times and places. Thus, our purpose is to discuss the role of computational methods focusing on understanding the limitations and potential of machine learning systems to identify bots on social media platforms. To address it, we analyze the performance of Botometer, a widely adopted detection tool, in a specific domain (Amazon Forest Fires) and language (Portuguese) and propose a supervised machine learning classifier, called Gotcha, based on Botometer's framework and trained for this specific dataset. We also question how our classifier behaves and evolves over time and perform tests to evaluate the generalization capabilities of the retrained model. Our results demonstrated that supervised methods do not perform well with datasets that present features on which the system was not directly trained, such as language and topic. Hence, our study shows that a successful computational model does not always guarantee reliable results, applicable to a specific real case. Our findings indicate the need for social scientists to confirm the reliability of different tools created and tested only through the prism of computational studies before applying them to empirical social science research.

Item metadata

id	SCI-1_fb05c1bef9ee22c78187306c405cdc1a
oai_identifier_str	oai:ops.preprints.scielo.org:preprint/5974
network_acronym_str	SCI-1
network_name_str	SciELO Preprints
repository_id_str
dc.title.none.fl_str_mv	GOTCHA BOT DETECTION: CONTEXT, TIME AND PLACE MATTERS
author	Santini, Rose Marie
author_facet	Santini, Rose Marie; Salles, Débora; Ferreira, Fernando; Grael, Felipe
author_role	author
author2	Salles, Débora; Ferreira, Fernando; Grael, Felipe
author2_role	author author author
dc.contributor.author.fl_str_mv	Santini, Rose Marie; Salles, Débora; Ferreira, Fernando; Grael, Felipe
dc.subject.por.fl_str_mv	Bot detection; machine learning algorithm; Brazil; computational propaganda
topic	Bot detection; machine learning algorithm; Brazil; computational propaganda
publishDate	2023
dc.date.none.fl_str_mv	2023-05-05
dc.type.driver.fl_str_mv	info:eu-repo/semantics/preprint; info:eu-repo/semantics/publishedVersion
format	preprint
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://preprints.scielo.org/index.php/scielo/preprint/view/5974; 10.1590/SciELOPreprints.5974
url	https://preprints.scielo.org/index.php/scielo/preprint/view/5974
identifier_str_mv	10.1590/SciELOPreprints.5974
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	https://preprints.scielo.org/index.php/scielo/article/view/5974/11503
dc.rights.driver.fl_str_mv	Copyright (c) 2023 Rose Marie Santini, Débora Salles, Fernando Ferreira, Felipe Grael; https://creativecommons.org/licenses/by/4.0; info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Copyright (c) 2023 Rose Marie Santini, Débora Salles, Fernando Ferreira, Felipe Grael; https://creativecommons.org/licenses/by/4.0
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	SciELO Preprints
publisher.none.fl_str_mv	SciELO Preprints
dc.source.none.fl_str_mv	reponame:SciELO Preprints; instname:Scientific Electronic Library Online (SCIELO); instacron:SCI
instname_str	Scientific Electronic Library Online (SCIELO)
instacron_str	SCI
institution	SCI
reponame_str	SciELO Preprints
collection	SciELO Preprints
repository.name.fl_str_mv	SciELO Preprints - Scientific Electronic Library Online (SCIELO)
repository.mail.fl_str_mv	scielo.submission@scielo.org
_version_	1797047811629383680
