Detecting Persuasion Attempts on Social Networks: Unearthing the Potential of Loss Functions and Text Pre-Processing in Imbalanced Data Settings

Teimas, Rúben; Saias, José

Bibliographic details
Main author:	Teimas, Rúben
Publication date:	2023
Other authors:	Saias, José
Document type:	Article
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10174/35708 https://doi.org/10.3390/electronics12214447
Abstract:	The rise of social networks and the increasing amount of time people spend on them have created a perfect place for the dissemination of false narratives, propaganda, and manipulated content. To prevent the spread of disinformation, content moderation is needed. However, manual moderation is infeasible due to the large number of daily posts. This paper studies the impact of different loss functions on a multi-label classification problem with an imbalanced dataset, consisting of 20 persuasion techniques and only 950 samples, provided by SemEval-2021 Task 6. We used classical machine learning models, such as Naive Bayes and Decision Trees, and a custom deep learning architecture based on DistilBERT and convolutional layers. Overall, the classical machine learning models achieved far worse results than the deep learning model trained with Binary Cross Entropy, which we considered our baseline deep learning model. To address the class imbalance problem, we trained our model using different loss functions, such as Focal Loss and Asymmetric Loss. The latter provided the best results, particularly for the least represented classes.
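The three loss functions named in the abstract can be compared with a minimal per-label sketch. This is not the authors' implementation: the function names, the hyperparameter values (gamma, gamma_pos, gamma_neg, margin), and the example probabilities are assumptions chosen for illustration, and the Focal Loss variant shown omits the optional class-weighting factor.

```python
import math

EPS = 1e-8  # guards against log(0); an illustrative choice


def bce(p, y):
    """Binary cross-entropy for one label; p is a predicted probability, y is 0 or 1."""
    return -math.log(max(p, EPS)) if y == 1 else -math.log(max(1.0 - p, EPS))


def focal(p, y, gamma=2.0):
    """Focal loss: scales cross-entropy by (1 - p_t)^gamma so easy examples count less."""
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true outcome
    return -((1.0 - p_t) ** gamma) * math.log(max(p_t, EPS))


def asymmetric(p, y, gamma_pos=0.0, gamma_neg=4.0, margin=0.05):
    """Asymmetric loss: separate focusing for positives and negatives, with a
    probability margin that zeroes out very easy negatives."""
    if y == 1:
        return -((1.0 - p) ** gamma_pos) * math.log(max(p, EPS))
    p_m = max(p - margin, 0.0)  # shifted probability for negative labels
    return -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, EPS))


def multilabel_loss(probs, labels, per_label_loss, **kwargs):
    """Sum a per-label loss over all labels of one sample."""
    return sum(per_label_loss(p, y, **kwargs) for p, y in zip(probs, labels))


# Hypothetical sample: one positive label and two negative labels.
probs = [0.9, 0.1, 0.04]
labels = [1, 0, 0]
for name, fn in [("BCE", bce), ("Focal", focal), ("Asymmetric", asymmetric)]:
    print(name, multilabel_loss(probs, labels, fn))
```

In a multi-label setting with few positives per class, most label slots are negatives, so down-weighting easy negatives more aggressively than positives is one way such a loss can help the least represented classes.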

Item metadata

id	RCAP_43c430a390cc84e4cc42ef51a4929b1d
oai_identifier_str	oai:dspace.uevora.pt:10174/35708
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	Detecting Persuasion Attempts on Social Networks: Unearthing the Potential of Loss Functions and Text Pre-Processing in Imbalanced Data Settings
author	Teimas, Rúben
author_facet	Teimas, Rúben; Saias, José
author_role	author
author2	Saias, José
author2_role	author
dc.contributor.author.fl_str_mv	Teimas, Rúben; Saias, José
dc.subject.por.fl_str_mv	Natural Language Processing; machine learning; deep learning; persuasion attempts; social networks
topic	Natural Language Processing; machine learning; deep learning; persuasion attempts; social networks
publishDate	2023
dc.date.none.fl_str_mv	2023-11-22T11:09:03Z 2023-11-22 2023-10-29T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10174/35708 https://doi.org/10.3390/electronics12214447
url	http://hdl.handle.net/10174/35708 https://doi.org/10.3390/electronics12214447
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	Rúben Teimas and José Saias. 2023. "Detecting Persuasion Attempts on Social Networks: Unearthing the Potential of Loss Functions and Text Pre-Processing in Imbalanced Data Settings" Electronics 12, no. 21: 4447. 2079-9292 https://www.mdpi.com/2539862 Electronics ruben.teimas@uevora.pt jsaias@uevora.pt 283
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	MDPI - Electronics
publisher.none.fl_str_mv	MDPI - Electronics
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799136722387206144
