Análise exploratória e experimental sobre detecção inteligente de fake news

Silva, Caio Vinícius Meneses

Bibliographic details
Main author:	Silva, Caio Vinícius Meneses
Publication date:	2020
Document type:	Master's thesis
Language:	Portuguese (por)
Source title:	Repositório Institucional da UFS
Full text:	https://ri.ufs.br/jspui/handle/riufs/14136
Abstract:	Context: The evolution of the media has contributed to the spread of false news, especially since the emergence of digital social networks. The practice, however, is not a recent phenomenon in human history. Reports from the First World War show the press using misleading propaganda, which led to new standards of journalistic objectivity and balance. On digital social media this phenomenon, now called fake news, has found an environment conducive to spreading worldwide, and the volume of data makes manual checking infeasible. Work in several fields has therefore sought to minimize the damage caused by the proliferation of fake news. Objective: This work evaluated the effectiveness of the most widely used text-matching methods in the automatic detection of fake news about the 2018 Brazilian presidential election, comparing the evidence found with the results of a state-of-the-art mapping published as part of this research. Method: First, a systematic mapping was carried out to identify and characterize the main approaches, techniques, and algorithms used in computing to detect false news. Then a controlled in vitro experiment was carried out, taking as its reference one of the works found in the literature whose context is closely related to this study: the 2016 American elections. The effectiveness of the methods was thus evaluated by comparing the results and contexts of the two works. Results: For the state of the art, the algorithms most used for detecting false news were LSTM (17.14%), Naive Bayes, and Similarity Algorithm (11.43% each). In the experiment, TF-IDF and BM25 obtained statistically similar mean accuracies, 79.86% and 79.00% respectively. Word2Vec and Doc2Vec scored somewhat lower, 75.69% and 72.39% respectively. Conclusions: The analysis of the state of the art revealed gaps concerning work in the Big Data context and a need to replicate existing studies in the form of more controlled experiments. The experimental evaluation found that the effectiveness of the evaluated methods was similar to that of the work used as a control. In addition, considering the universe of fact-checked news available, the period analyzed, and a margin of error of approximately 3.5%, followers of both candidates evaluated in the experiment were found to have spread fake news. Followers of candidate Jair Bolsonaro (PSL) were responsible for 62.25% of the tweets related to fake news, against 37.75% for followers of candidate Fernando Haddad (PT). Among accounts deleted from the social network within a short period, 59.96% belonged to followers of the PSL candidate and 40.04% to followers of the PT candidate. Spreading fake news does not always imply intent; in some cases it only indicates greater engagement.
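
The abstract names TF-IDF among the text-matching methods evaluated, but this record does not describe how it was implemented. The following is a minimal sketch of the general technique only, not the dissertation's code: it weights terms by TF-IDF and uses cosine similarity to find which fact-checked item a short text most resembles. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `best_match`), the whitespace tokenizer, the smoothed IDF variant, and the sample sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Relative term frequency times a smoothed IDF (assumed variant).
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(query, checked_items):
    """Return (index, score) of the checked item most similar to the query."""
    vectors = tfidf_vectors(checked_items + [query])
    query_vec = vectors[-1]
    scores = [cosine(query_vec, vec) for vec in vectors[:-1]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    # Hypothetical fact-checked claims and a hypothetical tweet.
    checked = [
        "candidate promised to close all public schools",
        "voting machines were rigged in the election",
        "new vaccine turns people into lizards",
    ]
    print(best_match("the voting machines were rigged", checked))
```

A real pipeline would also need a similarity threshold below which a text is treated as matching no checked item; the value of such a threshold is not given in this record.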

Item metadata

id	UFS-2_a315852d5f71d4f6a6e91f201e0670ad
oai_identifier_str	oai:ufs.br:riufs/14136
network_acronym_str	UFS-2
network_name_str	Repositório Institucional da UFS
repository_id_str
dc.title.pt_BR.fl_str_mv	Análise exploratória e experimental sobre detecção inteligente de fake news
title	Análise exploratória e experimental sobre detecção inteligente de fake news
title_short	Análise exploratória e experimental sobre detecção inteligente de fake news
title_full	Análise exploratória e experimental sobre detecção inteligente de fake news
title_fullStr	Análise exploratória e experimental sobre detecção inteligente de fake news
title_full_unstemmed	Análise exploratória e experimental sobre detecção inteligente de fake news
title_sort	Análise exploratória e experimental sobre detecção inteligente de fake news
author	Silva, Caio Vinícius Meneses
author_facet	Silva, Caio Vinícius Meneses
author_role	author
dc.contributor.author.fl_str_mv	Silva, Caio Vinícius Meneses
dc.contributor.advisor1.fl_str_mv	Rodrigues Júnior, Methanias Colaço
contributor_str_mv	Rodrigues Júnior, Methanias Colaço
dc.subject.por.fl_str_mv	Notícias falsas; Eleições; Processamento eletrônico de dados; Mineração de texto
topic	Notícias falsas; Eleições; Processamento eletrônico de dados; Mineração de texto; Fake news; Elections; Text mining; CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.eng.fl_str_mv	Fake news; Elections; Text mining
dc.subject.cnpq.fl_str_mv	CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
publishDate	2020
dc.date.issued.fl_str_mv	2020-12-08
dc.date.accessioned.fl_str_mv	2021-04-27T23:34:44Z
dc.date.available.fl_str_mv	2021-04-27T23:34:44Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	SILVA, Caio Vinícius Meneses. Análise exploratória e experimental sobre detecção inteligente de fake news. 2020. 83f. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Sergipe, São Cristóvão, Sergipe, 2020.
dc.identifier.uri.fl_str_mv	https://ri.ufs.br/jspui/handle/riufs/14136
dc.identifier.license.pt_BR.fl_str_mv	Authorization for publication in the Repository of the Universidade Federal de Sergipe (RI-UFS), granted by the author.
identifier_str_mv	SILVA, Caio Vinícius Meneses. Análise exploratória e experimental sobre detecção inteligente de fake news. 2020. 83f. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Sergipe, São Cristóvão, Sergipe, 2020. Authorization for publication in the Repository of the Universidade Federal de Sergipe (RI-UFS), granted by the author.
url	https://ri.ufs.br/jspui/handle/riufs/14136
dc.language.iso.fl_str_mv	por
language	por
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.program.fl_str_mv	Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv	Universidade Federal de Sergipe
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFS instname:Universidade Federal de Sergipe (UFS) instacron:UFS
instname_str	Universidade Federal de Sergipe (UFS)
instacron_str	UFS
institution	UFS
reponame_str	Repositório Institucional da UFS
collection	Repositório Institucional da UFS
bitstream.url.fl_str_mv	https://ri.ufs.br/jspui/bitstream/riufs/14136/3/CAIO_VINICIUS_MENESES_SILVA.pdf.txt https://ri.ufs.br/jspui/bitstream/riufs/14136/4/CAIO_VINICIUS_MENESES_SILVA.pdf.jpg https://ri.ufs.br/jspui/bitstream/riufs/14136/2/CAIO_VINICIUS_MENESES_SILVA.pdf https://ri.ufs.br/jspui/bitstream/riufs/14136/1/license.txt
bitstream.checksum.fl_str_mv	6f6ee4180a74a5b52cb843474bf7b845 602a7ef86a246d5d00353951a13fd184 d857515da4c02950cbfbfd2e517b1e9f 098cbbf65c2c15e1fb2e49c5d306a44c
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Repositório Institucional da UFS - Universidade Federal de Sergipe (UFS)
repository.mail.fl_str_mv	repositorio@academico.ufs.br
_version_	1802110695509065728
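
The `bitstream.checksum.fl_str_mv` and `bitstream.checksumAlgorithm.fl_str_mv` fields above list one MD5 digest per bitstream. A downloaded file can be checked against its digest along the following lines. This is a sketch under assumptions: the helper names `md5_hex` and `matches_checksum` are hypothetical, the local file path is whatever the reader saved the file as, and it assumes the checksums are listed in the same order as the URLs in `bitstream.url.fl_str_mv`.

```python
import hashlib


def md5_hex(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's MD5 digest with the value listed in the record."""
    return md5_hex(path) == expected_hex.strip().lower()
```

For example, `matches_checksum("CAIO_VINICIUS_MENESES_SILVA.pdf", "d857515da4c02950cbfbfd2e517b1e9f")` would check a locally saved copy of the third listed bitstream, if the ordering assumption holds. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.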
