Towards automatic fake news detection in digital platforms: properties, limitations, and applications

Bibliographic details
Main author: Julio Cesar Soares dos Reis
Publication date: 2020
Document type: Thesis
Language: eng
Source title: Repositório Institucional da UFMG
Full text: http://hdl.handle.net/1843/34447
https://orcid.org/0000-0003-0563-0434
Abstract: Digital platforms have dramatically changed the way news is produced, disseminated, and consumed in our society. A key problem today is that these platforms have become a venue for misinformation campaigns that affect the credibility of the entire news ecosystem. The emergence of fake news in these environments has quickly evolved into a worldwide phenomenon, and the lack of scalable fact-checking strategies is especially worrisome. Automatic solutions for fake news detection could therefore serve as an auxiliary tool that helps fact-checkers identify content that is more likely to be fake, or content that is worth checking. In this context, this thesis investigates practical approaches for the automatic detection of fake news in digital platforms. First, we survey a large number of recent related works in an effort to implement all potential features for detecting fake news. We propose novel features, explore existing labeled datasets, and propose new ones to assess the prediction performance of current supervised machine learning approaches. Our results reveal that these computational models have a useful discriminative capacity for detecting fake news disseminated in digital platforms. We then propose an unbiased framework for quantifying the informativeness of features for fake news detection. As part of this framework, we present an explanation of the factors contributing to model decisions, thus promoting civic reasoning by complementing our ability to evaluate digital content and reach warranted conclusions. We also analyze features and models that can be useful for detecting fake news in different scenarios: the US and Brazilian elections. Finally, we propose a new mechanism, implemented in a real system, that accounts for the potential occurrence of fake news within data and significantly reduces the number of content pieces journalists and fact-checkers have to go through before finding a fake story.
id UFMG_87df14d9ed0a0a7b20aa9474d58e66cf
oai_identifier_str oai:repositorio.ufmg.br:1843/34447
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
repository_id_str
spelling CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
dc.title.pt_BR.fl_str_mv Towards automatic fake news detection in digital platforms: properties, limitations, and applications
title Towards automatic fake news detection in digital platforms: properties, limitations, and applications
spellingShingle Towards automatic fake news detection in digital platforms: properties, limitations, and applications
Julio Cesar Soares dos Reis
Computação – Teses
Mídia social
Fake news
Desinformação
Aprendizagem de máquina
Computação – Teses.
Mídia social - Teses.
Fake news - Teses.
Desinformação - Teses.
Aprendizagem de máquina - Teses
title_short Towards automatic fake news detection in digital platforms: properties, limitations, and applications
title_full Towards automatic fake news detection in digital platforms: properties, limitations, and applications
title_fullStr Towards automatic fake news detection in digital platforms: properties, limitations, and applications
title_full_unstemmed Towards automatic fake news detection in digital platforms: properties, limitations, and applications
title_sort Towards automatic fake news detection in digital platforms: properties, limitations, and applications
author Julio Cesar Soares dos Reis
author_facet Julio Cesar Soares dos Reis
author_role author
dc.contributor.advisor1.fl_str_mv Fabrício Benevenuto de Souza
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/7014991384513854
dc.contributor.referee1.fl_str_mv Viviane Pereira Moreira
dc.contributor.referee2.fl_str_mv Leandro Balby Marinho
dc.contributor.referee3.fl_str_mv Fabrício Murai Ferreira
dc.contributor.referee4.fl_str_mv Mirella Moura Moro
dc.contributor.referee5.fl_str_mv Adriano Alonso Veloso
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/4231109991328059
dc.contributor.author.fl_str_mv Julio Cesar Soares dos Reis
contributor_str_mv Fabrício Benevenuto de Souza
Viviane Pereira Moreira
Leandro Balby Marinho
Fabrício Murai Ferreira
Mirella Moura Moro
Adriano Alonso Veloso
dc.subject.por.fl_str_mv Computação – Teses
Mídia social
Fake news
Desinformação
Aprendizagem de máquina
topic Computação – Teses
Mídia social
Fake news
Desinformação
Aprendizagem de máquina
Computação – Teses.
Mídia social - Teses.
Fake news - Teses.
Desinformação - Teses.
Aprendizagem de máquina - Teses
dc.subject.other.pt_BR.fl_str_mv Computação – Teses.
Mídia social - Teses.
Fake news - Teses.
Desinformação - Teses.
Aprendizagem de máquina - Teses
publishDate 2020
dc.date.accessioned.fl_str_mv 2020-11-30T18:12:24Z
dc.date.available.fl_str_mv 2020-11-30T18:12:24Z
dc.date.issued.fl_str_mv 2020-11-03
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1843/34447
dc.identifier.orcid.pt_BR.fl_str_mv https://orcid.org/0000-0003-0563-0434
url http://hdl.handle.net/1843/34447
https://orcid.org/0000-0003-0563-0434
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv UFMG
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv ICEX - INSTITUTO DE CIÊNCIAS EXATAS
publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
instname_str Universidade Federal de Minas Gerais (UFMG)
instacron_str UFMG
institution UFMG
reponame_str Repositório Institucional da UFMG
collection Repositório Institucional da UFMG
bitstream.url.fl_str_mv https://repositorio.ufmg.br/bitstream/1843/34447/1/thesis_JReis_Towards-automatic-fake-news-detection_versaoFinal.pdf
https://repositorio.ufmg.br/bitstream/1843/34447/2/license.txt
bitstream.checksum.fl_str_mv b2132c12bd1a8deddee21dfd0fa23703
34badce4be7e31e3adb4575ae96af679
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
repository.mail.fl_str_mv
_version_ 1803589243060092928