Mining User Activity Data in Social Media Services

Bibliographic details
Main author: Costa, Alceu Ferraz
Publication date: 2017
Document type: Thesis
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da USP
Full text: http://www.teses.usp.br/teses/disponiveis/55/55134/tde-11092017-151000/
Abstract: Social media services have a growing impact on our society. Individuals often rely on social media to get their news, decide which products to buy, or communicate with their friends. As a consequence of the widespread adoption of social media, a large volume of data on how users behave is created every day and stored in large databases. Learning how to analyze and extract useful knowledge from this data has a number of potential applications. For instance, a deeper understanding of how legitimate users interact with social media services could be exploited to design more accurate spam and fraud detection methods. This PhD research is based on the following hypothesis: data generated by social media users presents patterns that can be exploited to improve the effectiveness of tasks such as prediction, forecasting, and modeling in the domain of social media. To validate our hypothesis, we focus on designing data mining methods tailored to social media data. The main contributions of this PhD can be divided into three parts. First, we propose Act-M, a mathematical model that describes the timing of users' actions. We also show that Act-M can be used to automatically detect bots among social media users based only on timing (i.e., timestamp) data. Our second contribution is VnC (Vote-and-Comment), a model that explains how the volume of different types of user interactions evolves over time when a piece of content is submitted to a social media service. In addition to accurately matching real data, VnC is useful because it can be employed to forecast the number of interactions received by social media content. Finally, our third contribution is the MFS-Map method. MFS-Map automatically provides textual annotations for social media images by efficiently combining visual and metadata features. Our contributions were validated using real data from several social media services. Our experiments show that the Act-M and VnC models provided a more accurate fit to the data than existing models for communication dynamics and information diffusion, respectively. MFS-Map obtained both higher precision and shorter running time when compared to other widely employed image annotation methods.
id USP_0ad293e4cf980d1b38b2e55afa35c8d7
oai_identifier_str oai:teses.usp.br:tde-11092017-151000
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str 2721
dc.title.none.fl_str_mv Mining User Activity Data in Social Media Services
Mineração de Dados de Atividade de Usuários em Serviços de Mídia Social
author Costa, Alceu Ferraz
author_facet Costa, Alceu Ferraz
author_role author
dc.contributor.none.fl_str_mv Faloutsos, Christos
Traina, Agma Juci Machado
dc.contributor.author.fl_str_mv Costa, Alceu Ferraz
dc.subject.por.fl_str_mv Data mining
Mídia social
Mineração de dados
Modelagem de usuários
Social media
User Modeling
publishDate 2017
dc.date.none.fl_str_mv 2017-05-12
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://www.teses.usp.br/teses/disponiveis/55/55134/tde-11092017-151000/
url http://www.teses.usp.br/teses/disponiveis/55/55134/tde-11092017-151000/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Release the content for public access.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1815256873812623360