Aprendizado profundo para classificação de sentimentos em microblogs

Graciano, Gabriel Franco Dias

Bibliographic details
Main author:	Graciano, Gabriel Franco Dias
Publication date:	2019
Document type:	Undergraduate thesis (Trabalho de conclusão de curso)
Language:	Portuguese (por)
Source title:	Repositório Institucional da UFU
Full text:	https://repositorio.ufu.br/handle/123456789/26220
Abstract:	The Internet is today one of the leading means of communication and information sharing. In Brazil and in several other countries, the vast majority of the population uses it for a variety of activities. Examples include social networks and microblogs, environments where people frequently express opinions and discuss products, services, or other subjects of general interest. In this context, sentiment analysis labels the opinions expressed in such venues as positive, negative, or neutral, using natural language processing (NLP) and machine learning algorithms. The objective of this study was to investigate the application of deep learning techniques to sentiment classification on a Twitter corpus, considering subjects of interest to the population. Specifically, the study considered opinions expressed about the 2018 Brazilian presidential election. Data were collected with Tweepy, a Python library for accessing the application programming interface (API) provided by Twitter, on every day on which a televised debate took place. The dataset was preprocessed using NLP techniques. For the experiments, a relevant set of classification algorithms was selected from the literature (naive Bayes, decision tree, logistic regression, and support vector machines) and compared with deep neural networks. The results showed that logistic regression achieved the best performance among the traditional algorithms, with an average accuracy of 54%. The predictive performance of the deep neural network was equivalent, reaching the same 54% average accuracy under a particular parameter setting. This result is promising because only a small number of parameter settings were studied, which leaves considerable room to improve the technique's performance. In addition, the poor performance of other techniques, such as naive Bayes, suggests that the number of labeled tweets (Twitter posts) in the dataset is still small and that the overall performance of the classifiers could be improved by labeling more of them.
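
The abstract reports an "averaged accuracy" for each classifier but does not say how the average was taken. The sketch below shows one common reading, assumed here purely for illustration: the unweighted mean of per-fold accuracies over three sentiment labels. The function names and the toy labels are hypothetical and are not taken from the thesis.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def averaged_accuracy(folds):
    """Unweighted mean of per-fold accuracies (an assumed averaging scheme)."""
    return mean(accuracy(y_true, y_pred) for y_true, y_pred in folds)


# Hypothetical toy folds of (gold labels, predicted labels); NOT data from the study.
toy_folds = [
    (["positive", "negative", "neutral", "positive"],
     ["positive", "negative", "positive", "positive"]),
    (["negative", "neutral"],
     ["negative", "positive"]),
]

print(averaged_accuracy(toy_folds))
```

With three classes, a classifier that always predicts the most frequent label can already reach a sizeable accuracy, so a figure such as 54% is best read against the label distribution of the dataset.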

Item metadata

id	UFU_de27028e26a8a116736decb40761e3df
oai_identifier_str	oai:repositorio.ufu.br:123456789/26220
network_acronym_str	UFU
network_name_str	Repositório Institucional da UFU
repository_id_str
dc.title.none.fl_str_mv	Aprendizado profundo para classificação de sentimentos em microblogs; Deep learning for sentiment classification in microblogs
title	Aprendizado profundo para classificação de sentimentos em microblogs
author	Graciano, Gabriel Franco Dias
author_facet	Graciano, Gabriel Franco Dias
author_role	author
dc.contributor.none.fl_str_mv	Carneiro, Murillo Guimarães (http://lattes.cnpq.br/8158868389973535); Lopes, Carlos Roberto (http://lattes.cnpq.br/6737493567462425); Martins, Luiz Gustavo Almeida (http://lattes.cnpq.br/2546751023256424)
dc.contributor.author.fl_str_mv	Graciano, Gabriel Franco Dias
dc.subject.por.fl_str_mv	Aprendizado profundo; Deep learning; Aprendizado de máquina; Machine learning; Análise de sentimentos; Sentiment analysis; Redes neurais; Neural networks; Processamento de línguas naturais; Natural language processing; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO
publishDate	2019
dc.date.none.fl_str_mv	2019-07-22T13:24:00Z 2019-07-22T13:24:00Z 2019-07-12
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/bachelorThesis
format	bachelorThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	GRACIANO, Gabriel Franco Dias. Aprendizado profundo para classificação de sentimentos em microblogs. 2019. 48 f. Trabalho de Conclusão de Curso (Graduação em Sistemas de Informação) - Universidade Federal de Uberlândia, Monte Carmelo, 2019. https://repositorio.ufu.br/handle/123456789/26220
identifier_str_mv	GRACIANO, Gabriel Franco Dias. Aprendizado profundo para classificação de sentimentos em microblogs. 2019. 48 f. Trabalho de Conclusão de Curso (Graduação em Sistemas de Informação) - Universidade Federal de Uberlândia, Monte Carmelo, 2019.
url	https://repositorio.ufu.br/handle/123456789/26220
dc.language.iso.fl_str_mv	por
language	por
dc.rights.driver.fl_str_mv	http://creativecommons.org/publicdomain/zero/1.0/ info:eu-repo/semantics/openAccess
rights_invalid_str_mv	http://creativecommons.org/publicdomain/zero/1.0/
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Universidade Federal de Uberlândia; Brasil; Sistemas de Informação
publisher.none.fl_str_mv	Universidade Federal de Uberlândia; Brasil; Sistemas de Informação
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFU; instname:Universidade Federal de Uberlândia (UFU); instacron:UFU
instname_str	Universidade Federal de Uberlândia (UFU)
instacron_str	UFU
institution	UFU
reponame_str	Repositório Institucional da UFU
collection	Repositório Institucional da UFU
repository.name.fl_str_mv	Repositório Institucional da UFU - Universidade Federal de Uberlândia (UFU)
repository.mail.fl_str_mv	diinf@dirbi.ufu.br
_version_	1813711332343545856
