In search of reputation assessment: experiences with polarity classification in RepLab 2013

Saias, José

Bibliographic details
Main author:	Saias, José
Publication date:	2013
Document type:	Article
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10174/10352
Abstract:	The diue system uses a supervised Machine Learning approach for the polarity classification subtask of RepLab. We used the Python NLTK for preprocessing, including file parsing, text analysis and feature extraction. Our best solution is a mixed strategy that combines bag-of-words with a limited set of features based on sentiment lexicons and superficial text analysis. The system begins by applying tokenization and lemmatization. The content of each tweet is then analyzed and 18 features are obtained, related to the presence of polarized terms, negation before a polarized expression, and references to the entity. For the first run, learning and classification were performed with the Decision Tree algorithm from the NLTK framework. In the second run, we used a pipeline of two classifiers. The first classifier applies Naive Bayes to a bag-of-words feature model built from the 1500 most frequent words in the training set. The second classifier uses the features from the first run plus one additional feature holding the output of the first classifier. Our system's best result had an accuracy of 0.54694 and an F-measure of 0.31506.
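
The feature construction described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code: the toy lexicons, the function names, and the four surface features are hypothetical (the abstract mentions 18 features and real sentiment lexicons but does not list them), and lemmatization is omitted. The sketch uses only the Python standard library.

```python
import re
from collections import Counter

# Hypothetical toy lexicons; the actual sentiment lexicons are not given in the abstract.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}
NEGATIONS = {"not", "no", "never"}
POLAR = POSITIVE | NEGATIVE


def tokenize(text):
    """Lowercase and split into word tokens (lemmatization omitted here)."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(tokenized_tweets, n=1500):
    """Return the n most frequent words across the training tweets."""
    counts = Counter(tok for toks in tokenized_tweets for tok in toks)
    return [word for word, _ in counts.most_common(n)]


def bow_features(tokens, vocabulary):
    """Bag-of-words presence features for the first-stage classifier."""
    present = set(tokens)
    return {f"contains({w})": (w in present) for w in vocabulary}


def surface_features(tokens, entity):
    """Four illustrative lexicon/surface features (assumed, not the paper's 18)."""
    negated = any(
        prev in NEGATIONS and cur in POLAR
        for prev, cur in zip(tokens, tokens[1:])
    )
    return {
        "has_positive": any(t in POSITIVE for t in tokens),
        "has_negative": any(t in NEGATIVE for t in tokens),
        "negated_polar": negated,
        "mentions_entity": entity.lower() in tokens,
    }


def stacked_features(tokens, entity, first_stage_label):
    """Second-stage features: surface features plus the first classifier's output."""
    feats = surface_features(tokens, entity)
    feats["first_stage_label"] = first_stage_label
    return feats
```

The dictionaries returned here follow the feature-set format that NLTK classifiers accept, so `bow_features` output could plausibly be passed to a Naive Bayes classifier and `stacked_features` output to a second classifier, with the first classifier's predicted label supplied as `first_stage_label`.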

Item metadata

id	RCAP_610a5c8de81c5f42c210249151b9e218
oai_identifier_str	oai:dspace.uevora.pt:10174/10352
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	In search of reputation assessment: experiences with polarity classification in RepLab 2013
title	In search of reputation assessment: experiences with polarity classification in RepLab 2013
author	Saias, José
author_facet	Saias, José
author_role	author
dc.contributor.author.fl_str_mv	Saias, José
dc.subject.por.fl_str_mv	opinion mining; reputation assessment; NLP; Machine Learning
topic	opinion mining; reputation assessment; NLP; Machine Learning
publishDate	2013
dc.date.none.fl_str_mv	2013-09-01T00:00:00Z 2014-01-29T18:38:44Z 2014-01-29
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10174/10352
url	http://hdl.handle.net/10174/10352
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	José Saias. In search of reputation assessment: Experiences with polarity classification in RepLab 2013. In Pamela Forner, Roberto Navigli, and Dan Tufis, editors, CLEF 2013 Evaluation Labs and Workshop Online Working Notes - Online Reputation Management (RepLab), Valencia, Spain, September 2013. ISBN 978-88-904810-5-5 http://www.clef-initiative.eu/documents/71612/10fcd949-e5f0-4f00-8e01-cbd2a213e147 jsaias@uevora.pt 283
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	clef2013.org
publisher.none.fl_str_mv	clef2013.org
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_	1799136526373748736
