Methods for identifying aspects in opinion texts in Portuguese: the case of implicit aspects and their typological analysis

Bibliographic details
Main author: Machado, Mateus Tarcinalli
Publication date: 2023
Document type: Thesis
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da USP
Full text: https://www.teses.usp.br/teses/disponiveis/55/55134/tde-16012024-151720/
Abstract: This work studies methods for identifying aspects, which are applications derived from aspect-based sentiment analysis and from the area of natural language processing. Aspect-based sentiment analysis examines opinionated texts (texts containing opinions) in order to identify sentiments and aspects (characteristics) of a given entity (products, services, among others) and to relate them to each other. This work focuses on the first stage, the identification of aspects, with emphasis on so-called implicit aspects, that is, those not explicitly mentioned in the texts. In the course of this work, we implemented, adapted, and improved several aspect extraction methods, including frequency-based, rule-based, hybrid, and machine learning methods, as well as pre-trained language models and large language models. We also developed novel resources: a corpus extension annotated with implicit aspects, lexicons of nominal forms, and a typology of implicit aspects. The typology gave a better understanding of the knowledge needed to relate an implicit aspect clue to an aspect, and applying it gave a clear view of the strengths and weaknesses of each implemented method in detecting implicit aspects.
id USP_974d12163d0f0a724037f86bf921ffbd
oai_identifier_str oai:teses.usp.br:tde-16012024-151720
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str 2721
dc.title.none.fl_str_mv Methods for identifying aspects in opinion texts in Portuguese: the case of implicit aspects and their typological analysis
Métodos de identificação de aspectos em textos de opinião em português: o caso dos aspectos implícitos e sua análise tipológica
author Machado, Mateus Tarcinalli
author_facet Machado, Mateus Tarcinalli
author_role author
dc.contributor.none.fl_str_mv Pardo, Thiago Alexandre Salgueiro
dc.contributor.author.fl_str_mv Machado, Mateus Tarcinalli
dc.subject.por.fl_str_mv Aspectos explícitos
Aspectos implícitos
Explicit aspects
Identificação de aspectos
Identification of aspects
Implicit aspects
Mineração de opinião
Opinion mining
publishDate 2023
dc.date.none.fl_str_mv 2023-10-16
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://www.teses.usp.br/teses/disponiveis/55/55134/tde-16012024-151720/
url https://www.teses.usp.br/teses/disponiveis/55/55134/tde-16012024-151720/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Release the content for public access.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1815257012247724032