Methods for identifying aspects in opinion texts in Portuguese: the case of implicit aspects and their typological analysis
Main author: | Machado, Mateus Tarcinalli |
---|---|
Publication date: | 2023 |
Document type: | Thesis |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da USP |
Full text: | https://www.teses.usp.br/teses/disponiveis/55/55134/tde-16012024-151720/ |
Abstract: | This work studies methods for identifying aspects, which are applications derived from aspect-based sentiment analysis and from the area of natural language processing. Aspect-based sentiment analysis examines opinionated texts (texts containing opinions) in order to identify sentiments and aspects (characteristics) of a given entity (products, services, among others) and to relate them to each other. This work focuses on the first stage, the identification of aspects, with emphasis on so-called implicit aspects, that is, those not explicitly mentioned in the texts. In the course of this work, we implemented, adapted, and improved several aspect extraction methods, including frequency-based, rule-based, hybrid, and machine learning methods, as well as pre-trained language models and large language models. We also developed novel resources such as a corpus extension with annotation of implicit aspects, lexicons of nominal forms, and a typology of implicit aspects. The typology gave a better understanding of the knowledge needed to relate an implicit aspect clue to an aspect, and applying it gave a clear view of the strengths and weaknesses of each implemented method in detecting implicit aspects. |
id |
USP_974d12163d0f0a724037f86bf921ffbd |
---|---|
oai_identifier_str |
oai:teses.usp.br:tde-16012024-151720 |
network_acronym_str |
USP |
network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
repository_id_str |
2721 |
dc.title.none.fl_str_mv |
Methods for identifying aspects in opinion texts in Portuguese: the case of implicit aspects and their typological analysis / Métodos de identificação de aspectos em textos de opinião em português: o caso dos aspectos implícitos e sua análise tipológica |
title |
Methods for identifying aspects in opinion texts in Portuguese: the case of implicit aspects and their typological analysis |
author |
Machado, Mateus Tarcinalli |
author_facet |
Machado, Mateus Tarcinalli |
author_role |
author |
dc.contributor.none.fl_str_mv |
Pardo, Thiago Alexandre Salgueiro |
dc.contributor.author.fl_str_mv |
Machado, Mateus Tarcinalli |
dc.subject.por.fl_str_mv |
Aspectos explícitos; Aspectos implícitos; Explicit aspects; Identificação de aspectos; Identification of aspects; Implicit aspects; Mineração de opinião; Opinion mining |
topic |
Aspectos explícitos; Aspectos implícitos; Explicit aspects; Identificação de aspectos; Identification of aspects; Implicit aspects; Mineração de opinião; Opinion mining |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-10-16 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://www.teses.usp.br/teses/disponiveis/55/55134/tde-16012024-151720/ |
url |
https://www.teses.usp.br/teses/disponiveis/55/55134/tde-16012024-151720/ |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
|
dc.rights.driver.fl_str_mv |
Release the content for public access. info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Release the content for public access. |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.coverage.none.fl_str_mv |
|
dc.publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
instname_str |
Universidade de São Paulo (USP) |
instacron_str |
USP |
institution |
USP |
reponame_str |
Biblioteca Digital de Teses e Dissertações da USP |
collection |
Biblioteca Digital de Teses e Dissertações da USP |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
repository.mail.fl_str_mv |
virginia@if.usp.br || atendimento@aguia.usp.br |
_version_ |
1815257012247724032 |