Unsupervised Dialogue Act Classification with Optimum-Path Forest
Main Author: | Ribeiro, Luiz Carlos Felix [UNESP] |
---|---|
Publication Date: | 2019 |
Other Authors: | Papa, João Paulo [UNESP] |
Format: | Conference object |
Language: | eng |
Source: | Repositório Institucional da UNESP |
Download full: | http://dx.doi.org/10.1109/SIBGRAPI.2018.00010 http://hdl.handle.net/11449/190145 |
Summary: | Dialogue Act classification is a relevant problem for the Natural Language Processing field either as a standalone task or when used as input for downstream applications. Despite its importance, most of the existing approaches rely on supervised techniques, which depend on annotated samples, making it difficult to take advantage of the increasing amount of data available in different domains. In this paper, we briefly review the most commonly used datasets to evaluate Dialogue Act classification approaches and introduce the Optimum-Path Forest (OPF) classifier to this task. Instead of using its original strategy to determine the corresponding class for each cluster, we use a modified version based on majority voting, named M-OPF, which yields good results when compared to k-means and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), according to accuracy and V-measure. We also show that M-OPF, and consequently OPF, are less sensitive to hyper-parameter tuning when compared to HDBSCAN. |
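The summary mentions assigning a class to each cluster by majority voting and scoring the result with accuracy and V-measure. This record does not describe the actual M-OPF procedure, so the following is only a minimal sketch of those two generic steps, assuming cluster assignments have already been produced by some clustering algorithm. The function names and the toy dialogue act labels are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict
from math import log


def majority_vote_mapping(cluster_ids, true_labels):
    """Map each cluster id to the most frequent true label among its members."""
    votes = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        votes[cluster][label] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in votes.items()}


def accuracy(cluster_ids, true_labels):
    """Fraction of samples whose cluster's majority label equals their own label."""
    mapping = majority_vote_mapping(cluster_ids, true_labels)
    hits = sum(mapping[c] == y for c, y in zip(cluster_ids, true_labels))
    return hits / len(true_labels)


def _entropy(counts, total):
    """Shannon entropy (natural log) of a distribution given as raw counts."""
    return -sum((n / total) * log(n / total) for n in counts if n > 0)


def v_measure(cluster_ids, true_labels):
    """Harmonic mean of homogeneity and completeness (beta = 1)."""
    n = len(true_labels)
    cluster_sizes = Counter(cluster_ids)
    class_sizes = Counter(true_labels)
    joint = Counter(zip(cluster_ids, true_labels))
    h_class = _entropy(class_sizes.values(), n)
    h_cluster = _entropy(cluster_sizes.values(), n)
    # Conditional entropies H(class | cluster) and H(cluster | class).
    h_class_given_cluster = -sum(
        (m / n) * log(m / cluster_sizes[c]) for (c, _), m in joint.items()
    )
    h_cluster_given_class = -sum(
        (m / n) * log(m / class_sizes[y]) for (_, y), m in joint.items()
    )
    homogeneity = 1.0 if h_class == 0 else 1.0 - h_class_given_cluster / h_class
    completeness = 1.0 if h_cluster == 0 else 1.0 - h_cluster_given_class / h_cluster
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Toy example: three clusters over eight utterances (labels are illustrative only).
clusters = [0, 0, 0, 1, 1, 2, 2, 2]
labels = ["question", "question", "statement", "statement",
          "statement", "backchannel", "backchannel", "question"]
print(majority_vote_mapping(clusters, labels))
print(accuracy(clusters, labels))
print(v_measure(clusters, labels))
```

Note that majority voting uses the annotated labels only to name and score the clusters after they are formed; the clustering step itself stays unsupervised.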
id |
UNSP_3bfa742dbaf9de0e339d96caebb48857 |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/190145 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
Unsupervised Dialogue Act Classification with Optimum-Path Forest; Clustering; Dialog Act; Optimum Path Forest; Dialogue Act classification is a relevant problem for the Natural Language Processing field either as a standalone task or when used as input for downstream applications. Despite its importance, most of the existing approaches rely on supervised techniques, which depend on annotated samples, making it difficult to take advantage of the increasing amount of data available in different domains. In this paper, we briefly review the most commonly used datasets to evaluate Dialogue Act classification approaches and introduce the Optimum-Path Forest (OPF) classifier to this task. Instead of using its original strategy to determine the corresponding class for each cluster, we use a modified version based on majority voting, named M-OPF, which yields good results when compared to k-means and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), according to accuracy and V-measure. We also show that M-OPF, and consequently OPF, are less sensitive to hyper-parameter tuning when compared to HDBSCAN.; Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP); Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); Department of Computing São Paulo State University - UNESP; Department of Computing São Paulo State University - UNESP; FAPESP: #2013/07375-0; FAPESP: #2014/12236-1; FAPESP: #2014/16250-9; FAPESP: #2016/19403-6; CNPq: #307066/2017-7; Universidade Estadual Paulista (Unesp); Ribeiro, Luiz Carlos Felix [UNESP]; Papa, João Paulo [UNESP]; 2019-10-06T17:03:44Z; 2019-10-06T17:03:44Z; 2019-01-15; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/conferenceObject; 25-32; http://dx.doi.org/10.1109/SIBGRAPI.2018.00010; Proceedings - 31st Conference on Graphics, Patterns and Images, SIBGRAPI 2018, p. 25-32.; http://hdl.handle.net/11449/190145; 10.1109/SIBGRAPI.2018.00010; 2-s2.0-85062206998; Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; Proceedings - 31st Conference on Graphics, Patterns and Images, SIBGRAPI 2018; info:eu-repo/semantics/openAccess; 2024-04-23T16:11:33Z; oai:repositorio.unesp.br:11449/190145; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-04-23T16:11:33; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
dc.title.none.fl_str_mv |
Unsupervised Dialogue Act Classification with Optimum-Path Forest |
author |
Ribeiro, Luiz Carlos Felix [UNESP] |
author_facet |
Ribeiro, Luiz Carlos Felix [UNESP] Papa, João Paulo [UNESP] |
author_role |
author |
author2 |
Papa, João Paulo [UNESP] |
author2_role |
author |
dc.contributor.none.fl_str_mv |
Universidade Estadual Paulista (Unesp) |
dc.contributor.author.fl_str_mv |
Ribeiro, Luiz Carlos Felix [UNESP] Papa, João Paulo [UNESP] |
dc.subject.por.fl_str_mv |
Clustering Dialog Act Optimum Path Forest |
topic |
Clustering Dialog Act Optimum Path Forest |
publishDate |
2019 |
dc.date.none.fl_str_mv |
2019-10-06T17:03:44Z 2019-10-06T17:03:44Z 2019-01-15 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/conferenceObject |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1109/SIBGRAPI.2018.00010 Proceedings - 31st Conference on Graphics, Patterns and Images, SIBGRAPI 2018, p. 25-32. http://hdl.handle.net/11449/190145 10.1109/SIBGRAPI.2018.00010 2-s2.0-85062206998 |
url |
http://dx.doi.org/10.1109/SIBGRAPI.2018.00010 http://hdl.handle.net/11449/190145 |
identifier_str_mv |
Proceedings - 31st Conference on Graphics, Patterns and Images, SIBGRAPI 2018, p. 25-32. 10.1109/SIBGRAPI.2018.00010 2-s2.0-85062206998 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Proceedings - 31st Conference on Graphics, Patterns and Images, SIBGRAPI 2018 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
25-32 |
dc.source.none.fl_str_mv |
Scopus reponame:Repositório Institucional da UNESP instname:Universidade Estadual Paulista (UNESP) instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
_version_ |
1797790225753178112 |