Unsupervised Dialogue Act Classification with Optimum-Path Forest

Ribeiro, Luiz Carlos Felix [UNESP]; Papa, João Paulo [UNESP]

Bibliographic Details
Main Author:	Ribeiro, Luiz Carlos Felix [UNESP]
Publication Date:	2019
Other Authors:	Papa, João Paulo [UNESP]
Format:	Conference object
Language:	eng
Source:	Repositório Institucional da UNESP
Download full:	http://dx.doi.org/10.1109/SIBGRAPI.2018.00010; http://hdl.handle.net/11449/190145
Summary:	Dialogue Act classification is a relevant problem for the Natural Language Processing field either as a standalone task or when used as input for downstream applications. Despite its importance, most of the existing approaches rely on supervised techniques, which depend on annotated samples, making it difficult to take advantage of the increasing amount of data available in different domains. In this paper, we briefly review the most commonly used datasets to evaluate Dialogue Act classification approaches and introduce the Optimum-Path Forest (OPF) classifier to this task. Instead of using its original strategy to determine the corresponding class for each cluster, we use a modified version based on majority voting, named M-OPF, which yields good results when compared to k-means and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), according to accuracy and V-measure. We also show that M-OPF, and consequently OPF, are less sensitive to hyper-parameter tuning when compared to HDBSCAN.

Item metadata

id	UNSP_3bfa742dbaf9de0e339d96caebb48857
oai_identifier_str	oai:repositorio.unesp.br:11449/190145
network_acronym_str	UNSP
network_name_str	Repositório Institucional da UNESP
repository_id_str	2946
spelling	Unsupervised Dialogue Act Classification with Optimum-Path Forest; Clustering; Dialog Act; Optimum Path Forest; Dialogue Act classification is a relevant problem for the Natural Language Processing field either as a standalone task or when used as input for downstream applications. Despite its importance, most of the existing approaches rely on supervised techniques, which depend on annotated samples, making it difficult to take advantage of the increasing amount of data available in different domains. In this paper, we briefly review the most commonly used datasets to evaluate Dialogue Act classification approaches and introduce the Optimum-Path Forest (OPF) classifier to this task. Instead of using its original strategy to determine the corresponding class for each cluster, we use a modified version based on majority voting, named M-OPF, which yields good results when compared to k-means and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), according to accuracy and V-measure. We also show that M-OPF, and consequently OPF, are less sensitive to hyper-parameter tuning when compared to HDBSCAN.; Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP); Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); Department of Computing São Paulo State University - UNESP; Department of Computing São Paulo State University - UNESP; FAPESP: #2013/07375-0; FAPESP: #2014/12236-1; FAPESP: #2014/16250-9; FAPESP: #2016/19403-6; CNPq: #307066/2017-7; Universidade Estadual Paulista (Unesp); Ribeiro, Luiz Carlos Felix [UNESP]; Papa, João Paulo [UNESP]; 2019-10-06T17:03:44Z; 2019-10-06T17:03:44Z; 2019-01-15; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/conferenceObject; 25-32; http://dx.doi.org/10.1109/SIBGRAPI.2018.00010; Proceedings - 31st Conference on Graphics, Patterns and Images, SIBGRAPI 2018, p. 25-32.; http://hdl.handle.net/11449/190145; 10.1109/SIBGRAPI.2018.00010; 2-s2.0-85062206998; Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; Proceedings - 31st Conference on Graphics, Patterns and Images, SIBGRAPI 2018; info:eu-repo/semantics/openAccess; 2024-04-23T16:11:33Z; oai:repositorio.unesp.br:11449/190145; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-04-23T16:11:33; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false
dc.title.none.fl_str_mv	Unsupervised Dialogue Act Classification with Optimum-Path Forest
title	Unsupervised Dialogue Act Classification with Optimum-Path Forest
spellingShingle	Unsupervised Dialogue Act Classification with Optimum-Path Forest; Ribeiro, Luiz Carlos Felix [UNESP]; Clustering; Dialog Act; Optimum Path Forest
title_short	Unsupervised Dialogue Act Classification with Optimum-Path Forest
title_full	Unsupervised Dialogue Act Classification with Optimum-Path Forest
title_fullStr	Unsupervised Dialogue Act Classification with Optimum-Path Forest
title_full_unstemmed	Unsupervised Dialogue Act Classification with Optimum-Path Forest
title_sort	Unsupervised Dialogue Act Classification with Optimum-Path Forest
author	Ribeiro, Luiz Carlos Felix [UNESP]
author_facet	Ribeiro, Luiz Carlos Felix [UNESP]; Papa, João Paulo [UNESP]
author_role	author
author2	Papa, João Paulo [UNESP]
author2_role	author
dc.contributor.none.fl_str_mv	Universidade Estadual Paulista (Unesp)
dc.contributor.author.fl_str_mv	Ribeiro, Luiz Carlos Felix [UNESP]; Papa, João Paulo [UNESP]
dc.subject.por.fl_str_mv	Clustering; Dialog Act; Optimum Path Forest
topic	Clustering; Dialog Act; Optimum Path Forest
description	Dialogue Act classification is a relevant problem for the Natural Language Processing field either as a standalone task or when used as input for downstream applications. Despite its importance, most of the existing approaches rely on supervised techniques, which depend on annotated samples, making it difficult to take advantage of the increasing amount of data available in different domains. In this paper, we briefly review the most commonly used datasets to evaluate Dialogue Act classification approaches and introduce the Optimum-Path Forest (OPF) classifier to this task. Instead of using its original strategy to determine the corresponding class for each cluster, we use a modified version based on majority voting, named M-OPF, which yields good results when compared to k-means and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), according to accuracy and V-measure. We also show that M-OPF, and consequently OPF, are less sensitive to hyper-parameter tuning when compared to HDBSCAN.
publishDate	2019
dc.date.none.fl_str_mv	2019-10-06T17:03:44Z; 2019-10-06T17:03:44Z; 2019-01-15
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/conferenceObject
format	conferenceObject
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://dx.doi.org/10.1109/SIBGRAPI.2018.00010; Proceedings - 31st Conference on Graphics, Patterns and Images, SIBGRAPI 2018, p. 25-32.; http://hdl.handle.net/11449/190145; 10.1109/SIBGRAPI.2018.00010; 2-s2.0-85062206998
url	http://dx.doi.org/10.1109/SIBGRAPI.2018.00010; http://hdl.handle.net/11449/190145
identifier_str_mv	Proceedings - 31st Conference on Graphics, Patterns and Images, SIBGRAPI 2018, p. 25-32.; 10.1109/SIBGRAPI.2018.00010; 2-s2.0-85062206998
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	Proceedings - 31st Conference on Graphics, Patterns and Images, SIBGRAPI 2018
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	25-32
dc.source.none.fl_str_mv	Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP
instname_str	Universidade Estadual Paulista (UNESP)
instacron_str	UNESP
institution	UNESP
reponame_str	Repositório Institucional da UNESP
collection	Repositório Institucional da UNESP
repository.name.fl_str_mv	Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
repository.mail.fl_str_mv
_version_	1797790225753178112

Similar Items