A Multi- versus a Single-classifier Approach for the Identification of Modality in the Portuguese Language.

Bibliographic details
Main author: Sequeira, João
Publication date: 2018
Other authors: Gonçalves, Teresa, Quaresma, Paulo, Mendes, Amália, Hendrickx, Iris
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10451/37352
Abstract: This work presents a comparative study of two approaches to building an automatic classification system for Modality values in the Portuguese language. One approach uses a single multi-class classifier with the full dataset, which includes eleven modal verbs; the other builds a separate classifier for each verb. Performance is measured using precision, recall and F1. Because the dataset is unbalanced, a weighted average was calculated for each metric. We use support vector machines as our classifier and experimented with various SVM kernels to find the optimal classifier for the task at hand. We also experimented with several types of feature attributes representing parse tree information and compared these complex feature representations against a simple bag-of-words feature representation as a baseline. The best F1 values obtained are above 0.60, and the results indicate that there is no significant difference between the two approaches.
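The weighted averaging mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common support-weighted definition, in which each class's precision, recall and F1 is weighted by that class's share of the gold labels; the record does not state the exact formula the authors used. The function name `weighted_prf` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Support-weighted precision, recall and F1 (assumed definition)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    support = Counter(y_true)      # gold instances per class
    predicted = Counter(y_pred)    # predicted instances per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = recall = f1 = 0.0
    for label, n in support.items():
        # Per-class scores; a class that is never predicted gets precision 0.
        p = true_pos[label] / predicted[label] if predicted[label] else 0.0
        r = true_pos[label] / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        # Weight each class by its share of the gold labels.
        weight = n / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1


# Hypothetical, unbalanced toy labels (not data from the paper).
gold = ["epistemic", "epistemic", "epistemic", "deontic"]
pred = ["epistemic", "epistemic", "deontic", "deontic"]
p, r, f = weighted_prf(gold, pred)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

With this weighting, frequent classes dominate the reported score, which is why it is often preferred over an unweighted macro average when class sizes differ widely.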
id RCAP_1677a8bc38fd7b1f2e409c7db1e1d7ae
oai_identifier_str oai:repositorio.ul.pt:10451/37352
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
author Sequeira, João
author_facet Sequeira, João
Gonçalves, Teresa
Quaresma, Paulo
Mendes, Amália
Hendrickx, Iris
author_role author
author2 Gonçalves, Teresa
Quaresma, Paulo
Mendes, Amália
Hendrickx, Iris
author2_role author
author
author
author
dc.contributor.none.fl_str_mv Repositório da Universidade de Lisboa
dc.contributor.author.fl_str_mv Sequeira, João
Gonçalves, Teresa
Quaresma, Paulo
Mendes, Amália
Hendrickx, Iris
dc.subject.por.fl_str_mv Natural language processing
Modality
Feature selection
Support Vector Machines
topic Natural language processing
Modality
Feature selection
Support Vector Machines
publishDate 2018
dc.date.none.fl_str_mv 2018
2018-01-01T00:00:00Z
2019-03-07T14:22:05Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10451/37352
url http://hdl.handle.net/10451/37352
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Sequeira, João, Teresa Gonçalves, Paulo Quaresma, Amália Mendes, Iris Hendrickx (2018) A Multi-versus a Single-classifier Approach for the Identification of Modality in the Portuguese Language. In Proceedings of the 11th Language Resources and Evaluation Conference - LREC’2018, 7-12 May 2018, Miyazaki, Japan, pp. 1000-1005.
979-10-95546-00-9
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv European Language Resources Association
publisher.none.fl_str_mv European Language Resources Association
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799134447366307840