Part of speech and tagging in Computational Linguistics

Bibliographic details
Main author: Oliveira, Claudia
Publication date: 2021
Other authors: Freitas, Maria Claudia de
Document type: Article
Language: Portuguese (por)
Source title: Calidoscópio (Online)
Full text: https://revistas.unisinos.br/index.php/calidoscopio/article/view/6003
Abstract: The categorization of words according to features that determine the position they occupy in the language system is a formal requirement of any grammatical description. In Computational Linguistics, tagging is the assignment of categories to portions of a text. The objective of this paper is to discuss, in the context of Computational Linguistics, the source of linguistic information in POS (part-of-speech) tagging. As we present a critical view of this process, it becomes clear that the linguist has a very relevant part to play in the elaboration of theoretically sound tagsets for Natural Language Processing. We focus, in particular, on three part-of-speech-related language phenomena that have notoriously been overlooked in linguistic studies: the participle verb form, denotative words, and the appositive. Keywords: tagset, participle, appositive, denotative words, computational linguistics, NLP.
id Unisinos-3_e61360c0e74e1df8e0d6c7f6bb267853
oai_identifier_str oai:ojs2.revistas.unisinos.br:article/6003
network_acronym_str Unisinos-3
network_name_str Calidoscópio (Online)
repository_id_str
dc.title.none.fl_str_mv Part of speech and tagging in Computational Linguistics
Classes de palavras e etiquetagem na Lingüística Computacional
author Oliveira, Claudia
author_facet Oliveira, Claudia
Freitas, Maria Claudia de
author_role author
author2 Freitas, Maria Claudia de
author2_role author
dc.contributor.author.fl_str_mv Oliveira, Claudia
Freitas, Maria Claudia de
publishDate 2021
dc.date.none.fl_str_mv 2021-05-27
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://revistas.unisinos.br/index.php/calidoscopio/article/view/6003
url https://revistas.unisinos.br/index.php/calidoscopio/article/view/6003
dc.language.iso.fl_str_mv por
language por
dc.relation.none.fl_str_mv https://revistas.unisinos.br/index.php/calidoscopio/article/view/6003/3179
dc.rights.driver.fl_str_mv Copyright (c) 2021 Calidoscópio
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Copyright (c) 2021 Calidoscópio
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Unisinos
publisher.none.fl_str_mv Unisinos
dc.source.none.fl_str_mv Calidoscópio; Vol. 4 No. 3 (2006): September/December; 179-188
2177-6202
reponame:Calidoscópio (Online)
instname:Universidade do Vale do Rio dos Sinos (UNISINOS)
instacron:Unisinos
instname_str Universidade do Vale do Rio dos Sinos (UNISINOS)
instacron_str Unisinos
institution Unisinos
reponame_str Calidoscópio (Online)
collection Calidoscópio (Online)
repository.name.fl_str_mv Calidoscópio (Online) - Universidade do Vale do Rio dos Sinos (UNISINOS)
repository.mail.fl_str_mv cmira@unisinos.br || cmira@unisinos.br
_version_ 1792203885795868672