Part of speech and tagging in Computational Linguistics
Main author: | Oliveira, Claudia |
---|---|
Publication date: | 2021 |
Other authors: | Freitas, Maria Claudia de |
Document type: | Article |
Language: | por |
Source title: | Calidoscópio (Online) |
Full text: | https://revistas.unisinos.br/index.php/calidoscopio/article/view/6003 |
Abstract: | The categorization of words according to features that determine the position they occupy in the language system is a formal requirement of any grammatical description. In Computational Linguistics, tagging is the assignment of categories to portions of a text. The objective of this paper is to discuss, in the context of Computational Linguistics, the source of the linguistic information in POS (part-of-speech) tagging. As we present a critical view of this process, it becomes clear that the linguist has a very relevant part to play in the elaboration of theoretically sound tagsets for Natural Language Processing (NLP). We focus, in particular, on three part-of-speech-related language phenomena that have notoriously been overlooked in linguistic studies: the participle verb form, the denotative words, and the appositive. Key words: tagset, participle, appositive, denotative words, computational linguistics, NLP. |
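
The abstract's definition of tagging as the assignment of categories to portions of a text can be illustrated with a minimal sketch. The tagset (`DET`, `NOUN`, `AUX`, `PTCP`, `UNK`), the lexicon, and the function name `tag` below are hypothetical and invented for illustration only; they are not taken from the paper, and a plain dictionary lookup stands in for whatever tagging method a real system would use.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon mapping word forms to tags from an invented tagset.
# Neither the tags nor the entries come from the paper.
LEXICON: Dict[str, str] = {
    "the": "DET",
    "door": "NOUN",
    "was": "AUX",
    "closed": "PTCP",  # participle, one of the forms the abstract singles out
}


def tag(tokens: List[str], default: str = "UNK") -> List[Tuple[str, str]]:
    """Assign one tag to each token by case-insensitive lexicon lookup."""
    return [(tok, LEXICON.get(tok.lower(), default)) for tok in tokens]


if __name__ == "__main__":
    print(tag("The door was closed".split()))
```

A lookup of this kind assigns the same tag to "closed" in every context, so the choice of which categories the tagset offers for such a form has to be made before any tagging happens. That is the tagset design question the abstract raises.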
id |
Unisinos-3_e61360c0e74e1df8e0d6c7f6bb267853 |
---|---|
oai_identifier_str |
oai:ojs2.revistas.unisinos.br:article/6003 |
network_acronym_str |
Unisinos-3 |
network_name_str |
Calidoscópio (Online) |
repository_id_str |
|
spelling |
Part of speech and tagging in Computational Linguistics Classes de palavras e etiquetagem na Lingüística Computacional The categorization of words according to features that determine the position they occupy in the language system is a formal requirement of any grammatical description. In Computational Linguistics, tagging is the assignment of categories to portions of a text. The objective of this paper is to discuss, in the context of Computational Linguistics, the source of the linguistic information in POS (part-of-speech) tagging. As we present a critical view of this process, it becomes clear that the linguist has a very relevant part to play in the elaboration of theoretically sound tagsets for Natural Language Processing (NLP). We focus, in particular, on three part-of-speech-related language phenomena that have notoriously been overlooked in linguistic studies: the participle verb form, the denotative words, and the appositive. Key words: tagset, participle, appositive, denotative words, computational linguistics, NLP. The categorization of words according to features that position them within the linguistic system is a formal element underlying any grammatical description. In Computational Linguistics, tagging consists of assigning categories to portions of text. The objective of this article is to discuss, in the context of Computational Linguistics, the provenance of the linguistic information in POS (part-of-speech) tagsets. Throughout the discussion we show the relevance of the linguist's participation in the theoretically well-founded compilation of the tagsets used in Natural Language Processing (NLP) practice. We look specifically at phenomena that are related to part-of-speech annotation but have received secondary treatment from linguistics, such as the nominal forms of the verb, notably the participle, the denotative words, and the appositive. Keywords: tagset, participle, appositive, denotative words, computational linguistics, NLP. Unisinos 2021-05-27 info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion application/pdf https://revistas.unisinos.br/index.php/calidoscopio/article/view/6003 Calidoscópio; Vol. 4 No. 3 (2006): September/December; 179-188 Calidoscópio; v. 4 n. 3 (2006): Setembro/Dezembro; 179-188 2177-6202 reponame:Calidoscópio (Online) instname:Universidade do Vale do Rio dos Sinos (UNISINOS) instacron:Unisinos por https://revistas.unisinos.br/index.php/calidoscopio/article/view/6003/3179 Copyright (c) 2021 Calidoscópio info:eu-repo/semantics/openAccess Oliveira, Claudia Freitas, Maria Claudia de 2021-05-27T18:20:42Z oai:ojs2.revistas.unisinos.br:article/6003 Revista https://revistas.unisinos.br/index.php/calidoscopio PUB https://revistas.unisinos.br/index.php/calidoscopio/oai cmira@unisinos.br || cmira@unisinos.br 2177-6202 2177-6202 opendoar: 2021-05-27T18:20:42 Calidoscópio (Online) - Universidade do Vale do Rio dos Sinos (UNISINOS) false |
dc.title.none.fl_str_mv |
Part of speech and tagging in Computational Linguistics Classes de palavras e etiquetagem na Lingüística Computacional |
title |
Part of speech and tagging in Computational Linguistics |
spellingShingle |
Part of speech and tagging in Computational Linguistics Oliveira, Claudia |
title_short |
Part of speech and tagging in Computational Linguistics |
title_full |
Part of speech and tagging in Computational Linguistics |
title_fullStr |
Part of speech and tagging in Computational Linguistics |
title_full_unstemmed |
Part of speech and tagging in Computational Linguistics |
title_sort |
Part of speech and tagging in Computational Linguistics |
author |
Oliveira, Claudia |
author_facet |
Oliveira, Claudia Freitas, Maria Claudia de |
author_role |
author |
author2 |
Freitas, Maria Claudia de |
author2_role |
author |
dc.contributor.author.fl_str_mv |
Oliveira, Claudia Freitas, Maria Claudia de |
description |
The categorization of words according to features that determine the position they occupy in the language system is a formal requirement of any grammatical description. In Computational Linguistics, tagging is the assignment of categories to portions of a text. The objective of this paper is to discuss, in the context of Computational Linguistics, the source of the linguistic information in POS (part-of-speech) tagging. As we present a critical view of this process, it becomes clear that the linguist has a very relevant part to play in the elaboration of theoretically sound tagsets for Natural Language Processing (NLP). We focus, in particular, on three part-of-speech-related language phenomena that have notoriously been overlooked in linguistic studies: the participle verb form, the denotative words, and the appositive. Key words: tagset, participle, appositive, denotative words, computational linguistics, NLP. |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-05-27 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://revistas.unisinos.br/index.php/calidoscopio/article/view/6003 |
url |
https://revistas.unisinos.br/index.php/calidoscopio/article/view/6003 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.relation.none.fl_str_mv |
https://revistas.unisinos.br/index.php/calidoscopio/article/view/6003/3179 |
dc.rights.driver.fl_str_mv |
Copyright (c) 2021 Calidoscópio info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Copyright (c) 2021 Calidoscópio |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Unisinos |
publisher.none.fl_str_mv |
Unisinos |
dc.source.none.fl_str_mv |
Calidoscópio; Vol. 4 No. 3 (2006): September/December; 179-188 Calidoscópio; v. 4 n. 3 (2006): Setembro/Dezembro; 179-188 2177-6202 reponame:Calidoscópio (Online) instname:Universidade do Vale do Rio dos Sinos (UNISINOS) instacron:Unisinos |
instname_str |
Universidade do Vale do Rio dos Sinos (UNISINOS) |
instacron_str |
Unisinos |
institution |
Unisinos |
reponame_str |
Calidoscópio (Online) |
collection |
Calidoscópio (Online) |
repository.name.fl_str_mv |
Calidoscópio (Online) - Universidade do Vale do Rio dos Sinos (UNISINOS) |
repository.mail.fl_str_mv |
cmira@unisinos.br || cmira@unisinos.br |
_version_ |
1792203885795868672 |