Annotation of Named Entities in the Gaming domain

Bibliographic details
Main author: Silva, Rita
Publication date: 2022
Other authors: Cabarrão, Vera; Mendes, Sara
Document type: Article
Language: por
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: https://doi.org/10.26334/2183-9077/rapln9ano2022a15
Abstract: This paper analyses how including gaming entities affects the performance of an existing Named Entity Recognition (NER) system for English, in an industrial machine translation setting for customer support content. So that the NER model could identify and classify gaming entities, three new categories were created and added to the existing annotation typology: GAME NAME, GAME FEATURE and GAME CURRENCY. A set of reference annotations (gold standard) was also developed, which allows both training the NER system and evaluating its performance and accuracy more objectively, namely by counting the entities that the system identifies and categorises correctly. In this work, 6618 sentences from 7 gaming clients were manually annotated; they constitute the gold standard that was then used to train and evaluate the NER system. The experiments assessed whether the existing NER system performed better when trained with the gold standard created specifically for the gaming domain, and whether it could handle the new gaming categories added to the typology by identifying and categorising them correctly. The results of both experiments were positive, demonstrating the relevance of greater investment in domain-specific entity recognition, particularly in customer service text processing.
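The evaluation idea mentioned in the abstract, counting the entities that the system both identifies and categorises correctly against a gold standard, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the span format, the exact-match criterion, the underscore label spellings, the `score` function name, and the toy data.

```python
from collections import Counter
from typing import Dict, Set, Tuple

# Hypothetical span format: (sentence_id, start_token, end_token, label).
Span = Tuple[int, int, int, str]


def score(gold: Set[Span], predicted: Set[Span]) -> Dict[str, Dict[str, float]]:
    """Exact-match precision, recall and F1 per label.

    A predicted entity counts as correct only if both its boundaries
    and its label match a gold-standard annotation.
    """
    correct = Counter(span[3] for span in gold & predicted)
    n_gold = Counter(span[3] for span in gold)
    n_pred = Counter(span[3] for span in predicted)

    results = {}
    for label in sorted(set(n_gold) | set(n_pred)):
        p = correct[label] / n_pred[label] if n_pred[label] else 0.0
        r = correct[label] / n_gold[label] if n_gold[label] else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        results[label] = {"precision": p, "recall": r, "f1": f1}
    return results


# Toy data (invented): the system mislabels one GAME_CURRENCY as GAME_FEATURE.
gold = {
    (0, 3, 4, "GAME_NAME"),
    (0, 7, 7, "GAME_CURRENCY"),
    (1, 2, 3, "GAME_FEATURE"),
}
predicted = {
    (0, 3, 4, "GAME_NAME"),
    (0, 7, 7, "GAME_FEATURE"),
    (1, 2, 3, "GAME_FEATURE"),
}

results = score(gold, predicted)
for label, metrics in results.items():
    print(label, metrics)
```

Exact matching is the strictest choice; a partial-overlap criterion would give different counts, and the abstract does not say which one the authors used.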
id RCAP_c3cc1e9cfe0724ed36895dd5e69b5486
oai_identifier_str oai:ojs3.ojs.apl.pt:article/149
network_acronym_str RCAP
repository_id_str 7160
dc.title.none.fl_str_mv Annotation of Named Entities in the Gaming domain
Anotação de Entidades Mencionadas na área do Gaming
dc.contributor.author.fl_str_mv Silva, Rita
Cabarrão, Vera
Mendes, Sara
dc.subject.por.fl_str_mv Entidades Mencionadas
Reconhecimento de Entidades Mencionadas
Anotação
Named Entities
Named Entity Recognition
Annotation
Gaming
dc.date.none.fl_str_mv 2022-10-25
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.identifier.uri.fl_str_mv https://doi.org/10.26334/2183-9077/rapln9ano2022a15
dc.language.iso.fl_str_mv por
dc.relation.none.fl_str_mv https://ojs.apl.pt/index.php/rapl/article/view/149
https://ojs.apl.pt/index.php/rapl/article/view/149/142
dc.rights.driver.fl_str_mv Copyright (c) 2022 Rita Silva, Vera Cabarrão, Sara Mendes
info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Associação Portuguesa de Linguística
dc.source.none.fl_str_mv Revista da Associação Portuguesa de Linguística; No. 9 (2022): Journal of the Portuguese Linguistics Association; 223-235
Revista da Associação Portuguesa de Linguística; N.º 9 (2022): Revista da Associação Portuguesa de Linguística; 223-235
2183-9077
10.26334/2183-9077/rapln9ano2022
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP