Information retrieval techniques applied to the development of a thesaurus

Bibliographic details
Main author: GIL URDICIAIN, Blanca
Publication date: 2014
Other authors: Sánchez JIMÉNEZ, Rodrigo
Document type: Article
Language: por
Source title: Transinformação (Online)
Full text: https://periodicos.puc-campinas.edu.br/transinfo/article/view/6094
Abstract: This article proposes applying a set of Information Retrieval techniques to thesaurus development. The techniques were applied to three tasks: selecting the terminology, categorizing terms by clustering, and establishing semantic relationships between terms through semantic similarity. The result was a Foreign Trade Thesaurus of 7,790 terms. From these results, we concluded that the techniques significantly simplified the task of obtaining the terminology and can improve the quality of the final thesaurus. In addition, the techniques enable analysis of the conditions of the collection for which the thesaurus is used and provide extra information that would be hard to obtain manually.
id PUC_CAMP-4_daccdd24e3907d71f72d5647c92c5c12
oai_identifier_str oai:ojs.periodicos.puc-campinas.edu.br:article/6094
network_acronym_str PUC_CAMP-4
network_name_str Transinformação (Online)
repository_id_str
dc.title.none.fl_str_mv Information retrieval techniques applied to the development of a thesaurus
Técnicas de recuperación de información aplicadas a la construcción de tesauros
title Information retrieval techniques applied to the development of a thesaurus
spellingShingle Information retrieval techniques applied to the development of a thesaurus
GIL URDICIAIN, Blanca
Thesaurus development
Clustering
Vector space model
Generalized vector space model
Latent semantic indexing model
Construcción de tesauros
Clustering
Modelo de espacio vectorial
Modelo generalizado de espacio vectorial
Semántica latente
author GIL URDICIAIN, Blanca
author_facet GIL URDICIAIN, Blanca
Sánchez JIMÉNEZ, Rodrigo
author_role author
author2 Sánchez JIMÉNEZ, Rodrigo
author2_role author
dc.contributor.author.fl_str_mv GIL URDICIAIN, Blanca
Sánchez JIMÉNEZ, Rodrigo
dc.subject.por.fl_str_mv Thesaurus development
Clustering
Vector space model
Generalized vector space model
Latent semantic indexing model
Construcción de tesauros
Clustering
Modelo de espacio vectorial
Modelo generalizado de espacio vectorial
Semántica latente
topic Thesaurus development
Clustering
Vector space model
Generalized vector space model
Latent semantic indexing model
Construcción de tesauros
Clustering
Modelo de espacio vectorial
Modelo generalizado de espacio vectorial
Semántica latente
description This article proposes applying a set of Information Retrieval techniques to thesaurus development. The techniques were applied to three tasks: selecting the terminology, categorizing terms by clustering, and establishing semantic relationships between terms through semantic similarity. The result was a Foreign Trade Thesaurus of 7,790 terms. From these results, we concluded that the techniques significantly simplified the task of obtaining the terminology and can improve the quality of the final thesaurus. In addition, the techniques enable analysis of the conditions of the collection for which the thesaurus is used and provide extra information that would be hard to obtain manually.
publishDate 2014
dc.date.none.fl_str_mv 2014-03-25
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
Peer-reviewed Article
Artículo revisado por pares
Avaliado pelos Pares
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://periodicos.puc-campinas.edu.br/transinfo/article/view/6094
url https://periodicos.puc-campinas.edu.br/transinfo/article/view/6094
dc.language.iso.fl_str_mv por
language por
dc.relation.none.fl_str_mv https://periodicos.puc-campinas.edu.br/transinfo/article/view/6094/3807
dc.rights.driver.fl_str_mv Copyright (c) 2022 Transinformação
https://creativecommons.org/licenses/by/4.0
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Copyright (c) 2022 Transinformação
https://creativecommons.org/licenses/by/4.0
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Núcleo de Editoração - PUC-Campinas
publisher.none.fl_str_mv Núcleo de Editoração - PUC-Campinas
dc.source.none.fl_str_mv Transinformação; Vol. 26 No. 1 (2014)
Transinformação; Vol. 26 Núm. 1 (2014)
Transinformação; v. 26 n. 1 (2014)
2318-0889
0103-3786
reponame:Transinformação (Online)
instname:Pontifícia Universidade Católica de Campinas (PUC-CAMPINAS)
instacron:PUC_CAMP
instname_str Pontifícia Universidade Católica de Campinas (PUC-CAMPINAS)
instacron_str PUC_CAMP
institution PUC_CAMP
reponame_str Transinformação (Online)
collection Transinformação (Online)
repository.name.fl_str_mv Transinformação (Online) - Pontifícia Universidade Católica de Campinas (PUC-CAMPINAS)
repository.mail.fl_str_mv sbi.nucleodeeditoracao@puc-campinas.edu.br
_version_ 1799125985084309504