Information retrieval techniques applied to the development of a thesaurus
Main author: | GIL URDICIAIN, Blanca |
---|---|
Publication date: | 2014 |
Other authors: | Sánchez JIMÉNEZ, Rodrigo |
Document type: | Article |
Language: | por |
Source title: | Transinformação (Online) |
Full text: | https://periodicos.puc-campinas.edu.br/transinfo/article/view/6094 |
Abstract: | The article proposes applying a set of Information Retrieval techniques to the development of a thesaurus. The proposed ideas were applied to the selection of the terminology, the categorization of terms by creating clusters, and the establishment of semantic relationships between terms through semantic similarity, resulting in a Foreign Trade Thesaurus of 7,790 terms. From these results, we concluded that the techniques significantly simplified the task of obtaining the terminology and can improve the quality of the final thesaurus. In addition, the techniques enabled the analysis of the conditions of the collection for which the thesaurus is used and provided extra information that would be hard to obtain manually. |
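
The abstract mentions establishing semantic relationships between terms through semantic similarity, and the record's keywords name the vector space model. As a rough illustration only, the sketch below represents each term by its raw frequencies across a tiny document collection and compares terms by cosine similarity. The corpus, the function names (`term_vectors`, `cosine`, `related_terms`), and the 0.5 threshold are all hypothetical; this is not the authors' procedure, which the record does not describe in detail.

```python
import math
from collections import Counter

# Hypothetical toy collection; the article's real corpus is not reproduced here.
DOCS = [
    "export tariff customs duty",
    "import tariff customs declaration",
    "export licence shipping invoice",
    "shipping invoice freight",
]


def term_vectors(docs):
    """Map each term to its vector of raw frequencies across the documents."""
    counts = [Counter(doc.split()) for doc in docs]
    vocab = sorted({term for c in counts for term in c})
    return {term: [c[term] for c in counts] for term in vocab}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_terms(term, vectors, threshold=0.5):
    """Return (other_term, similarity) pairs at or above the threshold, best first.

    The threshold is an arbitrary illustrative cut-off, not a value from the article.
    """
    target = vectors[term]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != term
    ]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


VECTORS = term_vectors(DOCS)

if __name__ == "__main__":
    for other, score in related_terms("tariff", VECTORS):
        print(f"tariff ~ {other}: {score:.2f}")
```

In a thesaurus workflow, term pairs scoring above the cut-off could be passed to a human expert as candidate related-term links. The choice of term weighting and threshold would need tuning against the actual collection.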
id |
PUC_CAMP-4_daccdd24e3907d71f72d5647c92c5c12 |
---|---|
oai_identifier_str |
oai:ojs.periodicos.puc-campinas.edu.br:article/6094 |
network_acronym_str |
PUC_CAMP-4 |
network_name_str |
Transinformação (Online) |
repository_id_str |
|
dc.title.none.fl_str_mv |
Information retrieval techniques applied to the development of a thesaurus / Técnicas de recuperación de información aplicadas a la construcción de tesauros |
title |
Information retrieval techniques applied to the development of a thesaurus |
title_short |
Information retrieval techniques applied to the development of a thesaurus |
title_full |
Information retrieval techniques applied to the development of a thesaurus |
title_fullStr |
Information retrieval techniques applied to the development of a thesaurus |
title_full_unstemmed |
Information retrieval techniques applied to the development of a thesaurus |
title_sort |
Information retrieval techniques applied to the development of a thesaurus |
author |
GIL URDICIAIN, Blanca |
author_facet |
GIL URDICIAIN, Blanca; Sánchez JIMÉNEZ, Rodrigo |
author_role |
author |
author2 |
Sánchez JIMÉNEZ, Rodrigo |
author2_role |
author |
dc.contributor.author.fl_str_mv |
GIL URDICIAIN, Blanca; Sánchez JIMÉNEZ, Rodrigo |
dc.subject.por.fl_str_mv |
Thesaurus development; Clustering; Vector space model; Generalized vector space model; Latent semantic indexing model; Construcción de tesauros; Modelo de espacio vectorial; Modelo generalizado de espacio vectorial; Semántica latente |
topic |
Thesaurus development; Clustering; Vector space model; Generalized vector space model; Latent semantic indexing model; Construcción de tesauros; Modelo de espacio vectorial; Modelo generalizado de espacio vectorial; Semántica latente |
description |
The article proposes applying a set of Information Retrieval techniques to the development of a thesaurus. The proposed ideas were applied to the selection of the terminology, the categorization of terms by creating clusters, and the establishment of semantic relationships between terms through semantic similarity, resulting in a Foreign Trade Thesaurus of 7,790 terms. From these results, we concluded that the techniques significantly simplified the task of obtaining the terminology and can improve the quality of the final thesaurus. In addition, the techniques enabled the analysis of the conditions of the collection for which the thesaurus is used and provided extra information that would be hard to obtain manually. |
publishDate |
2014 |
dc.date.none.fl_str_mv |
2014-03-25 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; Peer-reviewed Article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://periodicos.puc-campinas.edu.br/transinfo/article/view/6094 |
url |
https://periodicos.puc-campinas.edu.br/transinfo/article/view/6094 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.relation.none.fl_str_mv |
https://periodicos.puc-campinas.edu.br/transinfo/article/view/6094/3807 |
dc.rights.driver.fl_str_mv |
Copyright (c) 2022 Transinformação; https://creativecommons.org/licenses/by/4.0; info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Copyright (c) 2022 Transinformação; https://creativecommons.org/licenses/by/4.0 |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Núcleo de Editoração - PUC-Campinas |
publisher.none.fl_str_mv |
Núcleo de Editoração - PUC-Campinas |
dc.source.none.fl_str_mv |
Transinformação; Vol. 26 No. 1 (2014); ISSN 2318-0889, 0103-3786; reponame:Transinformação (Online); instname:Pontifícia Universidade Católica de Campinas (PUC-CAMPINAS); instacron:PUC_CAMP |
instname_str |
Pontifícia Universidade Católica de Campinas (PUC-CAMPINAS) |
instacron_str |
PUC_CAMP |
institution |
PUC_CAMP |
reponame_str |
Transinformação (Online) |
collection |
Transinformação (Online) |
repository.name.fl_str_mv |
Transinformação (Online) - Pontifícia Universidade Católica de Campinas (PUC-CAMPINAS) |
repository.mail.fl_str_mv |
sbi.nucleodeeditoracao@puc-campinas.edu.br |
_version_ |
1799125985084309504 |