Information retrieval system using Multiwords Expressions (MWE) as descriptors

Bibliographic details
Main author: Silva, Edson Marchetti da
Publication date: 2012
Other authors: Souza, Renato Rocha
Document type: Article
Language: eng
Source title: Journal of Information Systems and Technology Management (Online)
Full text: https://www.revistas.usp.br/jistem/article/view/45596
Abstract: This paper proposes an alternative method for retrieving documents that uses Multiword Expressions (MWE) extracted from a document base as descriptors for searches in an Information Retrieval System (IRS). Unlike methods that treat the text as a set of words (a bag of words), the proposed method takes into account the characteristics of the physical structure of the document when extracting MWE. The set of terms pre-processed with an exhaustive algorithmic technique proposed by the authors is compared with the results obtained for thirteen different statistical association measures generated by the Ngram Statistics Package (NSP) software. The experiment was carried out on a corpus of documents in digital format.
id USP-33_932f04095b9cedda2a27c8bc86bf1254
oai_identifier_str oai:revistas.usp.br:article/45596
network_acronym_str USP-33
network_name_str Journal of Information Systems and Technology Management (Online)
repository_id_str
dc.title.none.fl_str_mv Information retrieval system using Multiwords Expressions (MWE) as descriptors
title Information retrieval system using Multiwords Expressions (MWE) as descriptors
spellingShingle Information retrieval system using Multiwords Expressions (MWE) as descriptors
Silva, Edson Marchetti da
Extraction of Expressions Multiwords
Measures of Association Statistics
Compared Search
Information Retrieval System
the Document Structure
title_short Information retrieval system using Multiwords Expressions (MWE) as descriptors
title_full Information retrieval system using Multiwords Expressions (MWE) as descriptors
title_fullStr Information retrieval system using Multiwords Expressions (MWE) as descriptors
title_full_unstemmed Information retrieval system using Multiwords Expressions (MWE) as descriptors
title_sort Information retrieval system using Multiwords Expressions (MWE) as descriptors
author Silva, Edson Marchetti da
author_facet Silva, Edson Marchetti da
Souza, Renato Rocha
author_role author
author2 Souza, Renato Rocha
author2_role author
dc.contributor.author.fl_str_mv Silva, Edson Marchetti da
Souza, Renato Rocha
dc.subject.por.fl_str_mv Extraction of Expressions Multiwords
Measures of Association Statistics
Compared Search
Information Retrieval System
the Document Structure
topic Extraction of Expressions Multiwords
Measures of Association Statistics
Compared Search
Information Retrieval System
the Document Structure
description This paper proposes an alternative method for retrieving documents that uses Multiword Expressions (MWE) extracted from a document base as descriptors for searches in an Information Retrieval System (IRS). Unlike methods that treat the text as a set of words (a bag of words), the proposed method takes into account the characteristics of the physical structure of the document when extracting MWE. The set of terms pre-processed with an exhaustive algorithmic technique proposed by the authors is compared with the results obtained for thirteen different statistical association measures generated by the Ngram Statistics Package (NSP) software. The experiment was carried out on a corpus of documents in digital format.
publishDate 2012
dc.date.none.fl_str_mv 2012-08-01
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://www.revistas.usp.br/jistem/article/view/45596
10.4301/S1807-17752012000200002
url https://www.revistas.usp.br/jistem/article/view/45596
identifier_str_mv 10.4301/S1807-17752012000200002
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://www.revistas.usp.br/jistem/article/view/45596/49195
dc.rights.driver.fl_str_mv Copyright (c) 2018 JISTEM - Journal of Information Systems and Technology Management (Online)
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Copyright (c) 2018 JISTEM - Journal of Information Systems and Technology Management (Online)
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv TECSI - FEA - Universidade de São Paulo. Faculdade de Economia, Administração, Contabilidade e Atuária
publisher.none.fl_str_mv TECSI - FEA - Universidade de São Paulo. Faculdade de Economia, Administração, Contabilidade e Atuária
dc.source.none.fl_str_mv Journal of Information Systems and Technology Management; v. 9 n. 2 (2012); 213-234
Journal of Information Systems and Technology Management; Vol. 9 No. 2 (2012); 213-234
Journal of Information Systems and Technology Management; Vol. 9 Núm. 2 (2012); 213-234
1807-1775
reponame:Journal of Information Systems and Technology Management (Online)
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Journal of Information Systems and Technology Management (Online)
collection Journal of Information Systems and Technology Management (Online)
repository.name.fl_str_mv Journal of Information Systems and Technology Management (Online) - Universidade de São Paulo (USP)
repository.mail.fl_str_mv jistem@usp.br
_version_ 1800222952454619136