Searching a database based web site
Autor(a) principal: | |
---|---|
Data de Publicação: | 2003 |
Outros Autores: | |
Tipo de documento: | Livro |
Idioma: | por |
Título da fonte: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Texto Completo: | https://hdl.handle.net/10216/615 |
Resumo: | Currently, information systems are usually supported by databases (DB) and accessed through a Web interface. Pages in such Web sites are not drawn from HTML files but are generated on the fly upon request. Indexing and searching such dynamic pages raises several extra difficulties not solved by most search engines, which were designed for static contents. In this paper we describe the development of a search engine that overcomes most of the problems for a specific Web site, how the limitations put to indexing dynamic Web pages were circumvented, and an evaluation of the results obtained. The solution involves using a locally developed crawler, the Oracle Text full text indexer, and meta-information automatically drawn from the DB or manually added to improve the relevance factor calculation. It has the advantage of uniformly covering the dynamic pages and the static Web pages of the site. |
id |
RCAP_fbdfce54469acea0b640f378e9b66e8e |
---|---|
oai_identifier_str |
oai:repositorio-aberto.up.pt:10216/615 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Searching a database based web siteCurrently, information systems are usually supported by databases (DB) and accessed through a Web interface. Pages in such Web sites are not drawn from HTML files but are generated on the fly upon request. Indexing and searching such dynamic pages raises several extra difficulties not solved by most search engines, which were designed for static contents. In this paper we describe the development of a search engine that overcomes most of the problems for a specific Web site, how the limitations put to indexing dynamic Web pages were circumvented, and an evaluation of the results obtained. The solution involves using a locally developed crawler, the Oracle Text full text indexer, and meta-information automatically drawn from the DB or manually added to improve the relevance factor calculation. It has the advantage of uniformly covering the dynamic pages and the static Web pages of the site.20032003-01-01T00:00:00Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/bookapplication/pdfhttps://hdl.handle.net/10216/615porFilipe SilvaGabriel Davidinfo:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2023-11-29T15:42:26Zoai:repositorio-aberto.up.pt:10216/615Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-20T00:30:07.149011Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse |
dc.title.none.fl_str_mv |
Searching a database based web site |
title |
Searching a database based web site |
spellingShingle |
Searching a database based web site Filipe Silva |
title_short |
Searching a database based web site |
title_full |
Searching a database based web site |
title_fullStr |
Searching a database based web site |
title_full_unstemmed |
Searching a database based web site |
title_sort |
Searching a database based web site |
author |
Filipe Silva |
author_facet |
Filipe Silva Gabriel David |
author_role |
author |
author2 |
Gabriel David |
author2_role |
author |
dc.contributor.author.fl_str_mv |
Filipe Silva Gabriel David |
description |
Currently, information systems are usually supported by databases (DB) and accessed through a Web interface. Pages in such Web sites are not drawn from HTML files but are generated on the fly upon request. Indexing and searching such dynamic pages raises several extra difficulties not solved by most search engines, which were designed for static contents. In this paper we describe the development of a search engine that overcomes most of the problems for a specific Web site, how the limitations put to indexing dynamic Web pages were circumvented, and an evaluation of the results obtained. The solution involves using a locally developed crawler, the Oracle Text full text indexer, and meta-information automatically drawn from the DB or manually added to improve the relevance factor calculation. It has the advantage of uniformly covering the dynamic pages and the static Web pages of the site. |
publishDate |
2003 |
dc.date.none.fl_str_mv |
2003 2003-01-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/book |
format |
book |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10216/615 |
url |
https://hdl.handle.net/10216/615 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799136211354255360 |