Query driven sequence pattern mining
Main author: | Azevedo, Paulo J. |
---|---|
Publication date: | 2006 |
Other authors: | Ferreira, Pedro Gabriel |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/1822/6588 |
Abstract: | The discovery of frequent patterns in biological sequences has many applications, ranging from classification and clustering to understanding sequence structure and function. This paper presents an algorithm that discovers frequent sequence patterns (motifs) present in a query sequence with respect to a database of sequences. The query guides the mining process, so only patterns present in the query are reported. Two main types of patterns can be identified: flexible and rigid gap patterns. The user can choose to report all patterns or only maximal ones. Constraints and substitution sets are pushed directly into the mining process. Experimental evaluation shows the efficiency of the algorithm and the usefulness and relevance of the extracted patterns. |
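
As a rough illustration of the query-driven idea described in the abstract, the Python sketch below draws candidate patterns from the query only and keeps those that occur in at least a minimum number of database sequences. It is not the authors' algorithm: the function names (`support`, `candidates`, `mine`), the `max_len` bound, the use of `.` for a one-symbol rigid gap, and the brute-force counting are all assumptions made for illustration. Flexible gaps, maximality filtering, constraints, and substitution sets are left out.

```python
import re
from typing import Dict, List, Set


def support(pattern: str, database: List[str]) -> int:
    """Count the database sequences that contain the pattern.

    A '.' in the pattern is treated as a rigid gap matching exactly one symbol
    (an assumption of this sketch); every other symbol is matched literally.
    """
    regex = re.compile("".join("." if c == "." else re.escape(c) for c in pattern))
    return sum(1 for seq in database if regex.search(seq))


def candidates(query: str, max_len: int) -> Set[str]:
    """Enumerate patterns that occur in the query.

    Yields every substring of the query up to max_len symbols, plus variants
    in which one interior symbol is replaced by a rigid gap '.'.
    """
    found: Set[str] = set()
    for start in range(len(query)):
        for end in range(start + 1, min(start + max_len, len(query)) + 1):
            sub = query[start:end]
            found.add(sub)
            # Only interior positions may become gaps, so a pattern
            # never starts or ends with a wildcard.
            for k in range(1, len(sub) - 1):
                found.add(sub[:k] + "." + sub[k + 1:])
    return found


def mine(query: str, database: List[str], min_support: int,
         max_len: int = 4) -> Dict[str, int]:
    """Return the query-derived patterns whose support reaches min_support."""
    frequent: Dict[str, int] = {}
    for pattern in candidates(query, max_len):
        count = support(pattern, database)
        if count >= min_support:
            frequent[pattern] = count
    return frequent


if __name__ == "__main__":
    # Toy protein-like sequences, invented for this example.
    db = ["MKTAY", "AMKSAY", "KTAYY"]
    print(sorted(mine("MKTAY", db, min_support=3, max_len=3).items()))
```

Because candidates come only from the query, the search space depends on the query length rather than on every pattern present in the database, which is the point the abstract makes about the query guiding the mining process.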
id | RCAP_64930df986ab02b97dfc3eaf1ba981fd |
---|---|
oai_identifier_str | oai:repositorium.sdum.uminho.pt:1822/6588 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str | 7160 |
spelling | Query driven sequence pattern mining; Bioinformatics; Databases; The discovery of frequent patterns in biological sequences has many applications, ranging from classification and clustering to understanding sequence structure and function. This paper presents an algorithm that discovers frequent sequence patterns (motifs) present in a query sequence with respect to a database of sequences. The query guides the mining process, so only patterns present in the query are reported. Two main types of patterns can be identified: flexible and rigid gap patterns. The user can choose to report all patterns or only maximal ones. Constraints and substitution sets are pushed directly into the mining process. Experimental evaluation shows the efficiency of the algorithm and the usefulness and relevance of the extracted patterns.; Fundação para a Ciência e a Tecnologia (FCT); Universidade do Minho; Azevedo, Paulo J.; Ferreira, Pedro Gabriel; 2006; 2006-01-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; application/pdf; http://hdl.handle.net/1822/6588; eng; SIMPÓSIO BRASILEIRO DE BANCO DE DADOS, 21, Florianópolis, 2006 – “Simpósio Brasileiro de Banco de Dados : Anais”. [S.l. : s.n., 2006].; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2023-07-21T12:29:22Z; oai:repositorium.sdum.uminho.pt:1822/6588; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T19:24:20.912881; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv | Query driven sequence pattern mining |
title | Query driven sequence pattern mining |
spellingShingle | Query driven sequence pattern mining; Azevedo, Paulo J.; Bioinformatics; Databases |
title_short | Query driven sequence pattern mining |
title_full | Query driven sequence pattern mining |
title_fullStr | Query driven sequence pattern mining |
title_full_unstemmed | Query driven sequence pattern mining |
title_sort | Query driven sequence pattern mining |
author | Azevedo, Paulo J. |
author_facet | Azevedo, Paulo J.; Ferreira, Pedro Gabriel |
author_role | author |
author2 | Ferreira, Pedro Gabriel |
author2_role | author |
dc.contributor.none.fl_str_mv | Universidade do Minho |
dc.contributor.author.fl_str_mv | Azevedo, Paulo J.; Ferreira, Pedro Gabriel |
dc.subject.por.fl_str_mv | Bioinformatics; Databases |
topic | Bioinformatics; Databases |
description | The discovery of frequent patterns in biological sequences has many applications, ranging from classification and clustering to understanding sequence structure and function. This paper presents an algorithm that discovers frequent sequence patterns (motifs) present in a query sequence with respect to a database of sequences. The query guides the mining process, so only patterns present in the query are reported. Two main types of patterns can be identified: flexible and rigid gap patterns. The user can choose to report all patterns or only maximal ones. Constraints and substitution sets are pushed directly into the mining process. Experimental evaluation shows the efficiency of the algorithm and the usefulness and relevance of the extracted patterns. |
publishDate | 2006 |
dc.date.none.fl_str_mv | 2006; 2006-01-01T00:00:00Z |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://hdl.handle.net/1822/6588 |
url | http://hdl.handle.net/1822/6588 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | SIMPÓSIO BRASILEIRO DE BANCO DE DADOS, 21, Florianópolis, 2006 – “Simpósio Brasileiro de Banco de Dados : Anais”. [S.l. : s.n., 2006]. |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.source.none.fl_str_mv | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
institution | RCAAP |
reponame_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv | |
_version_ | 1799132722578325504 |