GRASPm: an efficient algorithm for exact pattern-matching in genomic sequences
Main author: | Deusdado, Sérgio |
---|---|
Publication date: | 2009 |
Other authors: | Carvalho, Paulo |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10198/2259 |
Abstract: | In this paper, we propose the Genomic-oriented Rapid Algorithm for String Pattern-match (GRASPm), an algorithm centred on the analysis of overlapped 2-grams, which introduces a novel filtering heuristic, the compatibility rule, that achieves a significant efficiency gain. GRASPm’s foundations rely especially on a wide searching window that uses the central duplet as a reference for fast filtering of multiple alignments. Superfluous detailed verifications are then avoided by filtering out incompatible alignments using the idcd (involving duplet of central duplet) concept combined with pre-processed conditions, which allows fast parallel testing of multiple alignments. Comparative performance analysis on diverse genomic data shows that GRASPm is faster than its competitors. |
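
The filtering idea in the abstract can be illustrated with a much simpler scheme. The sketch below is not the published GRASPm: it omits the central-duplet searching window, the idcd concept and the compatibility rule, none of which are specified in this record. It only shows the generic principle of using pre-processed 2-gram (duplet) positions to discard most alignments before a detailed verification. The function name `duplet_filter_search` and the sampling step of `m - 1` are assumptions made for this illustration.

```python
from collections import defaultdict


def duplet_filter_search(text: str, pattern: str) -> list[int]:
    """Exact matching with a simple 2-gram (duplet) sampling filter.

    Illustrative only: a generic q-gram filter, not the GRASPm algorithm.
    """
    n, m = len(text), len(pattern)
    if m < 2 or m > n:
        # Patterns shorter than one duplet (or longer than the text)
        # fall back to a plain scan; empty patterns yield no matches.
        if 0 < m <= n:
            return [i for i in range(n - m + 1) if text.startswith(pattern, i)]
        return []

    # Pre-processing: offsets at which each duplet occurs in the pattern.
    offsets = defaultdict(list)
    for p in range(m - 1):
        offsets[pattern[p:p + 2]].append(p)

    matches = []
    # Sample one text duplet every m - 1 positions. Any occurrence of the
    # pattern spans m - 1 consecutive duplet start positions, so it
    # contains exactly one sampled duplet.
    for j in range(0, n - 1, m - 1):
        for p in offsets.get(text[j:j + 2], ()):
            s = j - p  # candidate alignment implied by this duplet
            # Detailed verification only for alignments that pass the filter.
            if 0 <= s <= n - m and text.startswith(pattern, s):
                matches.append(s)
    return sorted(matches)


if __name__ == "__main__":
    print(duplet_filter_search("ACGTACGTTACG", "ACG"))  # [0, 4, 9]
```

With a four-letter DNA alphabet there are only 16 possible duplets, so a single sampled duplet usually rules out most of the alignments it could belong to, and the full comparison runs only on the few that remain.
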
id |
RCAP_d8b0572e3c0db6014140a9257143e264 |
---|---|
oai_identifier_str |
oai:bibliotecadigital.ipb.pt:10198/2259 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
GRASPm: an efficient algorithm for exact pattern-matching in genomic sequences |
author |
Deusdado, Sérgio |
author_facet |
Deusdado, Sérgio; Carvalho, Paulo |
author_role |
author |
author2 |
Carvalho, Paulo |
author2_role |
author |
dc.contributor.none.fl_str_mv |
Biblioteca Digital do IPB |
dc.contributor.author.fl_str_mv |
Deusdado, Sérgio; Carvalho, Paulo |
dc.subject.por.fl_str_mv |
Pattern-matching; Sequence searching and analysis; Motif discovery |
topic |
Pattern-matching; Sequence searching and analysis; Motif discovery |
publishDate |
2009 |
dc.date.none.fl_str_mv |
2009; 2009-01-01T00:00:00Z; 2010-04-29T08:42:35Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10198/2259 |
url |
http://hdl.handle.net/10198/2259 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
International Journal of Bioinformatics Research and Applications. ISSN 1744-5485. 5:4 (2009) p. 385-401; 1744-5485; DOI: 10.1504/IJBRA.2009.027510 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Inderscience Publishers |
publisher.none.fl_str_mv |
Inderscience Publishers |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799135160529059840 |