GRASPm: an efficient algorithm for exact pattern-matching in genomic sequences

Bibliographic details
Main author: Deusdado, Sérgio
Publication date: 2009
Other authors: Carvalho, Paulo
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10198/2259
Abstract: In this paper, we propose Genomic-oriented Rapid Algorithm for String Pattern-match (GRASPm), an algorithm centred on overlapped 2-grams analysis, which introduces a novel filtering heuristic – the compatibility rule – achieving significant efficiency gain. GRASPm’s foundations rely especially on a wide searching window having the central duplet as reference for fast filtering of multiple alignments. Subsequently, superfluous detailed verifications are summarily avoided by filtering the incompatible alignments using the idcd (involving duplet of central duplet) concept combined with pre-processed conditions, allowing fast parallel testing for multiple alignments. Comparative performance analysis, using diverse genomic data, shows that GRASPm is faster than its competitors.
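The general idea of using an overlapped 2-gram (duplet) as a filter before detailed verification can be illustrated with a minimal sketch. This is not the authors' GRASPm implementation: it omits the compatibility rule and the idcd test, and the function name `duplet_filter_search` and the sampling scheme are assumptions made purely for illustration. The sketch pre-processes the pattern into a table of duplet offsets, inspects one text duplet every m - 1 positions, and verifies only the alignments that the inspected duplet allows.

```python
from collections import defaultdict


def duplet_filter_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every exact occurrence of pattern in text.

    Illustrative 2-gram filter only; not the published GRASPm algorithm.
    """
    n, m = len(text), len(pattern)
    if m == 0 or n < m:
        return []
    if m == 1:
        # A single character has no duplet, so fall back to a plain scan.
        return [i for i, ch in enumerate(text) if ch == pattern]

    # Pre-processing: for each overlapped 2-gram, the offsets where it
    # occurs inside the pattern.
    offsets = defaultdict(list)
    for j in range(m - 1):
        offsets[pattern[j:j + 2]].append(j)

    hits = []
    # Every occurrence spans m - 1 consecutive duplet positions, so sampling
    # one duplet every m - 1 positions touches each occurrence exactly once.
    for pos in range(m - 2, n - 1, m - 1):
        duplet = text[pos:pos + 2]
        # Only alignments that place this duplet at a matching pattern
        # offset are candidates; all other alignments are skipped.
        for j in offsets.get(duplet, ()):
            start = pos - j
            if start + m <= n and text[start:start + m] == pattern:
                hits.append(start)
    return sorted(hits)


if __name__ == "__main__":
    print(duplet_filter_search("ACGTACGTTACG", "ACG"))
```

On a four-letter DNA alphabet there are only 16 possible duplets, so a single inspected duplet typically rules out most candidate alignments before any full comparison is made; the further filtering the abstract describes is layered on top of this kind of check.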
id RCAP_d8b0572e3c0db6014140a9257143e264
oai_identifier_str oai:bibliotecadigital.ipb.pt:10198/2259
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv GRASPm: an efficient algorithm for exact pattern-matching in genomic sequences
title GRASPm: an efficient algorithm for exact pattern-matching in genomic sequences
author Deusdado, Sérgio
author_facet Deusdado, Sérgio
Carvalho, Paulo
author_role author
author2 Carvalho, Paulo
author2_role author
dc.contributor.none.fl_str_mv Biblioteca Digital do IPB
dc.contributor.author.fl_str_mv Deusdado, Sérgio
Carvalho, Paulo
dc.subject.por.fl_str_mv Pattern-matching
Sequence searching and analysis
Motif discovery
topic Pattern-matching
Sequence searching and analysis
Motif discovery
publishDate 2009
dc.date.none.fl_str_mv 2009
2009-01-01T00:00:00Z
2010-04-29T08:42:35Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10198/2259
url http://hdl.handle.net/10198/2259
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv International Journal of Bioinformatics Research and Applications. ISSN 1744-5485. 5:4 (2009) p. 385-401
1744-5485
DOI: 10.1504/IJBRA.2009.027510
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Inderscience Publishers
publisher.none.fl_str_mv Inderscience Publishers
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799135160529059840