Biological sequences as pictures - a generic two dimensional solution for iterated maps

Bibliographic details
Main author: Almeida, J. S.
Publication date: 2009
Other authors: Vinga, Susana
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10362/24225
Abstract: Background: Representing symbolic sequences graphically using iterated maps has enjoyed enduring popularity since Jeffrey (1990) first proposed it as the chaos game representation (CGR). The usefulness of this representation goes beyond the convenience of being scale-independent: it also represents transitions with variable memory length. This includes the representation of succession with non-integer order, which comes with the promise of generalizing Markovian formalisms. The original proposal targeted genomic sequences only, but several generalizations have been proposed since then, many specifically designed to handle protein data. Results: The challenge of a general solution is that of deriving a bijective transformation of symbolic sequences into the two-dimensional plane. More specifically, it requires the regular fractal nesting of polygons. A first attempt at a general solution was proposed by Fiser (1994), using non-overlapping circles that contain the polygons. This was used as a starting point to identify a more efficient solution in which the encapsulating circles may overlap while the sequence maps themselves, which are confined to fractal polygon domains, do not. Conclusion: We identified the optimal inscribed packing solution for iterated maps of any biological sequence, and indeed of any symbolic sequence. The new solution maintains the prized bijective mapping property and includes the Sierpinski triangle and the CGR square as particular cases of the more encompassing formulation.
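The bijective mapping property mentioned in the abstract can be illustrated with the classic square CGR for DNA, in which each new point lies halfway between the previous point and the corner assigned to the current symbol. The sketch below is only that special case, not the generalized polygon solution of the paper; the corner assignment and the function names cgr_encode and cgr_decode are assumptions made for illustration.

```python
from fractions import Fraction

# Assumed corner assignment for the unit square (conventions vary).
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}
HALF = Fraction(1, 2)


def cgr_encode(seq):
    """Return the CGR trajectory of seq, starting from the square's centre.

    Each point is the midpoint between the previous point and the corner
    of the current symbol. Exact fractions avoid rounding errors.
    """
    x, y = HALF, HALF
    points = []
    for symbol in seq:
        cx, cy = CORNERS[symbol]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_decode(point, length):
    """Recover the last `length` symbols from a single CGR point.

    The quadrant holding the point identifies the most recent symbol;
    undoing the half-step yields the previous point, and so on.
    """
    symbol_at = {corner: symbol for symbol, corner in CORNERS.items()}
    x, y = point
    symbols = []
    for _ in range(length):
        cx, cy = int(x > HALF), int(y > HALF)
        symbols.append(symbol_at[(cx, cy)])
        x, y = 2 * x - cx, 2 * y - cy
    return "".join(reversed(symbols))


if __name__ == "__main__":
    sequence = "GATTACA"
    trajectory = cgr_encode(sequence)
    print(trajectory[-1])
    print(cgr_decode(trajectory[-1], len(sequence)))
```

Because the final point alone is enough to recover the whole sequence, the map loses no information. With three symbols placed on a triangle and the same half-step rule, the points fall on the Sierpinski triangle that the abstract names as another particular case.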
id RCAP_639fc7eb1d43828943b05143001540fa
oai_identifier_str oai:run.unl.pt:10362/24225
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Biological sequences as pictures - a generic two dimensional solution for iterated maps
author Almeida, J. S.
author_facet Almeida, J. S.
Vinga, Susana
author_role author
author2 Vinga, Susana
author2_role author
dc.contributor.none.fl_str_mv NOVA Medical School|Faculdade de Ciências Médicas (NMS|FCM)
RUN
dc.contributor.author.fl_str_mv Almeida, J. S.
Vinga, Susana
dc.subject.por.fl_str_mv ENTROPIC PROFILES
CHAOS GAME REPRESENTATION
DNA-SEQUENCES
CLASSIFICATION
publishDate 2009
dc.date.none.fl_str_mv 2009-03-31
2009-03-31T00:00:00Z
2017-10-16T22:01:08Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/24225
url http://hdl.handle.net/10362/24225
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 1471-2105
PURE: 294102
https://doi.org/10.1186/1471-2105-10-100
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 8
application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação