Biological sequences as pictures - a generic two dimensional solution for iterated maps
Main author: | Almeida, J. S. |
---|---|
Publication date: | 2009 |
Other authors: | Vinga, Susana |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10362/24225 |
Abstract: | Background: Representing symbolic sequences graphically using iterated maps has enjoyed enduring popularity since it was first proposed in Jeffrey 1990 as chaos game representation (CGR). The usefulness of this representation goes beyond the convenience of a scale-independent representation: it provides a variable-memory-length representation of transitions. This includes the representation of succession with non-integer order, which comes with the promise of generalizing Markovian formalisms. The original proposal targeted genomic sequences only, but several generalizations have since been proposed, many specifically designed to handle protein data. Results: The challenge of a general solution is that of deriving a bijective transformation of symbolic sequences into two-dimensional planes. More specifically, it requires the regular fractal nesting of polygons. A first attempt at a general solution was proposed by Fiser 1994, using non-overlapping circles that contain the polygons. This was used as a starting point to identify a more efficient solution in which the encapsulating circles can overlap while the sequence maps, which are confined to fractal polygon domains, do not. Conclusion: We identified the optimal inscribed packing solution for iterated maps of any biological sequence, indeed of any symbolic sequence. The new solution maintains the prized bijective mapping property and includes the Sierpinski triangle and the CGR square as particular solutions of the more encompassing formulation. |
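
The bijective property mentioned in the abstract can be illustrated with the classic four-symbol CGR square, which the abstract names as a particular case of the general formulation. The following is a minimal sketch, not the authors' generalized polygon solution: the corner layout in `CORNERS` and the function names `cgr_encode` and `cgr_decode` are assumptions made for illustration. Exact fractions are used so that the inverse map can be checked without floating-point rounding.

```python
from fractions import Fraction

# Assumed corner layout on the unit square; the record does not specify one.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_encode(seq):
    """Return the CGR point reached after reading seq, starting at the centre."""
    x = y = Fraction(1, 2)
    for symbol in seq:
        cx, cy = CORNERS[symbol]
        # Iterated map: move halfway towards the corner of the current symbol.
        x = (x + cx) / 2
        y = (y + cy) / 2
    return x, y


def cgr_decode(point, length):
    """Recover the last `length` symbols from a CGR point (inverse of the map)."""
    x, y = point
    out = []
    for _ in range(length):
        # The quadrant holding the point identifies the most recent symbol.
        cx, cy = int(x > Fraction(1, 2)), int(y > Fraction(1, 2))
        out.append(next(s for s, c in CORNERS.items() if c == (cx, cy)))
        # Undo one halving step to expose the previous point.
        x, y = 2 * x - cx, 2 * y - cy
    return "".join(reversed(out))
```

For example, `cgr_decode(cgr_encode("GATTACA"), 7)` returns the original string, which is the sense in which the map is invertible for this four-symbol case.
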
id | RCAP_639fc7eb1d43828943b05143001540fa |
---|---|
oai_identifier_str | oai:run.unl.pt:10362/24225 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str | 7160 |
dc.title.none.fl_str_mv | Biological sequences as pictures - a generic two dimensional solution for iterated maps |
author | Almeida, J. S. |
author_facet | Almeida, J. S.; Vinga, Susana |
author_role | author |
author2 | Vinga, Susana |
author2_role | author |
dc.contributor.none.fl_str_mv | NOVA Medical School|Faculdade de Ciências Médicas (NMS|FCM); RUN |
dc.contributor.author.fl_str_mv | Almeida, J. S.; Vinga, Susana |
dc.subject.por.fl_str_mv | ENTROPIC PROFILES; CHAOS GAME REPRESENTATION; DNA-SEQUENCES; CLASSIFICATION |
topic | ENTROPIC PROFILES; CHAOS GAME REPRESENTATION; DNA-SEQUENCES; CLASSIFICATION |
publishDate | 2009 |
dc.date.none.fl_str_mv | 2009-03-31; 2009-03-31T00:00:00Z; 2017-10-16T22:01:08Z |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://hdl.handle.net/10362/24225 |
url | http://hdl.handle.net/10362/24225 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | 1471-2105; PURE: 294102; https://doi.org/10.1186/1471-2105-10-100 |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | 8; application/pdf |
dc.source.none.fl_str_mv | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
institution | RCAAP |
reponame_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv | |
_version_ | 1799137906697175040 |