Concentration of inverted repeats along human DNA

Bibliographic details
Main author: Bastos, Carlos A. C.
Publication date: 2023
Other authors: Afreixo, Vera; Rodrigues, João M. O. S.; Pinho, Armando J.
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10773/39820
Abstract: This work aims to describe the observed enrichment of inverted repeats in the human genome, and to identify and describe, with detailed length profiles, the regions with significant and relevant enrichment of inverted repeats. The enrichment is assessed and tested with a recently proposed z-score-based measure. We simulate a genome using an order-7 Markov model trained on the real genome. The simulated genome is used to establish the critical values that serve as decision thresholds for identifying regions with significantly enriched concentrations. Several regions of the human genome are highly enriched in inverted repeats, and this is observed in all human chromosomes. The distribution of inverted repeat lengths varies along the genome. Most of the regions with extreme enrichment contain mainly short inverted repeats. There are also regions with regular peaks along the inverted repeat length distribution (periodic regularities) and, less frequently, regions with extreme enrichment for long lengths. However, adjacent regions tend to have similar distributions.
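The simulation step described in the abstract can be illustrated with a minimal sketch: an order-k Markov model is trained by counting which symbol follows each k-symbol context, a synthetic sequence is drawn from those counts, and a threshold is read off scores computed on the synthetic data. This is not the authors' implementation. The function names (`train_markov`, `simulate`, `critical_value`), the uniform fallback for unseen contexts, and the use of an empirical quantile as the critical value are all assumptions made for illustration; the z-score-based measure itself is not reproduced here.

```python
import random
from collections import Counter, defaultdict


def train_markov(seq, k):
    """Count, for every k-symbol context in seq, how often each symbol follows it."""
    counts = defaultdict(Counter)
    for i in range(len(seq) - k):
        counts[seq[i:i + k]][seq[i + k]] += 1
    return counts


def simulate(counts, seed_context, length, rng, alphabet="ACGT"):
    """Generate a sequence whose symbols are drawn from the trained context counts.

    seed_context must have the same length k (k >= 1) used in training.
    """
    k = len(seed_context)
    out = list(seed_context)
    while len(out) < length:
        followers = counts.get("".join(out[-k:]))
        if followers:
            symbols = list(followers)
            weights = [followers[s] for s in symbols]
            out.append(rng.choices(symbols, weights=weights)[0])
        else:
            # Context never seen in training: uniform draw (assumption of this sketch).
            out.append(rng.choice(alphabet))
    return "".join(out[:length])


def critical_value(simulated_scores, quantile=0.99):
    """Empirical quantile of scores from the simulated sequence (hypothetical threshold rule)."""
    ordered = sorted(simulated_scores)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]
```

With k = 7, as in the abstract, `train_markov(real_sequence, 7)` followed by `simulate(counts, real_sequence[:7], len(real_sequence), random.Random(0))` would give a synthetic sequence of the same length; `real_sequence` is a placeholder for data the record does not include.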
id RCAP_50853a3c940bb179e31b260a2e37fc66
oai_identifier_str oai:ria.ua.pt:10773/39820
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Concentration of inverted repeats along human DNA
title Concentration of inverted repeats along human DNA
author Bastos, Carlos A. C.
author_facet Bastos, Carlos A. C.
Afreixo, Vera
Rodrigues, João M. O. S.
Pinho, Armando J.
author_role author
author2 Afreixo, Vera
Rodrigues, João M. O. S.
Pinho, Armando J.
author2_role author
author
author
dc.contributor.author.fl_str_mv Bastos, Carlos A. C.
Afreixo, Vera
Rodrigues, João M. O. S.
Pinho, Armando J.
dc.subject.por.fl_str_mv Distance distribution
Human genome
Inverted repeats
Markov model
topic Distance distribution
Human genome
Inverted repeats
Markov model
publishDate 2023
dc.date.none.fl_str_mv 2023-12-15T10:48:15Z
2023-01-01T00:00:00Z
2023
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10773/39820
url http://hdl.handle.net/10773/39820
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 10.1515/jib-2022-0052
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv De Gruyter
publisher.none.fl_str_mv De Gruyter
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799137718912942080