Computational analysis of the SARS-CoV-2 and other viruses based on the Kolmogorov’s complexity and Shannon’s information theories
Autor(a) principal: | |
---|---|
Data de Publicação: | 2020 |
Outros Autores: | , |
Tipo de documento: | Artigo |
Idioma: | eng |
Título da fonte: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Texto Completo: | http://hdl.handle.net/10400.22/19414 |
Resumo: | This paper tackles the information of 133 RNA viruses available in public databases under the light of several mathematical and computational tools. First, the formal concepts of distance metrics, Kolmogorov complexity and Shannon information are recalled. Second, the computational tools available presently for tackling and visualizing patterns embedded in datasets, such as the hierarchical clustering and the multidimensional scaling, are discussed. The synergies of the common application of the mathematical and computational resources are then used for exploring the RNA data, cross-evaluating the normalized compression distance, entropy and Jensen–Shannon divergence, versus representations in two and three dimensions. The results of these different perspectives give extra light in what concerns the relations between the distinct RNA viruses. |
id |
RCAP_3689645d358372f83487a67e2df3739c |
---|---|
oai_identifier_str |
oai:recipp.ipp.pt:10400.22/19414 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Computational analysis of the SARS-CoV-2 and other viruses based on the Kolmogorov’s complexity and Shannon’s information theoriesCOVID-19Kolmogorov complexity theoryShannon information theoryHierarchical clusteringMultidimensional scalingThis paper tackles the information of 133 RNA viruses available in public databases under the light of several mathematical and computational tools. First, the formal concepts of distance metrics, Kolmogorov complexity and Shannon information are recalled. Second, the computational tools available presently for tackling and visualizing patterns embedded in datasets, such as the hierarchical clustering and the multidimensional scaling, are discussed. The synergies of the common application of the mathematical and computational resources are then used for exploring the RNA data, cross-evaluating the normalized compression distance, entropy and Jensen–Shannon divergence, versus representations in two and three dimensions. The results of these different perspectives give extra light in what concerns the relations between the distinct RNA viruses.The authors thank all those who have contributed and shared sequences to the GISAID database (https://www.gisaid.org/). The authors also thank those who have contributed to the GenBank of the National Center for Biotechnology Information (NCBI) databases (https://www.ncbi.nlm.nih.gov/genbank). The authors also thank Rómulo Antão for the help in handling the information with the compressors zlib and bz2.SpringerRepositório Científico do Instituto Politécnico do PortoMachado, J. A. TenreiroRocha-Neves, João M.Andrade, José P.2022-01-12T11:06:08Z20202020-01-01T00:00:00Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articleapplication/pdftext/plain; charset=utf-8http://hdl.handle.net/10400.22/19414eng10.1007/s11071-020-05771-8metadata only accessinfo:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2023-04-19T01:46:55Zoai:recipp.ipp.pt:10400.22/19414Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-19T17:39:16.346373Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse |
dc.title.none.fl_str_mv |
Computational analysis of the SARS-CoV-2 and other viruses based on the Kolmogorov’s complexity and Shannon’s information theories |
title |
Computational analysis of the SARS-CoV-2 and other viruses based on the Kolmogorov’s complexity and Shannon’s information theories |
spellingShingle |
Computational analysis of the SARS-CoV-2 and other viruses based on the Kolmogorov’s complexity and Shannon’s information theories Machado, J. A. Tenreiro COVID-19 Kolmogorov complexity theory Shannon information theory Hierarchical clustering Multidimensional scaling |
title_short |
Computational analysis of the SARS-CoV-2 and other viruses based on the Kolmogorov’s complexity and Shannon’s information theories |
title_full |
Computational analysis of the SARS-CoV-2 and other viruses based on the Kolmogorov’s complexity and Shannon’s information theories |
title_fullStr |
Computational analysis of the SARS-CoV-2 and other viruses based on the Kolmogorov’s complexity and Shannon’s information theories |
title_full_unstemmed |
Computational analysis of the SARS-CoV-2 and other viruses based on the Kolmogorov’s complexity and Shannon’s information theories |
title_sort |
Computational analysis of the SARS-CoV-2 and other viruses based on the Kolmogorov’s complexity and Shannon’s information theories |
author |
Machado, J. A. Tenreiro |
author_facet |
Machado, J. A. Tenreiro Rocha-Neves, João M. Andrade, José P. |
author_role |
author |
author2 |
Rocha-Neves, João M. Andrade, José P. |
author2_role |
author author |
dc.contributor.none.fl_str_mv |
Repositório Científico do Instituto Politécnico do Porto |
dc.contributor.author.fl_str_mv |
Machado, J. A. Tenreiro Rocha-Neves, João M. Andrade, José P. |
dc.subject.por.fl_str_mv |
COVID-19 Kolmogorov complexity theory Shannon information theory Hierarchical clustering Multidimensional scaling |
topic |
COVID-19 Kolmogorov complexity theory Shannon information theory Hierarchical clustering Multidimensional scaling |
description |
This paper tackles the information of 133 RNA viruses available in public databases under the light of several mathematical and computational tools. First, the formal concepts of distance metrics, Kolmogorov complexity and Shannon information are recalled. Second, the computational tools available presently for tackling and visualizing patterns embedded in datasets, such as the hierarchical clustering and the multidimensional scaling, are discussed. The synergies of the common application of the mathematical and computational resources are then used for exploring the RNA data, cross-evaluating the normalized compression distance, entropy and Jensen–Shannon divergence, versus representations in two and three dimensions. The results of these different perspectives give extra light in what concerns the relations between the distinct RNA viruses. |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020 2020-01-01T00:00:00Z 2022-01-12T11:06:08Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.22/19414 |
url |
http://hdl.handle.net/10400.22/19414 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
10.1007/s11071-020-05771-8 |
dc.rights.driver.fl_str_mv |
metadata only access info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
metadata only access |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf text/plain; charset=utf-8 |
dc.publisher.none.fl_str_mv |
Springer |
publisher.none.fl_str_mv |
Springer |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799131480946900992 |