Distributed clustering of ubiquitous data streams

Detalhes bibliográficos
Autor(a) principal: Pedro Pereira Rodrigues
Data de Publicação: 2014
Outros Autores: João Gama
Tipo de documento: Artigo
Idioma: eng
Título da fonte: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Texto Completo: http://repositorio.inesctec.pt/handle/123456789/3560
http://dx.doi.org/10.1002/widm.1109
Resumo: Nowadays information is generated and gathered from distributed streaming data sources, stressing communications and computing infrastructure, making it hard to transmit, compute, and store. Knowledge discovery from ubiquitous data streams has become a major goal for all sorts of applications, mostly based on unsupervised techniques such as clustering. Two subproblems exist: clustering streaming data observations and clustering streaming data sources. The former searches for dense regions of the data space, identifying hot spots where data sources tend to produce data, while the latter finds groups of sources that behave similarly over time. In order to better assess the current status of this topic, this article presents a thorough review on distributed algorithms addressing either of the subproblems. We characterize clustering algorithms for ubiquitous data streams, discussing advantages and disadvantages of distributed procedures. Overall, distributed stream clustering methods improve communication ratios, processing speed, and resources consumption, while achieving similar clustering validity as the centralized counterparts. (C) 2013 John Wiley & Sons, Ltd.
id RCAP_f38ceca81aecd41d1b6a21e1dc2d81e9
oai_identifier_str oai:repositorio.inesctec.pt:123456789/3560
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Distributed clustering of ubiquitous data streamsNowadays information is generated and gathered from distributed streaming data sources, stressing communications and computing infrastructure, making it hard to transmit, compute, and store. Knowledge discovery from ubiquitous data streams has become a major goal for all sorts of applications, mostly based on unsupervised techniques such as clustering. Two subproblems exist: clustering streaming data observations and clustering streaming data sources. The former searches for dense regions of the data space, identifying hot spots where data sources tend to produce data, while the latter finds groups of sources that behave similarly over time. In order to better assess the current status of this topic, this article presents a thorough review on distributed algorithms addressing either of the subproblems. We characterize clustering algorithms for ubiquitous data streams, discussing advantages and disadvantages of distributed procedures. Overall, distributed stream clustering methods improve communication ratios, processing speed, and resources consumption, while achieving similar clustering validity as the centralized counterparts. (C) 2013 John Wiley & Sons, Ltd.2017-11-20T10:40:44Z2014-01-01T00:00:00Z2014info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articleapplication/pdfhttp://repositorio.inesctec.pt/handle/123456789/3560http://dx.doi.org/10.1002/widm.1109engPedro Pereira RodriguesJoão Gamainfo:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2023-05-15T10:19:58Zoai:repositorio.inesctec.pt:123456789/3560Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-19T17:52:30.149310Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse
dc.title.none.fl_str_mv Distributed clustering of ubiquitous data streams
title Distributed clustering of ubiquitous data streams
spellingShingle Distributed clustering of ubiquitous data streams
Pedro Pereira Rodrigues
title_short Distributed clustering of ubiquitous data streams
title_full Distributed clustering of ubiquitous data streams
title_fullStr Distributed clustering of ubiquitous data streams
title_full_unstemmed Distributed clustering of ubiquitous data streams
title_sort Distributed clustering of ubiquitous data streams
author Pedro Pereira Rodrigues
author_facet Pedro Pereira Rodrigues
João Gama
author_role author
author2 João Gama
author2_role author
dc.contributor.author.fl_str_mv Pedro Pereira Rodrigues
João Gama
description Nowadays information is generated and gathered from distributed streaming data sources, stressing communications and computing infrastructure, making it hard to transmit, compute, and store. Knowledge discovery from ubiquitous data streams has become a major goal for all sorts of applications, mostly based on unsupervised techniques such as clustering. Two subproblems exist: clustering streaming data observations and clustering streaming data sources. The former searches for dense regions of the data space, identifying hot spots where data sources tend to produce data, while the latter finds groups of sources that behave similarly over time. In order to better assess the current status of this topic, this article presents a thorough review on distributed algorithms addressing either of the subproblems. We characterize clustering algorithms for ubiquitous data streams, discussing advantages and disadvantages of distributed procedures. Overall, distributed stream clustering methods improve communication ratios, processing speed, and resources consumption, while achieving similar clustering validity as the centralized counterparts. (C) 2013 John Wiley & Sons, Ltd.
publishDate 2014
dc.date.none.fl_str_mv 2014-01-01T00:00:00Z
2014
2017-11-20T10:40:44Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://repositorio.inesctec.pt/handle/123456789/3560
http://dx.doi.org/10.1002/widm.1109
url http://repositorio.inesctec.pt/handle/123456789/3560
http://dx.doi.org/10.1002/widm.1109
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799131601333911553