Dealing with repeated objects in SNNagg
Main author: | Galvão, João Rui Magalhães Velho da Cunha |
---|---|
Publication date: | 2016 |
Other authors: | Santos, Maribel Yasmina; Pires, João Moura; Costa, Carlos |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/1822/42342 |
Abstract: | Due to constant technological advances and the massive use of electronic devices, the amount of data generated has increased at a very high rate, leading to an urgent need to process larger amounts of data in less time. To handle these large amounts of data, several techniques and algorithms have been developed in the area of knowledge discovery in databases, a process that consists of several stages, including data mining, which analyzes vast amounts of data to identify patterns, models, or trends. Among the several data mining techniques, this work focuses on clustering spatial data with a density-based approach that uses the Shared Nearest Neighbor (SNN) algorithm. SNN has shown several advantages when analyzing this type of data: it identifies clusters of different sizes, shapes, and densities, and it also deals with noise. This paper presents and evaluates a new extension of SNN that is able to deal with repeated objects by creating aggregates that reduce the processing time required to cluster a given dataset, since repeated objects are excluded from the most time-demanding step, the identification of the k-nearest neighbors of a point. The proposed approach, SNNagg, was evaluated, and the results show that the processing time is reduced without compromising the quality of the obtained clusters. |
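
The aggregation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`aggregate`, `k_nearest`, `shared_neighbours`), the brute-force neighbor search, and the sample points are all assumptions made for illustration, and the abstract does not say how SNNagg uses the aggregate counts in later steps. The sketch only shows repeated objects being collapsed so that the k-nearest-neighbor search runs over unique points.

```python
from collections import Counter
from math import dist


def aggregate(points):
    """Collapse repeated points into (point, count) pairs, in first-seen order."""
    # Counter is a dict subclass, so it keeps insertion order (Python 3.7+).
    return list(Counter(points).items())


def k_nearest(aggregates, k):
    """Brute-force k-nearest neighbours, computed over the unique points only.

    Returns a dict mapping each aggregate index to the indices of its
    k closest aggregates (hypothetical helper, not from the paper).
    """
    unique = [point for point, _ in aggregates]
    neighbours = {}
    for i, p in enumerate(unique):
        candidates = [j for j in range(len(unique)) if j != i]
        candidates.sort(key=lambda j: dist(p, unique[j]))
        neighbours[i] = candidates[:k]
    return neighbours


def shared_neighbours(neighbours, i, j):
    """Simplified SNN similarity: size of the overlap of two neighbour lists."""
    return len(set(neighbours[i]) & set(neighbours[j]))


# Illustrative data (an assumption): eight objects, three of them repeated.
points = [(0, 0), (0, 0), (0, 0), (1, 0), (0, 1), (5, 5), (5, 5), (6, 5)]

aggregates = aggregate(points)
neighbours = k_nearest(aggregates, k=2)

print(f"{len(points)} objects -> {len(aggregates)} aggregates")
```

With the eight sample objects, the neighbor search runs over five aggregates instead of eight objects, and the per-aggregate counts remain available for any later step that needs them.
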
id |
RCAP_a92abd5e913e2d9fd2cade0f357df8c3 |
---|---|
oai_identifier_str |
oai:repositorium.sdum.uminho.pt:1822/42342 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
funding |
This work has been supported by FCT, Fundação para a Ciência e Tecnologia, within the Project Scope UID/CEC/00319/2013. |
author |
Galvão, João Rui Magalhães Velho da Cunha |
author2 |
Santos, Maribel Yasmina; Pires, João Moura; Costa, Carlos |
dc.contributor.none.fl_str_mv |
Universidade do Minho |
dc.subject.por.fl_str_mv |
Spatial Data; Spatio-Temporal Data; Clustering; SNN; Density-based Clustering; Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática |
dc.date.none.fl_str_mv |
2016-02-16 2016-02-16T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/1822/42342 |
dc.language.iso.fl_str_mv |
eng |
dc.relation.none.fl_str_mv |
Joao Galvão, Maribel Yasmina Santos, Joao Moura Pires, and Carlos Costa, "Dealing with Repeated Objects in SNNagg", IAENG International Journal of Computer Science, vol. 43, no. 1, pp. 115-125, 2016. ISSN: 1819656X. |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
IAENG |
dc.source.none.fl_str_mv |
reponame: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname: Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron: RCAAP |