A model for clustering data from heterogeneous dissimilarities

Santi, Éverton; Aloise, Daniel; Blanchard, Simon J.

Bibliographic details
Main author:	Santi, Éverton
Publication date:	2016
Other authors:	Aloise, Daniel; Blanchard, Simon J.
Document type:	Article
Language:	English
Source title:	Repositório Institucional da UFRN
Full text:	https://repositorio.ufrn.br/handle/123456789/30633
Abstract:	Clustering algorithms partition a set of n objects into p groups (called clusters) such that objects assigned to the same group are homogeneous according to some criterion. To derive these clusters, the required input is often a single n × n dissimilarity matrix. Yet in many applications more than one dissimilarity matrix is available, and so, to conform to model requirements, it is common practice to aggregate the matrices (e.g., by summing or averaging them). This aggregation results in clustering solutions that mask the true nature of the original data. In this paper we introduce a clustering model that, to handle the heterogeneity, uses all available dissimilarity matrices and identifies groups of individuals who cluster the objects in a similar way. The model is a nonconvex problem and is difficult to solve exactly; we therefore introduce a Variable Neighborhood Search heuristic to provide solutions efficiently. Computational experiments and an empirical application to the perception of chocolate candy show that the heuristic is efficient and that the proposed model is well suited to recovering heterogeneous data. Implications for clustering researchers are discussed.

Item metadata

id	UFRN_bbcc22f06736420702c0c201474c90fc
oai_identifier_str	oai:https://repositorio.ufrn.br:123456789/30633
network_acronym_str	UFRN
network_name_str	Repositório Institucional da UFRN
repository_id_str
dc.title.pt_BR.fl_str_mv	A model for clustering data from heterogeneous dissimilarities
author	Santi, Éverton
author_facet	Santi, Éverton; Aloise, Daniel; Blanchard, Simon J.
author_role	author
author2	Aloise, Daniel; Blanchard, Simon J.
author2_role	author; author
dc.contributor.author.fl_str_mv	Santi, Éverton; Aloise, Daniel; Blanchard, Simon J.
dc.subject.por.fl_str_mv	Heterogeneity; Heuristics; Data mining; Clustering; Optimization
topic	Heterogeneity; Heuristics; Data mining; Clustering; Optimization
publishDate	2016
dc.date.issued.fl_str_mv	2016-09-16
dc.date.accessioned.fl_str_mv	2020-11-23T15:27:39Z
dc.date.available.fl_str_mv	2020-11-23T15:27:39Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	SANTI, Éverton; ALOISE, Daniel; BLANCHARD, Simon J. A model for clustering data from heterogeneous dissimilarities. European Journal of Operational Research, [S.L.], v. 253, n. 3, p. 659-672, Sept. 2016. Available at: https://www.sciencedirect.com/science/article/abs/pii/S0377221716301618?via%3Dihub. Accessed: 8 Sept. 2020. http://dx.doi.org/10.1016/j.ejor.2016.03.033.
dc.identifier.uri.fl_str_mv	https://repositorio.ufrn.br/handle/123456789/30633
dc.identifier.issn.none.fl_str_mv	0377-2217
dc.identifier.doi.none.fl_str_mv	10.1016/j.ejor.2016.03.033
url	https://repositorio.ufrn.br/handle/123456789/30633
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	Elsevier
publisher.none.fl_str_mv	Elsevier
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFRN instname:Universidade Federal do Rio Grande do Norte (UFRN) instacron:UFRN
instname_str	Universidade Federal do Rio Grande do Norte (UFRN)
instacron_str	UFRN
institution	UFRN
reponame_str	Repositório Institucional da UFRN
collection	Repositório Institucional da UFRN
bitstream.url.fl_str_mv	https://repositorio.ufrn.br/bitstream/123456789/30633/2/license_rdf https://repositorio.ufrn.br/bitstream/123456789/30633/3/license.txt https://repositorio.ufrn.br/bitstream/123456789/30633/4/ModelForClusteringData_2016.pdf.txt https://repositorio.ufrn.br/bitstream/123456789/30633/5/ModelForClusteringData_2016.pdf.jpg
bitstream.checksum.fl_str_mv	4d2950bda3d176f570a9f8b328dfbbef e9597aa2854d128fd968be5edc8a28d9 22ce5fa407e1c161abc961538ed3c77e cdfe60ab42036b294a44f9b9c371ca5b
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Repositório Institucional da UFRN - Universidade Federal do Rio Grande do Norte (UFRN)
repository.mail.fl_str_mv
_version_	1802117784933498880
