Comparison of cluster methods agglomerative hierarchical in sustainability indicators in municipalities of Pará state

Bibliographic details
Main author: Crispim, Diêgo Lima
Publication date: 2020
Other authors: Fernandes, Lindemberg Lima; Filho, David Figueiredo Ferreira; Lira, Bruna Roberta Pereira
Document type: Article
Language: Portuguese (por)
Source title: Research, Society and Development
Full text: https://rsdjournal.org/index.php/rsd/article/view/2067
Abstract: This study compared the performance of agglomerative hierarchical clustering methods on a data set of sustainability indicators for the municipalities of the state of Pará, and determined the number of initial groups to form by applying validity indexes. The indicators were selected using a checklist of national, regional and local scientific studies on sustainability. Because the indicators are measured in different units and scales, they were then standardized so that these differences would not interfere with the result and each indicator would carry a similar weight in the calculation of the similarity coefficient. The dissimilarity measure was the Euclidean distance, and the agglomerative coefficient (AC) was used to choose the hierarchical clustering method. Validation indexes were used to establish the initial number of groups. The agglomerative method with the best AC was Ward's, at 0.94, indicating the strongest clustering structure among the agglomerative techniques compared. The Davies-Bouldin (DB), Dunn (D) and Silhouette (SIL) validation indexes indicated that the ideal number of initial clusters is 2, whereas the PBM index indicated 4 groups. As for the most homogeneous municipalities, in the 2-group solution the most similar observations were m105 (Salinópolis) and m109 (Santa Izabel do Pará), followed by m102 (Rio Maria) and m144 (Xinguara), all in group 1.
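The workflow summarized in the abstract (standardize the indicators, compute Euclidean distances, compare linkage methods by agglomerative coefficient, then pick the number of groups with a validity index) can be sketched as follows. This is a minimal illustration, not the authors' code: the record does not say which software they used, the toy data and all function names are hypothetical, the agglomerative coefficient follows the usual Kaufman and Rousseeuw definition, and only the Silhouette index is shown (DB, Dunn and PBM are omitted).

```python
import math
from statistics import mean, pstdev

METHODS = ("single", "complete", "average", "ward")

def standardize(rows):
    """Z-score each column so every indicator has mean 0 and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]

def agglomerate(rows, method="ward"):
    """Naive agglomerative clustering with Lance-Williams updates.
    Returns one (members_a, members_b, height) tuple per merge."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    n = len(rows)
    power = 2 if method == "ward" else 1  # Ward updates squared distances
    clusters = {i: [i] for i in range(n)}
    d = {(i, j): math.dist(rows[i], rows[j]) ** power
         for i in range(n) for j in range(i + 1, n)}
    pair = lambda x, y: (x, y) if x < y else (y, x)
    merges, new_id = [], n
    while len(clusters) > 1:
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])  # closest pair
        na, nb = len(clusters[a]), len(clusters[b])
        updated = {}
        for k, members in clusters.items():
            if k in (a, b):
                continue
            dak, dbk, nk = d[pair(a, k)], d[pair(b, k)], len(members)
            if method == "single":
                updated[k] = min(dak, dbk)
            elif method == "complete":
                updated[k] = max(dak, dbk)
            elif method == "average":
                updated[k] = (na * dak + nb * dbk) / (na + nb)
            else:  # ward
                updated[k] = ((na + nk) * dak + (nb + nk) * dbk
                              - nk * dab) / (na + nb + nk)
        d = {p: v for p, v in d.items() if a not in p and b not in p}
        for k, v in updated.items():
            d[(k, new_id)] = v
        merges.append((clusters[a], clusters[b], dab ** (1 / power)))
        clusters[new_id] = clusters.pop(a) + clusters.pop(b)
        new_id += 1
    return merges

def agglomerative_coefficient(merges, n):
    """Mean of 1 - (height of first merge of i) / (height of final merge)."""
    final = merges[-1][2]
    first = [0.0] * n
    for left, right, height in merges:
        for side in (left, right):
            if len(side) == 1:  # a singleton is being merged for the first time
                first[side[0]] = height
    return mean(1 - h / final for h in first)

def cut(merges, n, k):
    """Replay the first n - k merges; return one group label per observation."""
    groups = {frozenset([i]) for i in range(n)}
    for left, right, _ in merges[: n - k]:
        groups -= {frozenset(left), frozenset(right)}
        groups.add(frozenset(left + right))
    labels = [0] * n
    for g, members in enumerate(sorted(groups, key=min)):
        for i in members:
            labels[i] = g
    return labels

def silhouette(rows, labels):
    """Mean silhouette width for k >= 2; singletons score 0 by convention."""
    scores = []
    for i, li in enumerate(labels):
        by_group = {}
        for j, lj in enumerate(labels):
            if j != i:
                by_group.setdefault(lj, []).append(math.dist(rows[i], rows[j]))
        if li not in by_group:  # observation i is alone in its group
            scores.append(0.0)
            continue
        a = mean(by_group[li])
        b = min(mean(v) for g, v in by_group.items() if g != li)
        scores.append((b - a) / max(a, b))
    return mean(scores)

# Hypothetical toy data: 6 "municipalities" x 2 indicators on different scales.
data = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5],
        [8.0, 50.0], [8.3, 52.0], [7.9, 49.0]]
z = standardize(data)
for method in METHODS:
    print(method, round(agglomerative_coefficient(agglomerate(z, method), len(z)), 3))
ward = agglomerate(z, "ward")
for k in (2, 3):
    print(k, round(silhouette(z, cut(ward, len(z), k)), 3))
```

An agglomerative coefficient closer to 1 suggests a more clearly separated hierarchy, which is the criterion the abstract reports for preferring Ward's method. The number of groups with the highest Silhouette value would be the one that index recommends.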
id UNIFEI_1dc2b1000a1c7a7d0d6a7d605429f294
oai_identifier_str oai:ojs.pkp.sfu.ca:article/2067
network_acronym_str UNIFEI
network_name_str Research, Society and Development
repository_id_str
spelling Copyright (c) 2019 Diêgo Lima Crispim, Lindemberg Lima Fernandes, David Figueiredo Ferreira Filho, Bruna Roberta Pereira Lira
2020-08-20T18:08:48Z
https://rsdjournal.org/index.php/rsd/index
https://rsdjournal.org/index.php/rsd/oai
opendoar:2024-01-17T09:26:51.337480
dc.title.none.fl_str_mv Comparison of cluster methods agglomerative hierarchical in sustainability indicators in municipalities of Pará state
Comparación de los métodos de agrupación jerárquica aglomerativa en los indicadores de sostenibilidad de los municipios del estado de Pará
Comparação de métodos de agrupamentos hierárquicos aglomerativos em indicadores de sustentabilidade em municípios do estado do Pará
title Comparison of cluster methods agglomerative hierarchical in sustainability indicators in municipalities of Pará state
author Crispim, Diêgo Lima
author_facet Crispim, Diêgo Lima
Fernandes, Lindemberg Lima
Filho, David Figueiredo Ferreira
Lira, Bruna Roberta Pereira
author_role author
author2 Fernandes, Lindemberg Lima
Filho, David Figueiredo Ferreira
Lira, Bruna Roberta Pereira
author2_role author
author
author
dc.contributor.author.fl_str_mv Crispim, Diêgo Lima
Fernandes, Lindemberg Lima
Filho, David Figueiredo Ferreira
Lira, Bruna Roberta Pereira
dc.subject.por.fl_str_mv Ward's method
Euclidean distance
Validation index
Dendrogram
publishDate 2020
dc.date.none.fl_str_mv 2020-01-01
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://rsdjournal.org/index.php/rsd/article/view/2067
10.33448/rsd-v9i2.2067
url https://rsdjournal.org/index.php/rsd/article/view/2067
identifier_str_mv 10.33448/rsd-v9i2.2067
dc.language.iso.fl_str_mv por
language por
dc.relation.none.fl_str_mv https://rsdjournal.org/index.php/rsd/article/view/2067/1695
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Research, Society and Development
publisher.none.fl_str_mv Research, Society and Development
dc.source.none.fl_str_mv Research, Society and Development; Vol. 9 No. 2; e60922067
2525-3409
reponame:Research, Society and Development
instname:Universidade Federal de Itajubá (UNIFEI)
instacron:UNIFEI
instname_str Universidade Federal de Itajubá (UNIFEI)
instacron_str UNIFEI
institution UNIFEI
reponame_str Research, Society and Development
collection Research, Society and Development
repository.name.fl_str_mv Research, Society and Development - Universidade Federal de Itajubá (UNIFEI)
repository.mail.fl_str_mv rsd.articles@gmail.com
_version_ 1797052645439963136