Comparison of cluster methods agglomerative hierarchical in sustainability indicators in municipalities of Pará state
Main author: | Crispim, Diêgo Lima |
---|---|
Publication date: | 2020 |
Other authors: | Fernandes, Lindemberg Lima; Filho, David Figueiredo Ferreira; Lira, Bruna Roberta Pereira |
Document type: | Article |
Language: | por |
Source title: | Research, Society and Development |
Full text: | https://rsdjournal.org/index.php/rsd/article/view/2067 |
Abstract: | This study compared the performance of hierarchical agglomerative clustering methods on a data set of sustainability indicators for the municipalities of the state of Pará, and determined the number of initial groups to form by applying validity indexes. The indicators were selected from a checklist of national, regional and local scientific studies on sustainability. Because the indicators are measured in different units and scales, they were then standardized so that these differences would not interfere with the result and each indicator would carry a similar weight in the calculation of the similarity coefficient. The dissimilarity measure was the Euclidean distance, and the agglomerative coefficient (AC) was used to choose the hierarchical clustering method. Validation indexes were used to establish the initial number of groups. The agglomerative method with the best AC was Ward's, at 0.94, indicating the strongest clustering structure among the agglomerative techniques compared. The Davies-Bouldin (DB), Dunn (D) and Silhouette (SIL) validation indexes indicated that the ideal number of initial clusters is 2, whereas the PBM index indicated 4 groups. As for the most homogeneous municipalities, in the 2-group solution the most similar observations were m105 (Salinópolis) and m109 (Santa Izabel do Pará), followed by m102 (Rio Maria) and m144 (Xinguara), all in group 1. |
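
The workflow summarized in the abstract (standardize the indicators, build Euclidean-distance hierarchies with several linkage methods, compare them by agglomerative coefficient, then pick the number of groups with validity indexes) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the record does not say which software was used, the data here are synthetic, and the helper `agglomerative_coefficient` is a hypothetical function written for this sketch from the usual definition of the coefficient (the mean of one minus each observation's first-merge height divided by the final-merge height). Only the Silhouette and Davies-Bouldin indexes are shown, because scikit-learn does not provide the Dunn or PBM indexes.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.metrics import davies_bouldin_score, silhouette_score


def agglomerative_coefficient(Z, n):
    """Hypothetical helper: mean of 1 - (first-merge height / final-merge height)."""
    first_merge = np.zeros(n)
    for left, right, height, _ in Z:
        for node in (int(left), int(right)):
            if node < n:  # node ids below n are original observations
                first_merge[node] = height
    return float(np.mean(1.0 - first_merge / Z[-1, 2]))


# Synthetic stand-in for the indicator matrix (assumption: 40 units, 4 indicators
# on very different scales, drawn from two well-separated groups).
rng = np.random.default_rng(0)
scales = np.array([1.0, 10.0, 100.0, 1000.0])
X = np.vstack([
    rng.normal(0.0, 1.0, size=(20, 4)),
    rng.normal(5.0, 1.0, size=(20, 4)),
]) * scales

# Standardize each indicator so that units and scales carry similar weight.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
n = X_std.shape[0]

# Compare linkage methods by agglomerative coefficient (Euclidean distance).
ac = {}
for method in ("single", "complete", "average", "ward"):
    Z = linkage(X_std, method=method, metric="euclidean")
    ac[method] = agglomerative_coefficient(Z, n)
best_method = max(ac, key=ac.get)

# Cut the best hierarchy at k = 2..6 and score each partition.
Z_best = linkage(X_std, method=best_method, metric="euclidean")
scores = {}
for k in range(2, 7):
    labels = fcluster(Z_best, t=k, criterion="maxclust")
    scores[k] = (silhouette_score(X_std, labels),
                 davies_bouldin_score(X_std, labels))

best_k_sil = max(scores, key=lambda k: scores[k][0])  # higher Silhouette is better
best_k_db = min(scores, key=lambda k: scores[k][1])   # lower Davies-Bouldin is better
print(ac, best_method, best_k_sil, best_k_db)
```

Coefficients computed this way from SciPy merge heights need not match values reported by other packages, since implementations can scale the Ward merge heights differently; the 0.94 reported in the abstract should not be expected from this synthetic example.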
id |
UNIFEI_1dc2b1000a1c7a7d0d6a7d605429f294 |
---|---|
oai_identifier_str |
oai:ojs.pkp.sfu.ca:article/2067 |
network_acronym_str |
UNIFEI |
network_name_str |
Research, Society and Development |
repository_id_str |
|
spelling |
Comparison of cluster methods agglomerative hierarchical in sustainability indicators in municipalities of Pará state
Comparación de los métodos de agrupación jerárquica aglomerativa en los indicadores de sostenibilidad de los municipios del estado de Pará
Comparação de métodos de agrupamentos hierárquicos aglomerativos em indicadores de sustentabilidade em municípios do estado do Pará
El método de Ward; Distancia euclidiana; Índice de validación; Dendrograma
Método de Ward; Distância euclidiana; Índice de validação; Dendrograma
Ward's method; Euclidean distance; Validation index; Dendrogram
Research, Society and Development
2020-01-01
info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
application/pdf
https://rsdjournal.org/index.php/rsd/article/view/2067
10.33448/rsd-v9i2.2067
Research, Society and Development; Vol. 9 No. 2; e60922067
Research, Society and Development; Vol. 9 Núm. 2; e60922067
Research, Society and Development; v. 9 n. 2; e60922067
2525-3409
reponame:Research, Society and Development
instname:Universidade Federal de Itajubá (UNIFEI)
instacron:UNIFEI
por
https://rsdjournal.org/index.php/rsd/article/view/2067/1695
Copyright (c) 2019 Diêgo Lima Crispim, Lindemberg Lima Fernandes, David Figueiredo Ferreira Filho, Bruna Roberta Pereira Lira
info:eu-repo/semantics/openAccess
Crispim, Diêgo Lima
Fernandes, Lindemberg Lima
Filho, David Figueiredo Ferreira
Lira, Bruna Roberta Pereira
2020-08-20T18:08:48Z
oai:ojs.pkp.sfu.ca:article/2067
Revista
https://rsdjournal.org/index.php/rsd/index
PUB
https://rsdjournal.org/index.php/rsd/oai
rsd.articles@gmail.com
2525-3409
2525-3409
opendoar:2024-01-17T09:26:51.337480
Research, Society and Development - Universidade Federal de Itajubá (UNIFEI)
false |
dc.title.none.fl_str_mv |
Comparison of cluster methods agglomerative hierarchical in sustainability indicators in municipalities of Pará state Comparación de los métodos de agrupación jerárquica aglomerativa en los indicadores de sostenibilidad de los municipios del estado de Pará Comparação de métodos de agrupamentos hierárquicos aglomerativos em indicadores de sustentabilidade em municípios do estado do Pará |
title |
Comparison of cluster methods agglomerative hierarchical in sustainability indicators in municipalities of Pará state |
spellingShingle |
Comparison of cluster methods agglomerative hierarchical in sustainability indicators in municipalities of Pará state Crispim, Diêgo Lima El método de Ward Distancia euclidiana Índice de validación Dendrograma. Método de Ward Distância euclidiana Índice de validação Dendrograma. Ward's method Euclidean distance Validation index Dendrogram. |
author |
Crispim, Diêgo Lima |
author_facet |
Crispim, Diêgo Lima Fernandes, Lindemberg Lima Filho, David Figueiredo Ferreira Lira, Bruna Roberta Pereira |
author_role |
author |
author2 |
Fernandes, Lindemberg Lima Filho, David Figueiredo Ferreira Lira, Bruna Roberta Pereira |
author2_role |
author author author |
dc.contributor.author.fl_str_mv |
Crispim, Diêgo Lima Fernandes, Lindemberg Lima Filho, David Figueiredo Ferreira Lira, Bruna Roberta Pereira |
dc.subject.por.fl_str_mv |
El método de Ward; Distancia euclidiana; Índice de validación; Dendrograma; Método de Ward; Distância euclidiana; Índice de validação; Dendrograma; Ward's method; Euclidean distance; Validation index; Dendrogram |
topic |
El método de Ward; Distancia euclidiana; Índice de validación; Dendrograma; Método de Ward; Distância euclidiana; Índice de validação; Dendrograma; Ward's method; Euclidean distance; Validation index; Dendrogram |
description |
This study compared the performance of hierarchical agglomerative clustering methods on a data set of sustainability indicators for the municipalities of the state of Pará, and determined the number of initial groups to form by applying validity indexes. The indicators were selected from a checklist of national, regional and local scientific studies on sustainability. Because the indicators are measured in different units and scales, they were then standardized so that these differences would not interfere with the result and each indicator would carry a similar weight in the calculation of the similarity coefficient. The dissimilarity measure was the Euclidean distance, and the agglomerative coefficient (AC) was used to choose the hierarchical clustering method. Validation indexes were used to establish the initial number of groups. The agglomerative method with the best AC was Ward's, at 0.94, indicating the strongest clustering structure among the agglomerative techniques compared. The Davies-Bouldin (DB), Dunn (D) and Silhouette (SIL) validation indexes indicated that the ideal number of initial clusters is 2, whereas the PBM index indicated 4 groups. As for the most homogeneous municipalities, in the 2-group solution the most similar observations were m105 (Salinópolis) and m109 (Santa Izabel do Pará), followed by m102 (Rio Maria) and m144 (Xinguara), all in group 1. |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-01-01 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://rsdjournal.org/index.php/rsd/article/view/2067 10.33448/rsd-v9i2.2067 |
url |
https://rsdjournal.org/index.php/rsd/article/view/2067 |
identifier_str_mv |
10.33448/rsd-v9i2.2067 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.relation.none.fl_str_mv |
https://rsdjournal.org/index.php/rsd/article/view/2067/1695 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Research, Society and Development |
publisher.none.fl_str_mv |
Research, Society and Development |
dc.source.none.fl_str_mv |
Research, Society and Development; Vol. 9 No. 2; e60922067 Research, Society and Development; Vol. 9 Núm. 2; e60922067 Research, Society and Development; v. 9 n. 2; e60922067 2525-3409 reponame:Research, Society and Development instname:Universidade Federal de Itajubá (UNIFEI) instacron:UNIFEI |
instname_str |
Universidade Federal de Itajubá (UNIFEI) |
instacron_str |
UNIFEI |
institution |
UNIFEI |
reponame_str |
Research, Society and Development |
collection |
Research, Society and Development |
repository.name.fl_str_mv |
Research, Society and Development - Universidade Federal de Itajubá (UNIFEI) |
repository.mail.fl_str_mv |
rsd.articles@gmail.com |
_version_ |
1797052645439963136 |