Cluster Analysis in Practice: Dealing with Outliers in Managerial Research

Bibliographic details
Main author: Lopes, Humberto Elias Garcia
Publication date: 2021
Other authors: Gosling, Marlusa de Sevilha
Document type: Article
Language: eng
Source title: RAC. Revista de Administração Contemporânea (Online)
Full text: http://old.scielo.br/scielo.php?script=sci_arttext&pid=S1415-65552021000100502
Abstract: Context: in recent years, cluster analysis has stimulated researchers to explore new ways to understand data behavior. The computational ease of this method and its ability to generate consistent outputs, even in small datasets, explain this to some extent. However, researchers are often mistaken in holding that clustering is a terrain in which anything goes. The literature shows the opposite: they must be careful, especially regarding the effect of outliers on cluster formation. Objective: in this tutorial paper, we contribute to this discussion by presenting four clustering techniques and their respective advantages and disadvantages in the treatment of outliers. Methods: we worked from a managerial dataset and analyzed it using the k-means, PAM, DBSCAN, and FCM techniques. Results: our analyses indicate that researchers have distinct clustering techniques available for dealing with outliers. Conclusion: we concluded that researchers need a more diversified repertoire of clustering techniques. This would give them two relevant empirical alternatives: choosing the most appropriate technique for their research objectives or adopting a multi-method approach.
id ANPPGA-1_054bcf57de1df1586025952f15104950
oai_identifier_str oai:scielo:S1415-65552021000100502
network_acronym_str ANPPGA-1
network_name_str RAC. Revista de Administração Contemporânea (Online)
repository_id_str
spelling Cluster Analysis in Practice: Dealing with Outliers in Managerial Research | cluster analysis | outliers | k-means | DBSCAN | fuzzy clustering | Associação Nacional de Pós-Graduação e Pesquisa em Administração | 2021-01-01 | info:eu-repo/semantics/article | info:eu-repo/semantics/publishedVersion | text/html | http://old.scielo.br/scielo.php?script=sci_arttext&pid=S1415-65552021000100502 | Revista de Administração Contemporânea v.25 n.1 2021 | reponame:RAC. Revista de Administração Contemporânea (Online) | instname:Associação Nacional de Pós-Graduação e Pesquisa em Administração (ANPAD) | instacron:ANPAD | 10.1590/1982-7849rac2021200081 | info:eu-repo/semantics/openAccess | Lopes,Humberto Elias Garcia | Gosling,Marlusa de Sevilha | eng | 2020-10-16T00:00:00Z | oai:scielo:S1415-65552021000100502 | Revista | https://rac.anpad.org.br/index.php/rac | ONG | https://rac.anpad.org.br/index.php/rac/oai | rac@anpad.org.br | 1982-7849 | 1415-6555 | opendoar:2020-10-16T00:00 | RAC. Revista de Administração Contemporânea (Online) - Associação Nacional de Pós-Graduação e Pesquisa em Administração (ANPAD) | false
dc.title.none.fl_str_mv Cluster Analysis in Practice: Dealing with Outliers in Managerial Research
title Cluster Analysis in Practice: Dealing with Outliers in Managerial Research
spellingShingle Cluster Analysis in Practice: Dealing with Outliers in Managerial Research
Lopes,Humberto Elias Garcia
cluster analysis
outliers
k-means
DBSCAN
fuzzy clustering
author Lopes,Humberto Elias Garcia
author_facet Lopes,Humberto Elias Garcia
Gosling,Marlusa de Sevilha
author_role author
author2 Gosling,Marlusa de Sevilha
author2_role author
dc.contributor.author.fl_str_mv Lopes,Humberto Elias Garcia
Gosling,Marlusa de Sevilha
dc.subject.por.fl_str_mv cluster analysis
outliers
k-means
DBSCAN
fuzzy clustering
topic cluster analysis
outliers
k-means
DBSCAN
fuzzy clustering
publishDate 2021
dc.date.none.fl_str_mv 2021-01-01
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://old.scielo.br/scielo.php?script=sci_arttext&pid=S1415-65552021000100502
url http://old.scielo.br/scielo.php?script=sci_arttext&pid=S1415-65552021000100502
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 10.1590/1982-7849rac2021200081
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv text/html
dc.publisher.none.fl_str_mv Associação Nacional de Pós-Graduação e Pesquisa em Administração
publisher.none.fl_str_mv Associação Nacional de Pós-Graduação e Pesquisa em Administração
dc.source.none.fl_str_mv Revista de Administração Contemporânea v.25 n.1 2021
reponame:RAC. Revista de Administração Contemporânea (Online)
instname:Associação Nacional de Pós-Graduação e Pesquisa em Administração (ANPAD)
instacron:ANPAD
instname_str Associação Nacional de Pós-Graduação e Pesquisa em Administração (ANPAD)
instacron_str ANPAD
institution ANPAD
reponame_str RAC. Revista de Administração Contemporânea (Online)
collection RAC. Revista de Administração Contemporânea (Online)
repository.name.fl_str_mv RAC. Revista de Administração Contemporânea (Online) - Associação Nacional de Pós-Graduação e Pesquisa em Administração (ANPAD)
repository.mail.fl_str_mv rac@anpad.org.br