Unification of Numerical and Ordinal Survey Data for Clustering-based Inferencing

Bibliographic details
Main author: Kumar, Bhupendera
Publication date: 2023
Other authors: Kumar, Rajeev
Document type: Article
Language: eng
Source title: INFOCOMP: Jornal de Ciência da Computação
Full text: https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/2492
Abstract: Surveys now cover almost every issue governing our lives, with varied parameters and heterogeneous data, so a researcher must unify these data before extracting inferences from a survey. Data from quantitative surveys are clustered to reveal respondents' divergent and dominant tendencies, with the aim of investigating general trends among the respondents' categories. Because of the unique characteristics of survey data, popular clustering techniques based on value similarity are inadequate. In this paper, we attempt to unify the numerical data of a survey with its ordinal data. We model the data with a Gaussian distribution and therefore first convert the numerical data to ordinal data following that distribution; these may be the governing attributes for deciding the clusters. We then apply $K$-means clustering with varying numbers of clusters. We implement the proposed methodology on real survey data and compare clustering efficiency before and after applying it, across the number of clusters. More crucially, the approach makes appropriate use of the order information of the ordinal attributes and the statistical information of the numerical attributes. Extensive testing demonstrates that the suggested unification works better on real data sets than its contemporaries.
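The pipeline outlined in the abstract (fit a Gaussian to a numerical attribute, convert it to ordinal levels, then run $K$-means on the unified data) might be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state the binning rule, so equal-probability Gaussian quantile cuts are an assumption here, as are the function names `to_ordinal` and `kmeans`, the choice of five levels, the Euclidean distance on ordinal codes, and the sample data.

```python
import random
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def to_ordinal(values, levels=5):
    """Map numeric values to ordinal levels 1..levels.

    Assumption: cut points are equal-probability quantiles of a Gaussian
    fitted to the column (the source abstract does not specify the rule).
    """
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # Constant column: place every value in the middle level.
        return [(levels + 1) // 2] * len(values)
    dist = NormalDist(mu, sigma)
    cuts = [dist.inv_cdf(i / levels) for i in range(1, levels)]
    return [bisect_right(cuts, v) + 1 for v in values]


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's K-means over a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(mean(col) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical survey columns: one numerical, one already ordinal (1..5).
ages = [22, 25, 31, 38, 45, 52, 60, 67]
satisfaction = [1, 2, 2, 3, 3, 4, 5, 5]

age_levels = to_ordinal(ages, levels=5)       # numerical -> ordinal
rows = list(zip(age_levels, satisfaction))    # unified ordinal data
labels, centers = kmeans(rows, k=2)           # vary k to compare clusterings
```

Converting the numerical column to the same 1..5 scale as the ordinal column keeps any single attribute from dominating the distance computation purely because of its units, which is one plausible reading of why the unification step precedes clustering.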
id UFLA-5_605573cc3eeea18bd6a5581e063647b8
oai_identifier_str oai:infocomp.dcc.ufla.br:article/2492
network_acronym_str UFLA-5
network_name_str INFOCOMP: Jornal de Ciência da Computação
repository_id_str
author Kumar, Bhupendera
author_facet Kumar, Bhupendera
Kumar, Rajeev
author_role author
author2 Kumar, Rajeev
author2_role author
dc.contributor.author.fl_str_mv Kumar, Bhupendera
Kumar, Rajeev
publishDate 2023
dc.date.none.fl_str_mv 2023-07-09
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/2492
url https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/2492
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/2492/598
dc.rights.driver.fl_str_mv Copyright (c) 2023 Bhupendera Kumar, Rajeev Kumar
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Copyright (c) 2023 Bhupendera Kumar, Rajeev Kumar
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Editora da UFLA
publisher.none.fl_str_mv Editora da UFLA
dc.source.none.fl_str_mv INFOCOMP Journal of Computer Science; Vol. 22 No. 1 (2023): June 2023
1982-3363
1807-4545
reponame:INFOCOMP: Jornal de Ciência da Computação
instname:Universidade Federal de Lavras (UFLA)
instacron:UFLA
instname_str Universidade Federal de Lavras (UFLA)
instacron_str UFLA
institution UFLA
reponame_str INFOCOMP: Jornal de Ciência da Computação
collection INFOCOMP: Jornal de Ciência da Computação
repository.name.fl_str_mv INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA)
repository.mail.fl_str_mv infocomp@dcc.ufla.br||apfreire@dcc.ufla.br
_version_ 1799874742705979392