Unification of Numerical and Ordinal Survey Data for Clustering-based Inferencing
Main author: | Kumar, Bhupendera |
---|---|
Publication date: | 2023 |
Other authors: | Kumar, Rajeev |
Document type: | Article |
Language: | eng |
Source title: | INFOCOMP: Jornal de Ciência da Computação |
Full text: | https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/2492 |
Abstract: | With the proliferation of surveys on almost every issue governing our lives, each with various parameters and a variety of data types, a researcher must unify these data before extracting inferences from a survey. Data from quantitative surveys are clustered to reveal respondents' divergent and dominant tendencies; the aim is to investigate the general trends among the respondents' categories. Due to the unique characteristics of survey data, popular clustering techniques based on value similarity are inadequate. In this paper, we attempt to unify the numerical data with the ordinal data of a survey. We model the data with a Gaussian distribution; accordingly, we first convert the numerical data to ordinal data following that distribution, since these may be the governing attributes for deciding the clusters. Then, we use $K$-means clustering with varying numbers of clusters. We implement the proposed methodology on real survey data and compare clustering efficiency before and after applying it, across different numbers of clusters. More crucially, the approach appropriately uses the order information of the ordinal attributes and the statistical information of the numerical attributes for clustering. Extensive testing demonstrates that the suggested unification performs better on real data sets than contemporary approaches. |
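The pipeline outlined in the abstract (convert numerical attributes to ordinal levels under a Gaussian model, then run $K$-means for several values of $K$) might look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' implementation: the z-score cut points in `CUTS`, the five-level scale, the helper names `to_ordinal`, `kmeans`, and `inertia`, and the toy `ages` and `satisfaction` columns are all hypothetical, and within-cluster squared error is used here as a stand-in for whatever efficiency measure the paper reports.

```python
import random
import statistics

# Hypothetical z-score cut points giving five ordinal levels (an assumption,
# not taken from the paper).
CUTS = (-1.5, -0.5, 0.5, 1.5)


def to_ordinal(values, cuts=CUTS):
    """Map numeric values to ordinal levels 1..len(cuts)+1 by z-score band."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        # Constant column: assign every respondent the middle level.
        return [len(cuts) // 2 + 1] * len(values)
    return [1 + sum((v - mu) / sigma > c for c in cuts) for v in values]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for each point.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: mean of each cluster; an empty cluster keeps its center.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def inertia(points, labels, centers):
    """Within-cluster sum of squared distances (lower is tighter)."""
    return sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))


# Toy survey: one numerical column (age) and one ordinal column (1-5 rating).
ages = [22, 25, 31, 38, 44, 52, 61, 67]
satisfaction = [1, 2, 2, 3, 3, 4, 5, 5]

age_levels = to_ordinal(ages)                 # numerical -> ordinal
points = list(zip(age_levels, satisfaction))  # unified ordinal feature vectors

# Vary the number of clusters and record the within-cluster error for each.
results = {}
for k in (2, 3, 4):
    labels, centers = kmeans(points, k)
    results[k] = inertia(points, labels, centers)
```

Placing both attribute types on the same ordinal scale keeps the Euclidean distance used by $K$-means from being dominated by the raw numeric range; whether equal-width z-score bands are the right discretization is a design choice this sketch does not settle.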
id |
UFLA-5_605573cc3eeea18bd6a5581e063647b8 |
---|---|
oai_identifier_str |
oai:infocomp.dcc.ufla.br:article/2492 |
network_acronym_str |
UFLA-5 |
network_name_str |
INFOCOMP: Jornal de Ciência da Computação |
repository_id_str |
|
dc.title.none.fl_str_mv |
Unification of Numerical and Ordinal Survey Data for Clustering-based Inferencing |
author |
Kumar, Bhupendera |
author_facet |
Kumar, Bhupendera Kumar, Rajeev |
author_role |
author |
author2 |
Kumar, Rajeev |
author2_role |
author |
dc.contributor.author.fl_str_mv |
Kumar, Bhupendera Kumar, Rajeev |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-07-09 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/2492 |
url |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/2492 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/2492/598 |
dc.rights.driver.fl_str_mv |
Copyright (c) 2023 Bhupendera Kumar, Rajeev Kumar info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Copyright (c) 2023 Bhupendera Kumar, Rajeev Kumar |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Editora da UFLA |
publisher.none.fl_str_mv |
Editora da UFLA |
dc.source.none.fl_str_mv |
INFOCOMP Journal of Computer Science; Vol. 22 No. 1 (2023): June 2023 1982-3363 1807-4545 reponame:INFOCOMP: Jornal de Ciência da Computação instname:Universidade Federal de Lavras (UFLA) instacron:UFLA |
instname_str |
Universidade Federal de Lavras (UFLA) |
instacron_str |
UFLA |
institution |
UFLA |
reponame_str |
INFOCOMP: Jornal de Ciência da Computação |
collection |
INFOCOMP: Jornal de Ciência da Computação |
repository.name.fl_str_mv |
INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA) |
repository.mail.fl_str_mv |
infocomp@dcc.ufla.br||apfreire@dcc.ufla.br |
_version_ |
1799874742705979392 |