The influence of feature grouping algorithm in outlier detection with categorical data

Bibliographic details
Main author: Nathaniel, Sharon Femi Paul Sunder
Publication date: 2024
Other authors: Alwarsamy, Kala, Viswanathan, Rajalakshmi, Subramanian, Ganesh Vaidyanathan, Veerabahu, Vidhya
Document type: Article
Language: eng
Source title: Acta scientiarum. Technology (Online)
Full text: http://www.periodicos.uem.br/ojs/index.php/ActaSciTechnol/article/view/66902
Abstract: Outlier mining has developed rapidly in recent years, with growing importance in fields such as banking, sensor networks, and health care. In general, anomaly detection methods are designed for numerical data and ignore categorical data. However, in real-time problems, both numerical and categorical data must be considered to obtain accurate results. Several methods are available for outlier detection in high-dimensional numerical data. This paper proposes a feature grouping algorithm for anomaly detection that also considers categorical data. The algorithm correlates the features of categorical data, forms feature clusters, and detects the outliers. Features are assigned weights based on their levels of appearance, and outlier scores are determined. The performance of the feature grouping algorithm is then compared with traditional algorithms such as LOF and Isolation Forest and with state-of-the-art methods such as WATCH on UCI datasets. The experimental evaluation shows that the proposed algorithm performs comparatively better than the existing algorithms on categorical data.
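The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of scoring categorical records by how rarely their values appear. It is a simple frequency-based baseline, not the feature grouping algorithm proposed in the paper. The function name `frequency_outlier_scores`, the optional `weights` parameter, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def frequency_outlier_scores(rows, weights=None):
    """Score each row by how rare its categorical values are.

    Illustrative baseline only, not the algorithm proposed in the paper.
    rows: list of equal-length tuples of categorical values.
    weights: optional per-feature weights (hypothetical; default 1.0 each).
    """
    n_rows = len(rows)
    n_features = len(rows[0])
    if weights is None:
        weights = [1.0] * n_features
    # Count how often each value appears in each feature (column).
    counts = [Counter(row[j] for row in rows) for j in range(n_features)]
    scores = []
    for row in rows:
        # A value's rarity is 1 minus its relative frequency in its column.
        score = sum(
            weights[j] * (1.0 - counts[j][row[j]] / n_rows)
            for j in range(n_features)
        )
        scores.append(score)
    return scores


# Toy data (made up): three identical records and one unusual record.
rows = [
    ("red", "small"),
    ("red", "small"),
    ("red", "small"),
    ("blue", "large"),
]
scores = frequency_outlier_scores(rows)
# Index of the record with the highest (most anomalous) score.
most_unusual = max(range(len(rows)), key=scores.__getitem__)
print(most_unusual)
```

A higher score means the record's values are rarer across the dataset. Passing unequal `weights` would let some features count more than others, which is one possible reading of the feature weighting mentioned in the abstract.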
id UEM-6_00907021396737acf4070ca18a5b953d
oai_identifier_str oai:periodicos.uem.br/ojs:article/66902
network_acronym_str UEM-6
network_name_str Acta scientiarum. Technology (Online)
repository_id_str
dc.title.none.fl_str_mv The influence of feature grouping algorithm in outlier detection with categorical data
title The influence of feature grouping algorithm in outlier detection with categorical data
spellingShingle The influence of feature grouping algorithm in outlier detection with categorical data
Nathaniel, Sharon Femi Paul Sunder
outlier detection; feature grouping; categorical data; lof; isolation forest.
title_short The influence of feature grouping algorithm in outlier detection with categorical data
title_full The influence of feature grouping algorithm in outlier detection with categorical data
title_fullStr The influence of feature grouping algorithm in outlier detection with categorical data
title_full_unstemmed The influence of feature grouping algorithm in outlier detection with categorical data
title_sort The influence of feature grouping algorithm in outlier detection with categorical data
author Nathaniel, Sharon Femi Paul Sunder
author_facet Nathaniel, Sharon Femi Paul Sunder
Alwarsamy, Kala
Viswanathan, Rajalakshmi
Subramanian, Ganesh Vaidyanathan
Veerabahu, Vidhya
author_role author
author2 Alwarsamy, Kala
Viswanathan, Rajalakshmi
Subramanian, Ganesh Vaidyanathan
Veerabahu, Vidhya
author2_role author
author
author
author
dc.contributor.author.fl_str_mv Nathaniel, Sharon Femi Paul Sunder
Alwarsamy, Kala
Viswanathan, Rajalakshmi
Subramanian, Ganesh Vaidyanathan
Veerabahu, Vidhya
dc.subject.por.fl_str_mv outlier detection; feature grouping; categorical data; lof; isolation forest.
topic outlier detection; feature grouping; categorical data; lof; isolation forest.
publishDate 2024
dc.date.none.fl_str_mv 2024-04-17
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://www.periodicos.uem.br/ojs/index.php/ActaSciTechnol/article/view/66902
10.4025/actascitechnol.v46i1.66902
url http://www.periodicos.uem.br/ojs/index.php/ActaSciTechnol/article/view/66902
identifier_str_mv 10.4025/actascitechnol.v46i1.66902
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv http://www.periodicos.uem.br/ojs/index.php/ActaSciTechnol/article/view/66902/751375157436
dc.rights.driver.fl_str_mv Copyright (c) 2024 Acta Scientiarum. Technology
http://creativecommons.org/licenses/by/4.0
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Copyright (c) 2024 Acta Scientiarum. Technology
http://creativecommons.org/licenses/by/4.0
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Estadual De Maringá
publisher.none.fl_str_mv Universidade Estadual De Maringá
dc.source.none.fl_str_mv Acta Scientiarum. Technology; Vol 46 No 1 (2024): Em proceso; e66902
Acta Scientiarum. Technology; v. 46 n. 1 (2024): Publicação contínua; e66902
1806-2563
1807-8664
reponame:Acta scientiarum. Technology (Online)
instname:Universidade Estadual de Maringá (UEM)
instacron:UEM
instname_str Universidade Estadual de Maringá (UEM)
instacron_str UEM
institution UEM
reponame_str Acta scientiarum. Technology (Online)
collection Acta scientiarum. Technology (Online)
repository.name.fl_str_mv Acta scientiarum. Technology (Online) - Universidade Estadual de Maringá (UEM)
repository.mail.fl_str_mv actatech@uem.br
_version_ 1799315338405347328