A comparative analysis of undersampling techniques for network intrusion detection systems design

Bibliographic details
Main author: Silva, Bruno Riccelli dos Santos
Publication date: 2021
Other authors: Silveira, Ricardo Jardel; Silva Neto, Manuel Gonçalves da; Cortez, Paulo César; Gomes, Danielo Gonçalves
Document type: Article
Language: eng
Source title: Repositório Institucional da Universidade Federal do Ceará (UFC)
Full text: http://www.repositorio.ufc.br/handle/riufc/70571
Abstract: Intrusion Detection Systems (IDS) are one of the leading solutions adopted in network security to prevent intrusions and keep data and services secure. To fulfil this role, an IDS must be both accurate and efficient in processing time. Undersampling techniques allow classifiers to be evaluated on smaller, representative subsets, aiming for high accuracy-related metrics in less processing time. The literature offers several solutions for IDS design, but some criteria are often not respected, such as the adoption of a replicable methodology. In this work, we selected three undersampling methods (random, Cluster centroids, and NearMiss) and applied them to two recent unbalanced datasets (CIC2017 and CIC2018) to compare five classifiers using cross-validation and the Wilcoxon statistical test. Our main contribution is a systematic and replicable methodology for using undersampling techniques to balance the datasets adopted in IDS design. We use three metrics to choose a classifier in an IDS design: accuracy, F1-measure, and processing time. The results indicate that undersampling by Cluster centroids gives the best performance when applied to distance-based classifiers. Moreover, undersampling techniques influence the choice of the best classifier in the design of an IDS.
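As a rough illustration of the simplest of the three techniques named in the abstract, random undersampling, the sketch below keeps a random subset of each class so that every class ends up as large as the rarest one. It is not the authors' implementation: the function name `random_undersample`, the `seed` parameter, and the toy records are all hypothetical, and only the Python standard library is used.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop samples so every class keeps as many rows as the rarest class.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # The rarest class sets the target size for all classes.
    target = min(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample without replacement from each class.
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy, made-up records: 8 "benign" flows versus 2 "attack" flows.
rows = [[i, i * 2] for i in range(10)]
labels = ["benign"] * 8 + ["attack"] * 2

balanced_rows, balanced_labels = random_undersample(rows, labels)
print(Counter(balanced_labels))
```

Cluster centroids and NearMiss differ in how they pick the majority-class samples to keep (cluster centres and distance to minority samples, respectively) and are not shown here.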
id UFC-7_27bde1c456779523c27d024d38acb3c6
oai_identifier_str oai:repositorio.ufc.br:riufc/70571
network_acronym_str UFC-7
network_name_str Repositório Institucional da Universidade Federal do Ceará (UFC)
dc.title.none.fl_str_mv A comparative analysis of undersampling techniques for network intrusion detection systems design
dc.contributor.author.fl_str_mv Silva, Bruno Riccelli dos Santos
Silveira, Ricardo Jardel
Silva Neto, Manuel Gonçalves da
Cortez, Paulo César
Gomes, Danielo Gonçalves
dc.subject.por.fl_str_mv Intrusion detection systems
Undersampling
CICIDS2017
CICIDS2018
publishDate 2021
dc.date.none.fl_str_mv 2021
2023-02-08T13:19:05Z
2023-02-08T13:19:05Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv CORTEZ, P. C. et al. A comparative analysis of undersampling techniques for network intrusion detection systems design. Journal of Communication and Information Systems, [s.l.], v. 36, n. 1, p. 31-43, 2021. DOI: https://doi.org/10.14209/jcis.2021.3
1980-6604
http://www.repositorio.ufc.br/handle/riufc/70571
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Journal of Communication and Information Systems
dc.source.none.fl_str_mv reponame:Repositório Institucional da Universidade Federal do Ceará (UFC)
instname:Universidade Federal do Ceará (UFC)
instacron:UFC
repository.name.fl_str_mv Repositório Institucional da Universidade Federal do Ceará (UFC) - Universidade Federal do Ceará (UFC)
repository.mail.fl_str_mv bu@ufc.br || repositorio@ufc.br