Early Identification of Abused Domains in TLD through Passive DNS Applying Machine Learning Techniques

Bibliographic details
Main author: Silva, Leandro Marcos da [UNESP]
Publication date: 2022
Other authors: Silveira, Marcos Rogério [UNESP], Cansian, Adriano Mauro [UNESP], Kobayashi, Hugo Koji
Document type: Article
Language: English
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.54039/ijcnis.v14i1.5256
http://hdl.handle.net/11449/240917
Abstract: DNS is vital to the proper functioning of the Internet. However, some users exploit this infrastructure to register domains and abuse them as tools for a wide variety of attacks. Early detection of abused domains therefore prevents more people from falling for scams. In this work, an approach for identifying abused domains was developed using passive DNS data collected from an authoritative TLD DNS server and enriched with geolocation data, which provides a global view of the domains. The system monitors a domain's first seven days of life after its first DNS query and performs two behavior checks, the first at three days and the second at seven days. The models are built with the LightGBM machine learning algorithm and, because the data are imbalanced, a combination of the Cluster Centroids and K-Means SMOTE techniques was used. The approach obtained an average AUC of 0.9673 for the three-day model and 0.9674 for the seven-day model. Finally, validation of the three-day and seven-day models in a test environment reached a TPR of 0.8656 and 0.8682, respectively. The results indicate that the system performs satisfactorily in the early identification of abused domains and show the importance of a TLD in identifying these domains.
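The abstract reports its results as AUC and TPR. As a reading aid, the following minimal Python sketch shows how these two metrics are defined for a binary classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and were chosen only for illustration.

```python
from typing import Sequence


def true_positive_rate(labels: Sequence[int], scores: Sequence[float],
                       threshold: float = 0.5) -> float:
    """TPR = TP / (TP + FN): the share of abused domains that are flagged.

    The 0.5 threshold is an assumption; the record does not state one.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not positives:
        raise ValueError("TPR is undefined without positive samples")
    flagged = sum(1 for s in positives if s >= threshold)
    return flagged / len(positives)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is quadratic in the sample
    size, which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = abused domain, 0 = benign domain.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

print(f"AUC = {roc_auc(labels, scores):.4f}")
print(f"TPR = {true_positive_rate(labels, scores):.4f}")
```

AUC does not depend on a decision threshold, whereas TPR does, so the two reported figures answer different questions: how well the model ranks domains overall, and how many abused domains it catches at the chosen operating point.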
id UNSP_76dcd1bdcf7902da01da0e48b2cfbef8
oai_identifier_str oai:repositorio.unesp.br:11449/240917
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
Affiliations: Sao Paulo State University (UNESP), Department of Computer Science and Statistics (DCCE); Brazilian Network Information Center (NIC.br)
dc.contributor.none.fl_str_mv Universidade Estadual Paulista (UNESP)
Brazilian Network Information Center (NIC.br)
dc.contributor.author.fl_str_mv Silva, Leandro Marcos da [UNESP]
Silveira, Marcos Rogério [UNESP]
Cansian, Adriano Mauro [UNESP]
Kobayashi, Hugo Koji
dc.subject.por.fl_str_mv abused domains in TLD
cybersecurity
data imbalanced
machine learning algorithms
passive DNS
dc.date.none.fl_str_mv 2022-04-01
2023-03-01T20:38:25Z
2023-03-01T20:38:25Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.54039/ijcnis.v14i1.5256
International Journal of Communication Networks and Information Security, v. 14, n. 1, p. 76-85, 2022.
2073-607X
2076-0930
http://hdl.handle.net/11449/240917
10.54039/ijcnis.v14i1.5256
2-s2.0-85129288286
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv International Journal of Communication Networks and Information Security
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv 76-85
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
_version_ 1808128835122626560