Early Identification of Abused Domains in TLD through Passive DNS Applying Machine Learning Techniques

Silva, Leandro Marcos da [UNESP]; Silveira, Marcos Rogério [UNESP]; Cansian, Adriano Mauro [UNESP]; Kobayashi, Hugo Koji

Bibliographic details
Main author:	Silva, Leandro Marcos da [UNESP]
Publication date:	2022
Other authors:	Silveira, Marcos Rogério [UNESP]; Cansian, Adriano Mauro [UNESP]; Kobayashi, Hugo Koji
Document type:	Article
Language:	eng
Source title:	Repositório Institucional da UNESP
Full text:	http://dx.doi.org/10.54039/ijcnis.v14i1.5256 http://hdl.handle.net/11449/240917
Abstract:	DNS is vital to the proper functioning of the Internet. However, malicious users exploit this infrastructure by registering domains and abusing them as tools for a wide variety of attacks. Early detection of abused domains therefore keeps more people from falling for scams. This work develops an approach for identifying abused domains using passive DNS data collected from an authoritative TLD DNS server and enriched with geolocation, which provides a global view of the domains. The system monitors the first seven days of a domain's life after its first DNS query and performs two behavior checks, the first at three days and the second at seven days. The models use the LightGBM machine learning algorithm and, because the data are imbalanced, a combination of the Cluster Centroids and K-Means SMOTE techniques. The approach obtained an average AUC of 0.9673 for the three-day model and 0.9674 for the seven-day model. Validation in a test environment reached a TPR of 0.8656 for the three-day model and 0.8682 for the seven-day model. The results indicate that the system performs satisfactorily for the early identification of abused domains and highlight the importance of a TLD in identifying these domains.
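
The abstract reports its results as AUC and TPR. As a reading aid, the sketch below shows how these two metrics are conventionally computed for a binary classifier. It is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, with 1 standing for an abused domain and 0 for a benign one.

```python
from typing import Sequence


def true_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """TPR (recall) = TP / (TP + FN), where 1 = abused and 0 = benign."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("TPR is undefined without positive samples")
    return tp / (tp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: labels and model scores for eight domains.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.05, 0.7]
# Assumed decision threshold of 0.5 (not taken from the paper).
preds = [1 if s >= 0.5 else 0 for s in scores]

auc = roc_auc(labels, scores)             # 15 of 16 positive/negative pairs ordered correctly
tpr = true_positive_rate(labels, preds)   # 3 of 4 abused domains flagged
```

On this toy data the AUC is 0.9375 and the TPR is 0.75. AUC is threshold-free and summarizes how well the scores rank abused domains above benign ones, whereas TPR depends on the chosen threshold, which is one reason the two reported figures are not directly comparable.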

Item metadata

id	UNSP_76dcd1bdcf7902da01da0e48b2cfbef8
oai_identifier_str	oai:repositorio.unesp.br:11449/240917
network_acronym_str	UNSP
network_name_str	Repositório Institucional da UNESP
repository_id_str	2946
spelling	Early Identification of Abused Domains in TLD through Passive DNS Applying Machine Learning Techniques | abused domains in TLD | cybersecurity | data imbalanced | machine learning algorithms | passive DNS | Sao Paulo State University (UNESP), Department of Computer Science and Statistics (DCCE) | Brazilian Network Information Center (NIC.br) | Universidade Estadual Paulista (UNESP) | Silva, Leandro Marcos da [UNESP] | Silveira, Marcos Rogério [UNESP] | Cansian, Adriano Mauro [UNESP] | Kobayashi, Hugo Koji | 2023-03-01T20:38:25Z | 2022-04-01 | info:eu-repo/semantics/publishedVersion | info:eu-repo/semantics/article | 76-85 | http://dx.doi.org/10.54039/ijcnis.v14i1.5256 | International Journal of Communication Networks and Information Security, v. 14, n. 1, p. 76-85, 2022. | 2073-607X | 2076-0930 | http://hdl.handle.net/11449/240917 | 10.54039/ijcnis.v14i1.5256 | 2-s2.0-85129288286 | Scopus | reponame:Repositório Institucional da UNESP | instname:Universidade Estadual Paulista (UNESP) | instacron:UNESP | eng | info:eu-repo/semantics/openAccess | oai:repositorio.unesp.br:11449/240917 | http://repositorio.unesp.br/oai/request | opendoar:2946 | 2024-08-05T17:36:59.860412
dc.title.none.fl_str_mv	Early Identification of Abused Domains in TLD through Passive DNS Applying Machine Learning Techniques
title	Early Identification of Abused Domains in TLD through Passive DNS Applying Machine Learning Techniques
spellingShingle	Early Identification of Abused Domains in TLD through Passive DNS Applying Machine Learning Techniques Silva, Leandro Marcos da [UNESP] abused domains in TLD cybersecurity data imbalanced machine learning algorithms passive DNS
author	Silva, Leandro Marcos da [UNESP]
author_facet	Silva, Leandro Marcos da [UNESP]; Silveira, Marcos Rogério [UNESP]; Cansian, Adriano Mauro [UNESP]; Kobayashi, Hugo Koji
author_role	author
author2	Silveira, Marcos Rogério [UNESP]; Cansian, Adriano Mauro [UNESP]; Kobayashi, Hugo Koji
author2_role	author; author; author
dc.contributor.none.fl_str_mv	Universidade Estadual Paulista (UNESP); Brazilian Network Information Center (NIC.br)
dc.contributor.author.fl_str_mv	Silva, Leandro Marcos da [UNESP]; Silveira, Marcos Rogério [UNESP]; Cansian, Adriano Mauro [UNESP]; Kobayashi, Hugo Koji
dc.subject.por.fl_str_mv	abused domains in TLD; cybersecurity; data imbalanced; machine learning algorithms; passive DNS
topic	abused domains in TLD; cybersecurity; data imbalanced; machine learning algorithms; passive DNS
publishDate	2022
dc.date.none.fl_str_mv	2022-04-01 2023-03-01T20:38:25Z 2023-03-01T20:38:25Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://dx.doi.org/10.54039/ijcnis.v14i1.5256 International Journal of Communication Networks and Information Security, v. 14, n. 1, p. 76-85, 2022. 2073-607X 2076-0930 http://hdl.handle.net/11449/240917 10.54039/ijcnis.v14i1.5256 2-s2.0-85129288286
url	http://dx.doi.org/10.54039/ijcnis.v14i1.5256 http://hdl.handle.net/11449/240917
identifier_str_mv	International Journal of Communication Networks and Information Security, v. 14, n. 1, p. 76-85, 2022. 2073-607X 2076-0930 10.54039/ijcnis.v14i1.5256 2-s2.0-85129288286
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	International Journal of Communication Networks and Information Security
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	76-85
dc.source.none.fl_str_mv	Scopus reponame:Repositório Institucional da UNESP instname:Universidade Estadual Paulista (UNESP) instacron:UNESP
instname_str	Universidade Estadual Paulista (UNESP)
instacron_str	UNESP
institution	UNESP
reponame_str	Repositório Institucional da UNESP
collection	Repositório Institucional da UNESP
repository.name.fl_str_mv	Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
repository.mail.fl_str_mv
_version_	1808128835122626560
