Automating machine learning processes of the DNS framework for malicious domain detection (Automatização de processos de machine learning do framework DNS para a detecção de domínios maliciosos)
Main author: | Gardini, Victor Fernandes |
---|---|
Publication date: | 2022 |
Document type: | Bachelor's thesis (Trabalho de conclusão de curso) |
Language: | por (Portuguese) |
Source title: | Repositório Institucional da UNESP |
Full text: | http://hdl.handle.net/11449/216224 |
Abstract: | Domain names are widely used to carry out malicious activity on the internet. The problem is global, and Brazil ranks among the countries most affected by phishing attacks. Among the approaches studied in academia, using machine learning to classify domains as malicious or legitimate stands out. A three-stage classification of domains was proposed, with the stages connected through a single system called the DNS framework. That system can train new models and accept new datasets for the training step, but the previous models and lists are discarded in the process. This work therefore surveyed the techniques the academic community uses to manage machine learning models, practices commonly grouped under the term MLOps. Based on them, a new system was built that can store, version, and monitor models, lists, and system logs, and it was then integrated with the framework. As a result, each stage can independently build its training sets incrementally through well-defined operations without losing the results of earlier runs, and new models can be created through an automated pipeline and made available to the production environment. |
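
The abstract's central idea, storing and versioning models and lists so that retraining never discards earlier ones, can be illustrated with a minimal sketch. This is not the thesis's implementation: the `ArtifactRegistry` class, its on-disk layout, and the artifact name `stage1-model` are hypothetical and use only the Python standard library.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


class ArtifactRegistry:
    """Hypothetical append-only store: every save creates a new version."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _index_path(self, name):
        return self.root / name / "index.json"

    def versions(self, name):
        """Return the metadata entries recorded for an artifact, oldest first."""
        path = self._index_path(name)
        return json.loads(path.read_text()) if path.exists() else []

    def save(self, name, payload, note=""):
        """Write payload (bytes) as the next version and return its number."""
        entries = self.versions(name)
        version = len(entries) + 1
        target = self.root / name / f"v{version}.bin"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(payload)
        entries.append({
            "version": version,
            "sha256": hashlib.sha256(payload).hexdigest(),
            "created": time.time(),
            "note": note,
        })
        self._index_path(name).write_text(json.dumps(entries, indent=2))
        return version

    def load(self, name, version=None):
        """Read a given version, or the latest one when version is None."""
        entries = self.versions(name)
        if not entries:
            raise KeyError(name)
        version = version or entries[-1]["version"]
        return (self.root / name / f"v{version}.bin").read_bytes()


def demo():
    """Save two versions of one artifact and read the first one back."""
    with tempfile.TemporaryDirectory() as tmp:
        registry = ArtifactRegistry(tmp)
        registry.save("stage1-model", b"weights-a", note="initial training")
        registry.save("stage1-model", b"weights-b", note="retrained")
        return registry.load("stage1-model", version=1)
```

Because `save` only ever appends, a retraining run adds a new version next to the old one instead of replacing it, which is the property the abstract says the original framework lacked. A production system would more likely rely on an existing model registry than on flat files.
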
id | UNSP_c83e10f6ce58368e224c3b228a91693d |
---|---|
oai_identifier_str | oai:repositorio.unesp.br:11449/216224 |
network_acronym_str | UNSP |
network_name_str | Repositório Institucional da UNESP |
repository_id_str | 2946 |
spelling | Automatização de processos de machine learning do framework DNS para a detecção de domínios maliciosos; Automating machine learning processes of the DNS framework for malicious domain detection; Cybersecurity; Machine learning; Automation; DNS framework; Cibersegurança; MLOps; Automatização; Framework DNS; Fundação para o Desenvolvimento da UNESP (FUNDUNESP); NIC.br: 2764/2018; Universidade Estadual Paulista (Unesp); Cansian, Adriano Mauro [UNESP]; Universidade Estadual Paulista (Unesp); Gardini, Victor Fernandes; 2022-01-31T20:16:56Z; 2022-01-31T20:16:56Z; 2022-01-13; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/bachelorThesis; application/pdf; http://hdl.handle.net/11449/216224; por; info:eu-repo/semantics/openAccess; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; 2023-12-28T06:16:50Z; oai:repositorio.unesp.br:11449/216224; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-08-05T21:30:16.379025; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
dc.title.none.fl_str_mv | Automatização de processos de machine learning do framework DNS para a detecção de domínios maliciosos / Automating machine learning processes of the DNS framework for malicious domain detection |
author | Gardini, Victor Fernandes |
author_facet | Gardini, Victor Fernandes |
author_role | author |
dc.contributor.none.fl_str_mv | Cansian, Adriano Mauro [UNESP]; Universidade Estadual Paulista (Unesp) |
dc.contributor.author.fl_str_mv | Gardini, Victor Fernandes |
dc.subject.por.fl_str_mv | Cybersecurity; Machine learning; Automation; DNS framework; Cibersegurança; MLOps; Automatização; Framework DNS |
publishDate | 2022 |
dc.date.none.fl_str_mv | 2022-01-31T20:16:56Z; 2022-01-31T20:16:56Z; 2022-01-13 |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/bachelorThesis |
format | bachelorThesis |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://hdl.handle.net/11449/216224 |
url | http://hdl.handle.net/11449/216224 |
dc.language.iso.fl_str_mv | por |
language | por |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.publisher.none.fl_str_mv | Universidade Estadual Paulista (Unesp) |
publisher.none.fl_str_mv | Universidade Estadual Paulista (Unesp) |
dc.source.none.fl_str_mv | reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
instname_str | Universidade Estadual Paulista (UNESP) |
instacron_str | UNESP |
institution | UNESP |
reponame_str | Repositório Institucional da UNESP |
collection | Repositório Institucional da UNESP |
repository.name.fl_str_mv | Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv | |
_version_ | 1808129327944957952 |
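
The record carries an OAI identifier and names an OAI request endpoint, so the standard OAI-PMH `GetRecord` request for it can be sketched as below. The base URL and identifier are copied from the record. The `oai_dc` metadata prefix is an assumption (it is the format OAI-PMH requires every repository to support), the helper name `get_record_url` is hypothetical, and whether the endpoint still answers at this address has not been checked.

```python
from urllib.parse import urlencode

# Values taken from the record above.
OAI_BASE_URL = "http://repositorio.unesp.br/oai/request"
OAI_IDENTIFIER = "oai:repositorio.unesp.br:11449/216224"


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord URL (hypothetical helper, no network access)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,  # assumed: oai_dc
    })
    return f"{base_url}?{query}"


record_url = get_record_url(OAI_BASE_URL, OAI_IDENTIFIER)
```

The function only assembles the URL. Fetching and parsing the XML response is left out so that the sketch makes no claim about what the server returns.
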