Enhancing SOC threat detection and classification with ML

Bibliographic details
Main author: Pereira, Guilherme Amaral Ribeiro
Publication date: 2023
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10773/41059
Abstract: The rise of the internet has brought many benefits to society. However, it has also created pathways for malicious actors to exploit organizations by damaging and stealing their assets. Consequently, organizations employ mechanisms to manage and strengthen their security, and Security Operation Centers (SOCs) in particular have grown in popularity for this reason. One of a SOC’s priorities is to detect threats before they can damage the organization’s assets. However, SOCs and their tools cannot keep up with these actors’ novel, complex, and stealthy strategies, particularly when they rely on predefined rules with fixed thresholds applied to the vast amounts of data generated within their environments. Anomaly detection powered by Machine Learning (ML) can complement SOCs and their tools by discovering hidden patterns in that data. Although many works propose using ML to detect threats, several rely on curated datasets or specific attack types, so their performance will not necessarily carry over to the real world, where threats and the expected behaviors of users are diverse and ever-evolving. Moreover, establishing a ground truth in these scenarios is often impossible due to the nature of the anomalies and the vast amounts of data. Most approaches are also not flexible enough to adapt quickly to the heterogeneity of the available data. A further issue when applying ML to anomaly and threat detection is that security analysts must manually investigate the detection results, because benign activity can also be anomalous. Manually analyzing anomalies that turn out to be caused by benign activity burdens security analysts. This work proposes an improved ML-based architecture to help SOCs detect threats in vast amounts of data. The architecture combines ML-based anomaly detection, which isolates unusual behaviors, with a second, supervised ML layer that takes over the security analysts’ classification of anomalies as benign or threatening. The solution was implemented and evaluated in a real-world SOC, where flow-level data was used to test the architecture. So far, it has identified numerous behaviors that do not comply with the organization’s policies, such as the use of prohibited applications. Moreover, the system was tested against a set of attacks, including port scans, Denial of Service (DoS), and complex data exfiltration scenarios. The test results demonstrate the system’s capability to detect attacks that the organization’s Security Information and Event Management (SIEM) system failed to detect.
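The two-layer idea described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the record does not name the models or features used, so the robust z-score detector, the nearest-centroid classifier, the three flow features, the threshold of 6.0, and all sample values below are assumptions chosen only to keep the example self-contained.

```python
from statistics import mean, median

# Hypothetical flow features: (bytes_sent, distinct_dst_ports, duration_s).
Flow = tuple[float, float, float]


class RobustZScoreDetector:
    """Layer 1 (unsupervised): flag flows that deviate strongly from a baseline."""

    def __init__(self, threshold: float = 6.0):
        self.threshold = threshold  # assumed cut-off, not taken from the source

    def fit(self, flows: list[Flow]) -> "RobustZScoreDetector":
        columns = list(zip(*flows))
        self.medians = [median(col) for col in columns]
        # Median absolute deviation per feature; use 1.0 when it is zero.
        self.mads = [
            median(abs(v - m) for v in col) or 1.0
            for col, m in zip(columns, self.medians)
        ]
        return self

    def normalize(self, flow: Flow) -> tuple[float, ...]:
        return tuple((v - m) / d for v, m, d in zip(flow, self.medians, self.mads))

    def is_anomalous(self, flow: Flow) -> bool:
        return max(abs(z) for z in self.normalize(flow)) > self.threshold


class NearestCentroidClassifier:
    """Layer 2 (supervised): learn from anomalies an analyst already labeled."""

    def fit(self, points, labels) -> "NearestCentroidClassifier":
        grouped = {}
        for point, label in zip(points, labels):
            grouped.setdefault(label, []).append(point)
        self.centroids = {
            label: tuple(mean(col) for col in zip(*rows))
            for label, rows in grouped.items()
        }
        return self

    def predict(self, point) -> str:
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(point, centroid))

        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def triage(flow: Flow, detector, classifier) -> str:
    """Return 'normal' for unflagged flows, else the classifier's label."""
    if not detector.is_anomalous(flow):
        return "normal"
    return classifier.predict(detector.normalize(flow))


# Made-up baseline traffic used to fit the detector.
baseline = [(1000, 1, 10), (1200, 2, 12), (900, 1, 9),
            (1100, 1, 11), (1050, 2, 10), (950, 1, 8)]
# Made-up anomalies with analyst labels (backup jobs vs. port scans).
labeled = [((50000, 1, 300), "benign"), ((60000, 2, 350), "benign"),
           ((500, 200, 2), "threat"), ((400, 300, 1), "threat")]

detector = RobustZScoreDetector().fit(baseline)
classifier = NearestCentroidClassifier().fit(
    [detector.normalize(flow) for flow, _ in labeled],
    [label for _, label in labeled],
)

for flow in [(1010, 1, 10), (450, 250, 1), (55000, 1, 320)]:
    print(flow, triage(flow, detector, classifier))
```

With this toy data, the first flow stays below the threshold, the scan-like flow is labeled a threat, and the backup-like flow is labeled benign, so only the second would reach an analyst. The design point is that the supervised layer only sees flows the first layer has already flagged, which mirrors the abstract's goal of reducing manual review of benign anomalies.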
id RCAP_a4cc2ee48b066989945da55b9ecac3c8
oai_identifier_str oai:ria.ua.pt:10773/41059
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Enhancing SOC threat detection and classification with ML
author Pereira, Guilherme Amaral Ribeiro
author_facet Pereira, Guilherme Amaral Ribeiro
author_role author
dc.contributor.author.fl_str_mv Pereira, Guilherme Amaral Ribeiro
dc.subject.por.fl_str_mv Security Operation Centers
Security information and event management
Intrusion detection systems
Machine learning
Threat detection
Anomaly detection
Cyber security
publishDate 2023
dc.date.none.fl_str_mv 2023-07-03T00:00:00Z
2023-07-03
2026-01-14T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10773/41059
url http://hdl.handle.net/10773/41059
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/embargoedAccess
eu_rights_str_mv embargoedAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799138193931501568