Malware detection in macOS using supervised learning

BURGARDT, Caio Augusto Pereira

Bibliographic details
Main author:	BURGARDT, Caio Augusto Pereira
Publication date:	2022
Document type:	Dissertation
Language:	eng
Source title:	Repositório Institucional da UFPE
dARK ID:	ark:/64986/001300000w4mr
Full text:	https://repositorio.ufpe.br/handle/123456789/46235
Abstract:	The development of macOS malware has grown significantly in recent years. With the emergence of dangerous new malware families for macOS, attackers have become more sophisticated and more targeted. However, because malware detection depends heavily on the platform, solutions previously proposed for other operating systems cannot be used directly on macOS. Malware detection is one of the main pillars of endpoint security. Unfortunately, there are very few works on macOS endpoint security, which is considered largely unexplored territory. Currently, the only malware detection mechanism in macOS is XProtect, a signature-based system with fewer than 200 rules as of 2021. Recent works that attempted to improve malware detection on macOS have methodological limitations, such as the lack of a large macOS malware dataset and issues that arise with imbalanced datasets. In this work, we bring the malware detection problem to the macOS operating system and evaluate how supervised machine learning algorithms can be used to improve endpoint security in the macOS ecosystem. We create a new and larger dataset of 631 malware samples and 10,141 benign software samples, using public sources and extracting information from the Mach-O format. We evaluate the performance of seven different machine learning algorithms, two sampling strategies, and four feature reduction techniques for detecting macOS malware. As a result, we present models that outperform the native macOS protections, with detection rates above 90% while maintaining a false alarm rate below 1%. The presented models demonstrate that macOS security can be improved by using static characteristics of native executables in combination with common machine learning algorithms.
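The abstract reports results as a detection rate and a false alarm rate rather than plain accuracy, which matters because the dataset is heavily imbalanced. The Python sketch below illustrates why. Only the class sizes (631 malware, 10,141 benign) come from the abstract; the confusion-matrix counts in `hypothetical_model` are invented for illustration and are not results from the dissertation, which would in any case be measured on a held-out split rather than on the full dataset.

```python
from dataclasses import dataclass

# Class sizes taken from the abstract.
N_MALWARE = 631
N_BENIGN = 10_141


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts, with malware as the positive class."""
    tp: int  # malware flagged as malware
    fn: int  # malware missed
    fp: int  # benign flagged as malware (false alarm)
    tn: int  # benign passed as benign

    def detection_rate(self) -> float:
        # True positive rate: share of malware that is caught.
        return self.tp / (self.tp + self.fn)

    def false_alarm_rate(self) -> float:
        # False positive rate: share of benign software that is flagged.
        return self.fp / (self.fp + self.tn)

    def accuracy(self) -> float:
        total = self.tp + self.fn + self.fp + self.tn
        return (self.tp + self.tn) / total


# A useless classifier that labels every sample benign.
always_benign = Confusion(tp=0, fn=N_MALWARE, fp=0, tn=N_BENIGN)

# Hypothetical counts (assumption, not from the source) that would
# satisfy both thresholds named in the abstract.
hypothetical_model = Confusion(tp=580, fn=51, fp=90, tn=10_051)

if __name__ == "__main__":
    for name, cm in [("always benign", always_benign),
                     ("hypothetical", hypothetical_model)]:
        print(f"{name}: accuracy={cm.accuracy():.4f} "
              f"detection={cm.detection_rate():.4f} "
              f"false_alarm={cm.false_alarm_rate():.4f}")
```

The always-benign classifier scores above 94% accuracy while detecting nothing, so accuracy alone says little on data with this class ratio; the two rates used in the abstract expose the difference.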

Item metadata

id	UFPE_d7cb35bf71e5f26dd01e117b14a2124b
oai_identifier_str	oai:repositorio.ufpe.br:123456789/46235
network_acronym_str	UFPE
network_name_str	Repositório Institucional da UFPE
repository_id_str	2221
dc.title.pt_BR.fl_str_mv	Malware detection in macOS using supervised learning
author	BURGARDT, Caio Augusto Pereira
author_facet	BURGARDT, Caio Augusto Pereira
author_role	author
dc.contributor.authorLattes.pt_BR.fl_str_mv	http://lattes.cnpq.br/0812104184657634
dc.contributor.advisorLattes.pt_BR.fl_str_mv	http://lattes.cnpq.br/9838400375894439
dc.contributor.author.fl_str_mv	BURGARDT, Caio Augusto Pereira
dc.contributor.advisor1.fl_str_mv	CAMPELO, Divanilson Rodrigo de Sousa
contributor_str_mv	CAMPELO, Divanilson Rodrigo de Sousa
dc.subject.por.fl_str_mv	Redes de Computadores (Computer Networks); Aprendizagem de máquina (Machine Learning)
topic	Redes de Computadores (Computer Networks); Aprendizagem de máquina (Machine Learning)
publishDate	2022
dc.date.accessioned.fl_str_mv	2022-09-08T12:07:45Z
dc.date.available.fl_str_mv	2022-09-08T12:07:45Z
dc.date.issued.fl_str_mv	2022-02-25
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	BURGARDT, Caio Augusto Pereira. Malware detection in macOS using supervised learning. 2022. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2022.
dc.identifier.uri.fl_str_mv	https://repositorio.ufpe.br/handle/123456789/46235
dc.identifier.dark.fl_str_mv	ark:/64986/001300000w4mr
identifier_str_mv	BURGARDT, Caio Augusto Pereira. Malware detection in macOS using supervised learning. 2022. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2022. ark:/64986/001300000w4mr
url	https://repositorio.ufpe.br/handle/123456789/46235
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	http://creativecommons.org/licenses/by-nc-nd/3.0/br/ info:eu-repo/semantics/openAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	Universidade Federal de Pernambuco
dc.publisher.program.fl_str_mv	Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv	UFPE
dc.publisher.country.fl_str_mv	Brasil
publisher.none.fl_str_mv	Universidade Federal de Pernambuco
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFPE instname:Universidade Federal de Pernambuco (UFPE) instacron:UFPE
instname_str	Universidade Federal de Pernambuco (UFPE)
instacron_str	UFPE
institution	UFPE
reponame_str	Repositório Institucional da UFPE
collection	Repositório Institucional da UFPE
bitstream.url.fl_str_mv	https://repositorio.ufpe.br/bitstream/123456789/46235/1/DISSERTA%c3%87%c3%83O%20Caio%20Augusto%20Pereira%20Burgardt.pdf https://repositorio.ufpe.br/bitstream/123456789/46235/3/license.txt https://repositorio.ufpe.br/bitstream/123456789/46235/2/license_rdf https://repositorio.ufpe.br/bitstream/123456789/46235/4/DISSERTA%c3%87%c3%83O%20Caio%20Augusto%20Pereira%20Burgardt.pdf.txt https://repositorio.ufpe.br/bitstream/123456789/46235/5/DISSERTA%c3%87%c3%83O%20Caio%20Augusto%20Pereira%20Burgardt.pdf.jpg
bitstream.checksum.fl_str_mv	85e278dcc9f10e9021c4e8458beb703c 6928b9260b07fb2755249a5ca9903395 e39d27027a6cc9cb039ad269a5db8e34 0dc28a3ef74fa0deababd8aa40a3a7d0 b33b6f2d3e1a1781bb0d295b5262eb20
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5 MD5
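The three bitstream fields above list five URLs, five checksums, and five algorithm names that appear to be in matching order. A minimal Python sketch of how a downloaded file could be checked against its listed MD5 value follows. The URL and checksum strings are copied from this record; the helper names (`md5_hex`, `expected_md5_by_url`, `matches`) are hypothetical, and the one-to-one ordering of URLs and checksums is an assumption. No download is performed here.

```python
import hashlib
import io
from typing import BinaryIO

# Copied from the bitstream.url.fl_str_mv field of this record.
BITSTREAM_URLS = """
https://repositorio.ufpe.br/bitstream/123456789/46235/1/DISSERTA%c3%87%c3%83O%20Caio%20Augusto%20Pereira%20Burgardt.pdf
https://repositorio.ufpe.br/bitstream/123456789/46235/3/license.txt
https://repositorio.ufpe.br/bitstream/123456789/46235/2/license_rdf
https://repositorio.ufpe.br/bitstream/123456789/46235/4/DISSERTA%c3%87%c3%83O%20Caio%20Augusto%20Pereira%20Burgardt.pdf.txt
https://repositorio.ufpe.br/bitstream/123456789/46235/5/DISSERTA%c3%87%c3%83O%20Caio%20Augusto%20Pereira%20Burgardt.pdf.jpg
""".split()

# Copied from the bitstream.checksum.fl_str_mv field of this record.
BITSTREAM_MD5 = (
    "85e278dcc9f10e9021c4e8458beb703c 6928b9260b07fb2755249a5ca9903395 "
    "e39d27027a6cc9cb039ad269a5db8e34 0dc28a3ef74fa0deababd8aa40a3a7d0 "
    "b33b6f2d3e1a1781bb0d295b5262eb20"
).split()


def md5_hex(stream: BinaryIO, chunk_size: int = 1 << 16) -> str:
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def expected_md5_by_url() -> dict:
    """Pair each URL with a checksum, assuming the fields share one order."""
    if len(BITSTREAM_URLS) != len(BITSTREAM_MD5):
        raise ValueError("URL and checksum lists differ in length")
    return dict(zip(BITSTREAM_URLS, BITSTREAM_MD5))


def matches(stream: BinaryIO, expected: str) -> bool:
    """True if the stream's MD5 digest equals the expected hex string."""
    return md5_hex(stream) == expected.lower()


if __name__ == "__main__":
    # Stand-in bytes instead of a real download.
    sample = io.BytesIO(b"hello")
    print(md5_hex(sample))
```

For a real check, open the downloaded file with `open(path, "rb")` and pass it to `matches` together with the value from `expected_md5_by_url()`. MD5 is adequate here for spotting corrupted or truncated downloads; it is not collision-resistant, so it does not prove a file is authentic.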
repository.name.fl_str_mv	Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv	attena@ufpe.br
_version_	1815172931079110656
