May 23rd, 2019 at MESH
Time: 10:00 Length: 45 minutes
Fully Automated Luxury Malware Detection: the good, the bad and the ugly of malware detection with machine learning.
The old days of examining samples, creating signatures and deploying rules-based detection engines will soon be relegated to the dustbin of history as we welcome our new machine learning security overlords. Or will they? There is no shortage of articles about how AI (the marketing speak for machine learning technologies) will revolutionize everything from beer brewing to the way we consume cat videos and secure our computer systems. Yet the reality of doing malware detection and analysis with machine learning models is far from a fully automated luxury security utopia.
This talk will give an introduction to the field of machine learning from the perspective of malware analysis and detection. We will examine previous attempts to build and train classifiers that can label previously unseen malware samples, the lessons learnt and the hurdles that are currently preventing the field from progressing. As a closer case study, we will examine Endgame’s open-source EMBER dataset, a collection of data about 1.1 million malicious and benign Windows PE files. We will learn about the kinds of features one can compute from PE files, how these features can be used to train classifiers that predict the maliciousness of previously unseen samples, and what this all means for a potential future of Fully Automated Luxury Malware Detection.
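To give a flavour of what "computing features from PE files" can mean, here is a minimal sketch of one simple feature family: byte-level statistics over the raw file contents. It is an illustration only, not EMBER's feature extractor; the function names (`byte_histogram`, `shannon_entropy`, `feature_vector`) and the choice of appending entropy and file size are assumptions made for this example.

```python
import math
from collections import Counter
from typing import List


def byte_histogram(data: bytes) -> List[float]:
    """Return the normalized frequency of each of the 256 byte values."""
    counts = Counter(data)
    total = len(data) or 1  # avoid division by zero for empty input
    return [counts.get(b, 0) / total for b in range(256)]


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of the byte distribution, in bits per byte."""
    entropy = 0.0
    for p in byte_histogram(data):
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def feature_vector(data: bytes) -> List[float]:
    """Hypothetical fixed-length vector: 256 histogram bins, entropy, size."""
    return byte_histogram(data) + [shannon_entropy(data), float(len(data))]


if __name__ == "__main__":
    # Stand-in bytes; in practice this would be the contents of a PE file.
    sample = b"MZ" + bytes(range(256)) * 4
    vec = feature_vector(sample)
    print(len(vec), round(vec[256], 3))
```

Because every file maps to a vector of the same length, vectors like this can be stacked into a matrix and passed, together with malicious/benign labels, to an off-the-shelf classifier.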
By day, Camilla works as a software engineer in the Machine Learning group at Elastic. By night, she pokes at curious-looking Mirai samples and polishes her reverse engineering skills.