Human-Readable Real-Time Classifications of Malicious Executables

2012 
Shafiq et al. (2009a) propose a non–signature-based technique for detecting malware which applies data mining techniques to features extracted from executable files. Their technique has a high level of accuracy, a low false positive rate, and a speed on par with commercial anti-virus products. One portion of their technique uses a multi-layer perceptron as a classifier, which provides little insight into the reasons for classification. Our experience is that network security analysts prefer tools which provide human-comprehensible reasons for a classification, rather than operating as “black boxes”. We therefore build on the results of Shafiq et al. by demonstrating a technique which uses decision trees to distinguish packed from non-packed files, producing a classification diagram which can be understood by analysts. We show that the resulting detector still provides high accuracy and classifies files rapidly.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    20
    References
    3
    Citations
    NaN
    KQI
    []