Enhancing Robustness of Malware Detection using Synthetically-adversarial Samples

2020 
Malware detection is a critical task in cybersecurity, protecting computers and networks from the malicious activities of malicious software. With the emergence of machine learning, and especially deep learning, many malware detection models (malware classifiers) have been developed to learn features of malware samples collected through static or dynamic analysis. However, the performance of these classifiers (e.g., detection accuracy) deteriorates over time as the distribution of malware samples changes. Leveraging the positive aspects of adversarial samples, we aim to enhance the robustness of malware classifiers using synthetically-adversarial samples. We develop Generative Adversarial Networks (GANs) that learn to generate not only malicious samples but also benign samples to enrich the training set of a baseline malware classifier. We improve the performance of the developed GANs by incorporating a relativistic discriminator and the cosine margin loss function so that quasi-realistic samples can be generated. We carry out extensive experiments with publicly available malware samples to evaluate the performance of the proposed approach. The experimental results show that, without synthetic samples in the training set, the detection accuracy of the baseline classifier drops by up to 18.20% when it is evaluated against a test set that includes synthetic samples. Introducing synthetic samples into the training set and retraining the classifier not only compensates for this drop but also improves detection accuracy further, by up to 4.15%.
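The abstract names a relativistic discriminator but does not give its formulation. As a minimal sketch, assuming the standard relativistic GAN loss (in which the discriminator scores how much more realistic a real sample is than a generated one), the two losses could be computed as follows. The function names and the pairing of real and generated scores are illustrative assumptions, not the authors' implementation, and the cosine margin loss is not covered here.

```python
import math


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def relativistic_d_loss(real_scores, fake_scores):
    """Discriminator loss: mean of -log(sigmoid(C(x_real) - C(x_fake))).

    real_scores and fake_scores are raw (pre-sigmoid) critic outputs,
    paired element by element (an assumption of this sketch).
    """
    pairs = list(zip(real_scores, fake_scores))
    # -log(sigmoid(d)) equals softplus(-d)
    return sum(softplus(-(r - f)) for r, f in pairs) / len(pairs)


def relativistic_g_loss(real_scores, fake_scores):
    """Generator loss: mean of -log(sigmoid(C(x_fake) - C(x_real)))."""
    pairs = list(zip(real_scores, fake_scores))
    return sum(softplus(-(f - r)) for r, f in pairs) / len(pairs)


if __name__ == "__main__":
    real = [2.0, 1.5, 3.0]   # hypothetical critic scores for real samples
    fake = [-1.0, 0.5, 0.0]  # hypothetical critic scores for generated samples
    print("D loss:", relativistic_d_loss(real, fake))
    print("G loss:", relativistic_g_loss(real, fake))
```

When the critic ranks real samples well above generated ones, the discriminator loss is small and the generator loss is large; when the two are indistinguishable (equal scores), both losses equal log 2.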