MalDAE: Detecting and explaining malware based on correlation and fusion of static and dynamic characteristics

2019 
Abstract Analyzing behavioral characteristics derived from API call sequences is a widespread way to detect malware. However, previous studies usually focus on either the static or the dynamic API call sequence alone and neglect the correlation between them. Our experimental results show that there is an underlying relation between the dynamic and static API call sequences of malware, which can be described as “the syntax is different, but the semantics is similar”. Based on this discovery, this paper makes a first attempt to explore the differences and relations between the static and dynamic API sequences of malicious programs. We correlate and fuse the dynamic and static API sequences into one hybrid sequence based on semantic mapping and then construct a hybrid feature vector space. Furthermore, we mine and define the malicious behavior types of the programs and provide explainable results for malware detection. Our study addresses a shortcoming of previous approaches, which usually focus on detection but neglect explanation. By correlating and fusing the static and dynamic API sequences, we establish an explainable malware detection framework called MalDAE. The evaluation results show that the detection and classification accuracy of MalDAE reach up to 97.89% and 94.39%, respectively, outperforming previous similar studies in a comprehensive comparison. In addition, MalDAE gives an understandable explanation for common types of malware and provides predictive support for understanding and resisting malware.
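The idea of "different syntax, similar semantics" can be illustrated with a minimal sketch. It is not the MalDAE implementation: the mapping table, the category names, and the helpers `to_semantics`, `fuse`, and `to_vector` are all hypothetical, and simple concatenation plus counting stands in for whatever fusion and feature construction the paper actually uses.

```python
from collections import Counter

# Hypothetical semantic mapping (an assumption for illustration only):
# static import names and dynamic trace names that differ syntactically
# are mapped to the same behavior category.
SEMANTIC_MAP = {
    "CreateFileW": "file_create",         # e.g. seen in the import table
    "NtCreateFile": "file_create",        # e.g. seen in a runtime trace
    "RegSetValueExA": "registry_write",
    "NtSetValueKey": "registry_write",
    "connect": "network_connect",
}

# Fixed ordering of categories so every sample gets a comparable vector.
VOCABULARY = sorted(set(SEMANTIC_MAP.values()))


def to_semantics(api_sequence):
    """Map each API name to its semantic category; drop unmapped names."""
    return [SEMANTIC_MAP[api] for api in api_sequence if api in SEMANTIC_MAP]


def fuse(static_seq, dynamic_seq):
    """Fuse both sequences into one hybrid sequence of semantic tokens.

    Assumption: fusion is plain concatenation of the mapped sequences.
    """
    return to_semantics(static_seq) + to_semantics(dynamic_seq)


def to_vector(hybrid_seq):
    """Count how often each category occurs, in VOCABULARY order."""
    counts = Counter(hybrid_seq)
    return [counts[category] for category in VOCABULARY]


static_calls = ["CreateFileW", "RegSetValueExA"]
dynamic_calls = ["NtCreateFile", "NtSetValueKey", "connect"]

hybrid = fuse(static_calls, dynamic_calls)
features = to_vector(hybrid)
```

Here `CreateFileW` and `NtCreateFile` land in the same feature dimension even though their names differ, which is the kind of correlation the abstract describes. A vector like `features` could then be passed to any standard classifier.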