Applying Clustering Techniques for Refining Large Data Set: Case Study on Malware

2019 
Malware databases have been unintentionally collecting garbage (incomplete malware) together with malware through the Internet. This paper focuses on finding garbage (incomplete malware) from large malware datasets using binary pattern matching and speed up the matching by using nested clustering as a preprocessing. To verify the effectiveness of our method, we conduct experiments on various malware datasets. The results show that our method works efficiently while maintaining high accuracy.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    1
    References
    1
    Citations
    NaN
    KQI
    []