Large-scale Malware Automatic Detection Based On Multiclass Features and Machine Learning

2018 
In the modern global mobile phone market, Android holds first place in both user base and share of handsets. With the proliferation of Android smartphones and other mobile devices, it has also attracted growing attention from malware developers, and a large amount of malware is present in several major application markets around the world. To address this problem, this paper proposes a large-scale malware detection system based on multiclass features and machine learning. Traditional detection schemes face two main problems: how to analyze and effectively extract features that distinguish malware from benign software, and how to select the most suitable algorithm for detecting malware. For the first problem, we extract Android features and remove those with a low degree of distinction, keeping the features that reflect an application's maliciousness. For the second problem, we compare seven machine learning algorithms and select the one with the highest accuracy for Android malware identification. We then run experiments to verify our solutions. First, we extract 234 features and select 76 of them. Second, by comparing the detection accuracies of the seven algorithms, we select ensemble learning as the most suitable, then adjust and optimize the related parameters to reach the highest accuracy, 99.73%, which demonstrates the effectiveness of our system and scheme.
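The feature selection step described in the abstract (dropping features with a low degree of distinction) can be illustrated with a minimal sketch. The abstract does not say how distinction is measured, so the score used here, the absolute difference between how often a binary feature appears in malware and in benign samples, is an assumption. The function names, the 0.3 threshold, and the four-sample toy dataset are likewise hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def distinction_scores(samples: Sequence[Sequence[int]],
                       labels: Sequence[int]) -> List[float]:
    """Return |P(feature=1 | malware) - P(feature=1 | benign)| per feature.

    Assumed metric: the paper does not state its distinction measure.
    Labels: 1 = malware, 0 = benign. Features are binary (0 or 1).
    """
    malware = [s for s, y in zip(samples, labels) if y == 1]
    benign = [s for s, y in zip(samples, labels) if y == 0]
    if not malware or not benign:
        raise ValueError("need at least one sample of each class")

    scores = []
    for j in range(len(samples[0])):
        p_malware = sum(s[j] for s in malware) / len(malware)
        p_benign = sum(s[j] for s in benign) / len(benign)
        scores.append(abs(p_malware - p_benign))
    return scores


def select_features(names: Sequence[str],
                    samples: Sequence[Sequence[int]],
                    labels: Sequence[int],
                    threshold: float = 0.3) -> List[str]:
    """Keep the features whose distinction score reaches the threshold.

    The threshold value is a hypothetical choice for illustration.
    """
    scores = distinction_scores(samples, labels)
    return [name for name, score in zip(names, scores) if score >= threshold]


# Toy data for illustration only: each column marks whether an app
# requests the named Android permission.
FEATURE_NAMES = ["SEND_SMS", "INTERNET", "READ_CONTACTS", "VIBRATE"]
SAMPLES = [
    [1, 1, 1, 0],  # malware
    [1, 1, 0, 1],  # malware
    [0, 1, 0, 1],  # benign
    [0, 1, 0, 0],  # benign
]
LABELS = [1, 1, 0, 0]

if __name__ == "__main__":
    print(select_features(FEATURE_NAMES, SAMPLES, LABELS))
```

On this toy data, INTERNET is requested by every app and VIBRATE is equally common in both classes, so both are dropped, while SEND_SMS and READ_CONTACTS are kept. The reduction from 234 to 76 features reported in the abstract would correspond to applying a filter of this kind to the full feature set, though with whatever criterion the authors actually used.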