An Unknown Attack Detection Scheme Based on Semi-supervised Learning and Information Gain Ratio

2019 
State-of-the-art intrusion detection schemes employ machine learning techniques to identify unknown attacks from features of network traffic data. However, because labelled training data are scarce and features are difficult to select quantitatively and adaptively, existing schemes cannot detect unknown attacks effectively. To address this issue, this paper first proposes an improved k-means driven semi-supervised learning algorithm that accurately enlarges the training set for the detection model from a small labelled dataset. Furthermore, an information gain ratio aware random forest is used to determine the impact of different features and their weights in the vote that identifies unknown attacks. This approach not only retains as much of the feature information as possible, but also adjusts the feature weights adaptively against dynamic attacks. Extensive experiments indicate that the scheme detects unknown attacks effectively, with more than 91% accuracy and less than 5% false negative rate over three real-world datasets. Compared with existing schemes, the accuracy is increased by at least 15.85%, while the false negative rate is decreased by more than 51.98%.
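As an illustration of the feature-weighting idea, the sketch below computes the standard information gain ratio (information gain divided by split information, as used in C4.5) for a discrete feature. It is only a minimal example under stated assumptions: the abstract does not give the paper's exact formulation, and the function names `entropy` and `gain_ratio` as well as the toy traffic records are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Standard gain ratio of a discrete feature with respect to class labels.

    Assumption: this is the textbook definition, not necessarily the exact
    weighting used in the paper.
    """
    n = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes features with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records: protocol type versus traffic label.
protocol = ["tcp", "tcp", "udp", "udp"]
label = ["attack", "attack", "normal", "normal"]
ratio = gain_ratio(protocol, label)
```

A feature with a higher gain ratio would receive a larger weight in the vote, while a feature that takes a single constant value carries no split information and is scored as zero here.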