A clustering algorithm for binary protocol data frames based on principal component analysis and density peaks clustering

2017 
Binary protocols lack session flow characteristics, and extracting their frequent patterns is difficult. To identify binary protocol data frames, an unsupervised clustering algorithm based on improved principal component analysis (PCA) and density peaks clustering (DPC) is proposed. We improve PCA by determining its dimensionality based on information gain, so that it removes redundant information while retaining the characteristics of the original data. We improve DPC with distance index weighting, so that it selects cluster centers automatically and sharpens the distinction between cluster centers and other data frames. Experimental results show that the proposed algorithm clusters binary protocol data frames effectively.
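To make the DPC step concrete, the following is a minimal sketch of plain (unimproved) density peaks clustering applied to raw frames. It is not the paper's method: the abstract does not specify the distance index weighting or the information-gain-based PCA, so the PCA stage is omitted, the number of clusters is passed in by hand rather than selected automatically, and centers are ranked by the standard product of local density and separation distance. The names `hamming`, `density_peaks`, `d_c`, and `n_clusters`, the Hamming distance, and the Gaussian kernel are all assumptions made for illustration.

```python
import math


def hamming(a: bytes, b: bytes) -> int:
    """Count differing byte positions (assumes equal-length frames)."""
    return sum(x != y for x, y in zip(a, b))


def density_peaks(frames, d_c, n_clusters):
    """Plain density peaks clustering over a list of byte frames.

    d_c is the cutoff distance of the Gaussian density kernel and
    n_clusters is chosen by the caller (both hypothetical parameters).
    Returns (labels, centers), where centers are frame indices.
    """
    n = len(frames)
    dist = [[hamming(frames[i], frames[j]) for j in range(n)] for i in range(n)]

    # Local density rho_i: Gaussian-weighted count of nearby frames.
    rho = [
        sum(math.exp(-((dist[i][j] / d_c) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # Visit frames from highest to lowest density.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n      # distance to the nearest denser frame
    parent = [-1] * n      # index of that nearest denser frame

    for rank, i in enumerate(order):
        if rank == 0:
            # Densest frame: by convention, use the largest distance overall.
            delta[i] = float(max(max(row) for row in dist))
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = float(dist[i][j])
        parent[i] = j

    # Cluster centers have both high density and high separation.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(order, key=lambda i: -gamma[i])[:n_clusters]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Every other frame inherits the label of its nearest denser neighbor.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


# Made-up frames from two hypothetical frame types.
example_frames = [
    bytes([0x01, 0x02, 0x00, 0x10]),
    bytes([0x01, 0x02, 0x00, 0x11]),
    bytes([0x01, 0x02, 0x01, 0x10]),
    bytes([0xAA, 0xBB, 0xFF, 0x7E]),
    bytes([0xAA, 0xBB, 0xFE, 0x7E]),
    bytes([0xAA, 0xBB, 0xFF, 0x7F]),
]
example_labels, example_centers = density_peaks(example_frames, d_c=2, n_clusters=2)
print(example_labels, example_centers)
```

In this sketch the frames that share a header end up in the same cluster. A weighting scheme such as the one the abstract describes would presumably replace the plain product `rho[i] * delta[i]` so that centers stand out clearly enough to be picked without a user-supplied `n_clusters`.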