Graph Based Kernel k-Means Using Representative Data Points as Initial Centers

2015 
The k-means algorithm is undoubtedly the most widely used data clustering algorithm due to its relative simplicity. It can only handle data that are linearly separable. A generalization of k-means is kernel k-means, which can handle data that are not linearly separable. Standard k-means and kernel k-means have the same disadvantage of being sensitive to the initial placement of the cluster centers. A novel kernel k-means algorithm is proposed in the paper. The proposed algorithm uses a graph based kernel matrix and finds k data points as initial centers for kernel k-means. Since finding the optimal data points as initial centers is an NP-hard problem, this problem is relaxed to obtain k representative data points as initial centers. Matching pursuit algorithm for multiple vectors is used to greedily find k representative data points. The proposed algorithm is tested on synthetic and real-world datasets and compared with kernel k-means algorithms using other initialization techniques. Our empirical study shows encouraging results of the proposed algorithm.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    15
    References
    0
    Citations
    NaN
    KQI
    []