Robust deep k-means: An effective and simple method for data clustering

2021 
Abstract Clustering aims to partition an input dataset into distinct groups according to some distance or similarity measurements. One of the most widely used clustering method nowadays is the k -means algorithm because of its simplicity and efficiency. In the last few decades, k -means and its various extensions have been formulated to solve the practical clustering problems. However, existing clustering methods are often presented in a single-layer formulation (i.e., shallow formulation). As a result, the mapping between the obtained low-level representation and the original input data may contain rather complex hierarchical information. To overcome the drawbacks of low-level features, deep learning techniques are adopted to extract deep representations and improve the clustering performance. In this paper, we propose a robust deep k -means model to learn the hidden representations associate with different implicit lower-level attributes. By using the deep structure to hierarchically perform k -means, the hierarchical semantics of data can be exploited in a layerwise way. Data samples from the same class are forced to be closer layer by layer, which is beneficial for clustering task. The objective function of our model is derived to a more trackable form such that the optimization problem can be tackled more easily and the final robust results can be obtained. Experimental results over 12 benchmark data sets substantiate that the proposed model achieves a breakthrough in clustering performance, compared with both classical and state-of-the-art methods.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    40
    References
    7
    Citations
    NaN
    KQI
    []