Large-Scale Data Clustering Using Manifold-Regularized Ensemble of Posterior in GAN

2021 
Data clustering is an unsupervised learning method and a pivotal technique for statistical data analysis. Grouping data samples is a challenging machine learning task, especially in large databases. Deep neural networks scale to large data sets and can learn the structure of the data by modeling its nonlinearity. One of the best-known latent generative models in this area is the generative adversarial network (GAN). Clustering with a latent generative model requires the posterior of that model, which in turn requires a variational approximation. This approximation can be obtained by maximizing mutual information or by minimizing the KL divergence. In this paper, to reach a more generalized inference in clustering, we use an ensemble approach to approximate the posterior. To implement this ensemble with deep networks, we propose a convex lower bound for the variational approximation of the posteriors. To correct the generator's behavior, we inject the geometrical structure of the data into the objective function as manifold regularization, which yields more accurate statistical inference. The proposed method is evaluated on four benchmark data sets. The experimental results confirm that our model outperforms standard clustering algorithms and some recently developed deep methods.
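As an illustration of what "manifold regularization" typically means, the following is a minimal sketch of the standard graph-Laplacian smoothness penalty applied to soft cluster assignments. It is not the paper's objective: the abstract does not give the exact formulation, so the k-nearest-neighbour graph, the unnormalized Laplacian, and the function names `knn_affinity` and `manifold_penalty` are all assumptions made for this example.

```python
import numpy as np


def knn_affinity(X, k=5):
    """Symmetric 0/1 k-nearest-neighbour affinity matrix without self-loops.

    Assumption: a plain Euclidean kNN graph stands in for the data manifold.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances between all pairs of samples.
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # exclude each point from its own neighbours
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)  # symmetrize: keep an edge if either side has it


def manifold_penalty(P, W):
    """Laplacian penalty tr(P^T L P) with L = D - W.

    P holds one row of (soft) cluster probabilities per sample. For symmetric
    W the value equals 0.5 * sum_ij W_ij * ||P_i - P_j||^2, so it is small
    when neighbouring samples receive similar assignments.
    """
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(P.T @ L @ P))
```

In a training loop, a term like this would be added to the clustering objective with a weighting coefficient, so that posterior estimates vary smoothly along the neighbourhood graph. For large data sets the dense `n x n` matrices used here would need to be replaced by sparse or mini-batch variants.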