Ensemble Minimum Sum of Squared Similarities sampling for Nyström-based spectral clustering.

2016 
Spectral clustering is a powerful approach for clustering, with applications across multiple disciplines, including bioinformatics. However, the way its computational complexity scales limits its application to large datasets. This complexity can be reduced using the Nyström method, which subsamples the input data in a way that preserves its representational diversity. There are several established strategies for subsampling, yet they may have performance limitations for certain complex datasets. In this paper, we propose an alternative to those methods, introducing a new sampling procedure called Ensemble Minimum Sum of Squared Similarities (EMS3). We further improve on this method by using weight mixtures in subsample selection, yielding more accurate low-rank approximations than existing ensemble Nyström methods. We also provide a theoretical analysis of the upper error bound of the EMS3 algorithm, and demonstrate its performance in comparison to the leading spectral clustering methods that use Nyström sampling.
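As an illustration of the Nyström method referred to in the abstract, the following is a minimal sketch of a standard Nyström low-rank approximation of a similarity matrix. It is not the EMS3 procedure: landmarks are drawn uniformly at random, which is only a baseline, and the RBF similarity, the function names, and the parameters `gamma`, `m`, and `rcond` are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) similarity between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_approximation(X, m, gamma=1.0, seed=0, rcond=1e-10):
    """Approximate the n x n similarity matrix K from m landmark points.

    Landmarks are chosen uniformly at random here (an assumption of this
    sketch); the sampling strategy is exactly the part that methods such
    as EMS3 replace.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)   # landmark indices
    C = rbf_kernel(X, X[idx], gamma)             # n x m: all points vs landmarks
    W = C[idx]                                   # m x m: landmarks vs landmarks
    # Nystrom formula: K is approximated by C W^+ C^T.
    K_approx = C @ np.linalg.pinv(W, rcond=rcond) @ C.T
    return K_approx, idx

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, 2))
    K = rbf_kernel(X, X)
    K_approx, idx = nystrom_approximation(X, m=20)
    rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
    print(f"relative Frobenius error with 20 landmarks: {rel_err:.3e}")
```

Only the n x m block `C` needs to be computed from the data, which is where the savings over forming the full n x n matrix come from; the quality of the approximation then depends on which landmarks are selected.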