Parallel K-means Clustering Algorithm on NOWs

1999 
Despite its simplicity and its linear time, a serial K-means algorithm's time complexity remains expensive when it is applied to a problem of large size of multidimensional vectors. In this paper we show an improvement by a factor of O(K/2), where K is the number of desired clusters, by applying theories of parallel computing to the algorithm. In addition to time improvement, the parallel version of K-means algorithm also enables the algorithm to run on larger collective memory of multiple machines when the memory of a single machine is insufficient to solve a problem. We show that a problem size can be scaled up to O(K) times a problem size on a single machine.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    9
    References
    92
    Citations
    NaN
    KQI
    []