A covariance-free iterative algorithm for distributed principal component analysis on vertically partitioned data

2012 
In this paper, we develop a covariance-free iterative algorithm for distributed principal component analysis (PCA) on high-dimensional data sets that are vertically partitioned. We prove that the algorithm converges monotonically at an exponential rate. Unlike existing techniques, which aim to approximate the global PCA, our covariance-free iterative distributed PCA (CIDPCA) algorithm estimates the principal components directly, without computing the sample covariance matrix. This yields a significant reduction in transmission cost. Furthermore, compared with existing distributed PCA techniques, CIDPCA provides more accurate estimates of the principal components and more accurate classification results. We demonstrate the superior performance of CIDPCA in studies on multiple real-world data sets.
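
To illustrate what "covariance-free" can mean on vertically partitioned data, the following is a minimal sketch of a generic distributed power iteration. It is not the CIDPCA update itself, which the abstract does not spell out. It assumes that each site holds a block of feature columns for the same samples, that the data are already column-centered, and that a coordinator can sum vectors and broadcast the result. All function and variable names are hypothetical.

```python
import math
import random


def local_scores(block, w_block):
    """Site-local partial projection: block @ w_block (one value per sample)."""
    return [sum(x * w for x, w in zip(row, w_block)) for row in block]


def local_update(block, scores):
    """Site-local update: block.T @ scores (one value per local feature)."""
    n_features = len(block[0])
    return [sum(row[j] * s for row, s in zip(block, scores))
            for j in range(n_features)]


def normalise(w_blocks):
    """Scale the concatenated weight vector to unit length."""
    norm = math.sqrt(sum(v * v for wb in w_blocks for v in wb))
    if norm == 0.0:
        raise ValueError("zero vector: data may be all zeros")
    return [[v / norm for v in wb] for wb in w_blocks]


def distributed_power_iteration(blocks, n_iter=200, tol=1e-10, seed=0):
    """Estimate the leading principal direction of X = [X_1 | ... | X_k]
    without ever forming X^T X.

    blocks: one matrix (list of rows) per site; all sites share the same
    samples (rows) but hold different feature columns.
    Returns one weight sub-vector per site.
    """
    rng = random.Random(seed)
    w = normalise([[rng.gauss(0.0, 1.0) for _ in block[0]] for block in blocks])
    n_samples = len(blocks[0])
    for _ in range(n_iter):
        # Each site sends one vector of length n_samples; the coordinator sums them.
        partials = [local_scores(b, wb) for b, wb in zip(blocks, w)]
        scores = [sum(p[i] for p in partials) for i in range(n_samples)]
        # The coordinator broadcasts the scores; each site updates its own weights.
        new_w = normalise([local_update(b, scores) for b in blocks])
        change = max(abs(a - b)
                     for wa, wb in zip(new_w, w) for a, b in zip(wa, wb))
        w = new_w
        if change < tol:
            break
    return w


# Toy usage: rank-one data whose rows are multiples of (3, 4, 12).
# Site A holds the first two features, site B holds the third.
t_values = (1.0, -1.0, 2.0, -2.0)
site_a = [[3.0 * t, 4.0 * t] for t in t_values]
site_b = [[12.0 * t] for t in t_values]
direction = distributed_power_iteration([site_a, site_b])
print(direction)
```

In this sketch each site transmits only a vector with one entry per sample in each iteration, and no site ever sees another site's raw features or any block of the covariance matrix. For the toy data above, the concatenated result is proportional to (3, 4, 12), up to sign.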