Self-Paced Probabilistic Principal Component Analysis for Data with Outliers

2020 
Principal Component Analysis (PCA) is a popular tool for dimension reduction and feature extraction in data analysis. Probabilistic PCA (PPCA) extends standard PCA with a probabilistic model. However, neither standard PCA nor PPCA is robust, as both are sensitive to outliers. To alleviate this problem, we propose a novel method called Self-Paced Probabilistic Principal Component Analysis (SP-PPCA), which introduces the Self-Paced Learning mechanism into PPCA. We also design the corresponding optimization algorithm, based on an alternating search strategy and an expectation-maximization algorithm, so that SP-PPCA iteratively finds the optimal projection vectors and filters out outliers. Experiments on both synthetic and real data demonstrate that SP-PPCA is more robust than the baselines.
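
The alternating idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a hard self-paced rule (keep the samples whose loss falls below a quantile threshold that grows each round), and for brevity it swaps the expectation-maximization step for the closed-form maximum-likelihood PPCA solution. The function names, the quantile schedule, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_ppca(X, q):
    """Closed-form maximum-likelihood PPCA fit (used here in place of EM)."""
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)
    evals, evecs = np.linalg.eigh(S)
    evals, evecs = evals[::-1], evecs[:, ::-1]        # sort descending
    sigma2 = max(evals[q:].mean(), 1e-8)              # mean discarded variance
    W = evecs[:, :q] * np.sqrt(np.maximum(evals[:q] - sigma2, 0.0))
    return mu, W, sigma2

def neg_log_lik(X, mu, W, sigma2):
    """Per-sample negative log-likelihood under N(mu, W W^T + sigma2 I)."""
    d = X.shape[1]
    C = W @ W.T + sigma2 * np.eye(d)
    diff = X - mu
    _, logdet = np.linalg.slogdet(C)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(C), diff)
    return 0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def self_paced_ppca(X, q, keep_start=0.5, keep_step=0.1, keep_max=0.9, n_iter=10):
    """Alternate between fitting PPCA on selected samples and reselecting.

    Assumption: hard selection with a quantile-based threshold, which is
    one simple self-paced scheme and not necessarily the paper's regularizer.
    """
    selected = np.ones(X.shape[0], dtype=bool)        # start with all samples
    keep = keep_start
    for _ in range(n_iter):
        mu, W, sigma2 = fit_ppca(X[selected], q)      # model update
        loss = neg_log_lik(X, mu, W, sigma2)          # loss for every sample
        selected = loss <= np.quantile(loss, keep)    # keep the "easy" samples
        keep = min(keep + keep_step, keep_max)        # admit more samples
    return mu, W, sigma2, selected

# Synthetic demo: 200 inliers near a 1-D subspace of R^5, plus 20 gross outliers.
rng = np.random.default_rng(0)
true_dir = np.ones(5) / np.sqrt(5.0)
inliers = 3.0 * rng.normal(size=(200, 1)) * true_dir + 0.1 * rng.normal(size=(200, 5))
outliers = 15.0 * rng.normal(size=(20, 5))
X = np.vstack([inliers, outliers])

mu, W, sigma2, selected = self_paced_ppca(X, q=1)
est_dir = W[:, 0] / np.linalg.norm(W[:, 0])
print("outliers still selected:", int(selected[200:].sum()))
print("alignment with true direction:", abs(est_dir @ true_dir))
```

In this sketch the `keep_max` value caps the fraction of samples that can ever be selected, so it acts as a rough prior on the inlier proportion; a soft weighting scheme would avoid that hard cap at the cost of a slightly longer example.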