Multivariate Outlier Detection With ICS

2016 
The Invariant Coordinate Selection (ICS) method shows remarkable properties for revealing data structures such as outliers in multivariate data sets. Based on the joint spectral decomposition of two scatter matrices, it leads to a new affine invariant coordinate system. The Euclidean distance computed over all the invariant coordinates corresponds to a Mahalanobis distance in the original system. However, unlike the Mahalanobis distance, ICS makes it possible to select the relevant components for displaying potential outliers. Using asymptotic arguments, the present paper shows the performance of ICS when the number of variables is large and the outliers are contained in a low-dimensional subspace. Owing to the resulting dimension reduction, the method is expected to improve the power of outlier detection rules such as Mahalanobis distance-based criteria. It also greatly simplifies the interpretation of outliers. The paper includes practical guidelines for using ICS when the proportion of outliers is small. It addresses the choice of scatter matrices as well as the selection of relevant invariant components through parallel analysis and normality tests. An extensive simulation study provides a comparison with Principal Component Analysis and the Mahalanobis distance. The performance of our proposal is also evaluated on several real data sets using a user-friendly R package.
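To make the construction concrete, the following is a minimal NumPy sketch of the joint spectral decomposition described above. It is not the paper's R package. It assumes one particular pair of scatter matrices, the sample covariance and a fourth-moment scatter matrix (often written COV and COV4). The function name `ics_cov_cov4`, the simulated data, and the choice to keep only the first component are all illustrative assumptions. The sketch checks numerically that the squared Euclidean norm over all invariant coordinates equals the squared Mahalanobis distance.

```python
import numpy as np

def ics_cov_cov4(X):
    """Illustrative ICS with the (assumed) scatter pair COV and COV4."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)

    # First scatter: the sample covariance matrix.
    S1 = np.cov(Xc, rowvar=False)

    # Squared Mahalanobis distances with respect to S1.
    d2 = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(S1), Xc)

    # Second scatter: fourth-moment matrix, weighting each
    # observation by its squared Mahalanobis distance.
    S2 = (Xc * d2[:, None]).T @ Xc / (n * (p + 2))

    # Joint diagonalisation: whiten with S1^{-1/2}, then
    # eigendecompose the whitened second scatter.
    w, V = np.linalg.eigh(S1)
    S1_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    M = S1_inv_sqrt @ S2 @ S1_inv_sqrt
    rho, U = np.linalg.eigh((M + M.T) / 2)

    # Sort components by decreasing generalised eigenvalue.
    order = np.argsort(rho)[::-1]
    rho, U = rho[order], U[:, order]

    B = S1_inv_sqrt @ U   # satisfies B.T @ S1 @ B = I
    Z = Xc @ B            # invariant coordinates
    return rho, Z, d2

# Simulated example (assumption): 2% of points shifted along one axis.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 6))
X[:20, 0] += 10.0

rho, Z, d2 = ics_cov_cov4(X)

# Distance over all invariant coordinates: matches Mahalanobis.
full_dist2 = (Z ** 2).sum(axis=1)

# Distance restricted to a selected component (here the first one).
selected_dist2 = Z[:, 0] ** 2
```

In this sketch `full_dist2` reproduces the Mahalanobis distances, whereas `selected_dist2` uses a single invariant coordinate. How many components to retain is the selection question the paper addresses through parallel analysis and normality tests; the fixed choice of one component here is only a placeholder.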