Compositional Biplots: A Story of False Leads and Hidden Features Revealed by the Last Dimensions

2021 
Logratio principal component analysis is often one of the first steps in exploring a compositional data set. Compositional biplots based on the first two principal components are frequently used to uncover proportionality between parts or to detect one-dimensional patterns of variability for larger subcompositions. This article argues that this approach is likely to produce false leads and proposes an alternative procedure based on condition indices and low-variance principal components. We advocate the calculation of condition indices, combined with biplots of the last few principal components and lists of subcompositions with large condition numbers, and these are shown to be useful for detecting proportionality and one-dimensional relationships. The detection of such patterns in compositional data sets is shown to be closely related to the analysis of multicollinearity as employed in linear regression. Two example data sets, amino acid compositions in calves and chemical components of coffee aroma, are used as illustrations.
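The idea of reading proportionality off the last principal components can be sketched on synthetic data. The following is a minimal illustration, not the article's procedure: it assumes a centered logratio (clr) transform, takes condition indices to be the ratio of the largest singular value to each singular value (as in regression collinearity diagnostics; the article's exact definition may differ), and uses hypothetical helper names (`clr`, `logratio_pca`) and a simulated five-part composition in which parts 0 and 1 are nearly proportional.

```python
import numpy as np


def clr(X):
    """Centered logratio transform of a matrix of strictly positive compositions (rows)."""
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)


def logratio_pca(X):
    """Logratio PCA via SVD of the column-centered clr data (hypothetical helper).

    Returns the singular values, the principal axes (one per row), and
    condition indices, here assumed to be s_max / s_j.
    """
    Z = clr(X)
    Zc = Z - Z.mean(axis=0)
    _, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    # clr data have rank at most D - 1, so the final singular value is
    # structurally zero and is dropped before forming the indices.
    k = X.shape[1] - 1
    s, Vt = s[:k], Vt[:k]
    cond = s[0] / s
    return s, Vt, cond


# Simulated data (an assumption for illustration only): five parts,
# with part 1 nearly proportional to part 0.
rng = np.random.default_rng(0)
n = 200
base = rng.lognormal(sigma=1.0, size=(n, 5))
base[:, 1] = 2.0 * base[:, 0] * np.exp(rng.normal(scale=0.01, size=n))
X = base / base.sum(axis=1, keepdims=True)  # close each row to sum 1

s, Vt, cond = logratio_pca(X)
last = Vt[-1]  # loadings of the lowest-variance principal component

print("condition indices:", np.round(cond, 1))
print("last PC loadings: ", np.round(last, 3))
```

In this simulation the lowest-variance component should load almost entirely on parts 0 and 1 with opposite signs, which corresponds to a nearly constant logratio between them, and its condition index should be much larger than the others. A biplot of the first two components would not necessarily show this pair as proportional.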