Accelerating kernel principal component analysis (KPCA) by utilizing two‐dimensional wavelet compression: applications to spectroscopic imaging

2008 
Principal component analysis (PCA) is a standard tool for analyzing spectroscopic data. However, the number of spectroscopic signatures that PCA can discriminate is at most the number of variables or the number of samples, whichever is smaller. Furthermore, linear algorithms are not well suited to modeling nonlinear relationships present in the data. To overcome the limitations that linear algorithms face when applied to nonlinear data, kernel principal component analysis (KPCA) was developed. Unlike PCA, KPCA can extract a number of principal components (PCs) that exceeds the number of variables, provided the number of samples is greater. Because spectroscopic imagers acquire up to tens of thousands of spectra, KPCA computations often require multiple gigabytes of RAM just to hold the data. This prohibits the routine application of KPCA to spectroscopic imaging, especially if calculations are run on personal computers. To avoid such situations, a wavelet compression algorithm is presented that never has to hold all data in memory. The main goal here is to enable the application of KPCA, including mean centering, to large datasets. For this purpose, a mean-centering technique that is compatible with the compression has also been developed. To assess this compression method, the figures of merit ‘reduction in memory requirements’, ‘quality of compression-based models’ and ‘gains in computation speed’ are studied at different compression levels. For testing purposes, spectroscopic imaging data acquired from bacterial samples and from remote sensing are used. The results demonstrate that the proposed compression-based KPCA algorithm (a) is feasible on personal computers and (b) yields good approximations of the models determined by the memory-demanding uncompressed KPCA. Copyright © 2008 John Wiley & Sons, Ltd.
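
To make the memory argument concrete: KPCA works on an n × n kernel matrix for n spectra, so 20,000 spectra stored as 64-bit floats already need 3.2 GB for that matrix alone. The sketch below is a rough illustration of the ideas named in the abstract, not the authors' algorithm. It assumes a Haar wavelet, compression of one spectral band at a time, an RBF kernel, and the standard feature-space centering formula; none of these choices is stated in the abstract, and all function names are hypothetical.

```python
import numpy as np


def kernel_matrix_bytes(n_samples, itemsize=8):
    """Bytes needed for a dense n x n kernel matrix (float64 by default)."""
    return n_samples * n_samples * itemsize


def haar_approx_2d(band, levels=1):
    """Keep only the 2D Haar approximation coefficients of one image band.

    Assumption: Haar wavelet. Each level replaces every 2x2 pixel block
    by (sum of block) / 2, the orthonormal Haar approximation coefficient.
    Both image sides must be divisible by 2**levels.
    """
    out = np.asarray(band, dtype=float)
    for _ in range(levels):
        r, c = out.shape
        out = out.reshape(r // 2, 2, c // 2, 2).sum(axis=(1, 3)) / 2.0
    return out


def compress_bandwise(bands, levels=1):
    """Compress an image cube one spectral band at a time.

    `bands` is any iterable yielding 2D arrays (rows x cols), so the
    uncompressed cube never has to be held in memory as a whole.
    Returns an array of shape (n_compressed_pixels, n_bands).
    """
    columns = [haar_approx_2d(b, levels).ravel() for b in bands]
    return np.stack(columns, axis=1)


def centered_kpca(X, n_components=2, gamma=1.0):
    """Kernel PCA with mean centering in feature space.

    Assumption: RBF kernel exp(-gamma * ||x - y||^2).
    Returns the scores of the training samples and the eigenvalues.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-gamma * d2)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    # Standard centering of the kernel matrix in feature space.
    Kc = K - one @ K - K @ one + one @ K @ one
    evals, evecs = np.linalg.eigh(Kc)
    order = np.argsort(evals)[::-1][:n_components]
    evals, evecs = evals[order], evecs[:, order]
    # Training scores: eigenvectors scaled by sqrt(eigenvalue).
    scores = evecs * np.sqrt(np.maximum(evals, 0.0))
    return scores, evals
```

Under these assumptions, each compression level reduces the number of pixels n by a factor of 4 and therefore the kernel matrix by a factor of 16. A typical call would pass a generator that reads one band from disk at a time to `compress_bandwise` and feed the result to `centered_kpca`.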