    Sparse Principal Component Analysis With Preserved Sparsity Pattern
    Citations: 54 · References: 56 · Related Papers: 10
    Abstract:
    Principal component analysis (PCA) is widely used for feature extraction and dimension reduction in pattern recognition and data analysis. Despite its popularity, the reduced dimension obtained from the PCA is difficult to interpret due to the dense structure of principal loading vectors. To address this issue, several methods have been proposed for sparse PCA, all of which estimate loading vectors with few non-zero elements. However, when more than one principal component is estimated, the associated loading vectors do not possess the same sparsity pattern. Therefore, it becomes difficult to determine a small subset of variables from the original feature space that have the highest contribution to the principal components. To overcome this limitation, an adaptive block sparse PCA method is proposed. The proposed method is guaranteed to obtain the same sparsity pattern across all principal components. Experiments show that applying the proposed sparse PCA method can help improve the performance of feature selection for image processing applications. We further demonstrate that our proposed sparse PCA method can be used to improve the performance of blind source separation for functional magnetic resonance imaging data.
    Keywords:
    Sparse PCA
    Feature vector
    Feature (machine learning)
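The shared-support idea from the abstract — every principal component being nonzero on the same small set of variables — can be illustrated with a minimal numpy sketch. This is not the paper's adaptive block method; it is a simple post-hoc heuristic that keeps only the variables with the largest joint (row-wise) loading energy across all components:

```python
import numpy as np

def row_sparse_pca(X, n_components=2, n_keep=5):
    """Illustrative sketch (not the paper's adaptive block algorithm):
    enforce one shared sparsity pattern by keeping only the variables
    whose loadings have the largest row-wise L2 energy across all PCs."""
    Xc = X - X.mean(axis=0)                      # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                      # (n_features, n_components)
    row_energy = np.linalg.norm(V, axis=1)       # joint importance per variable
    keep = np.argsort(row_energy)[-n_keep:]      # one support, shared by all PCs
    V_sparse = np.zeros_like(V)
    V_sparse[keep] = V[keep]
    V_sparse /= np.linalg.norm(V_sparse, axis=0, keepdims=True)
    return V_sparse

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
V = row_sparse_pca(X, n_components=2, n_keep=5)
# both loading vectors are nonzero on the same 5 variables
print(np.nonzero(V[:, 0])[0], np.nonzero(V[:, 1])[0])
```

Because the zeroing is done per variable (per row of the loading matrix) rather than per entry, the resulting components automatically share one sparsity pattern, which is what makes the selected variables interpretable as a single feature subset.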
    Principal component analysis (PCA) is a widely used technique for dimension reduction, data processing, and feature extraction. The three tasks are particularly useful and important in high-dimensional data analysis and statistical learning. However, the regular PCA encounters great fundamental challenges under high dimensionality and may produce "wrong" results. As a remedy, sparse PCA (SPCA) has been proposed and studied. SPCA is shown to offer a "right" solution under high dimensions. In this paper, we review methodological and theoretical developments of SPCA, as well as its applications in scientific studies.
    Principal component
    Citations (149)
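Among the many SPCA methods such a review surveys, one of the simplest is truncated power iteration: multiply the loading vector by the covariance matrix, keep only the k largest-magnitude entries, and renormalize. A hedged numpy sketch (the parameters and data here are illustrative, not from the review):

```python
import numpy as np

def sparse_leading_pc(X, k, n_iter=100):
    """Truncated power iteration, a simple SPCA heuristic: repeated
    covariance multiplication with hard truncation to k nonzeros."""
    S = X.T @ X                                  # (unnormalized) covariance
    v = np.ones(S.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        v = S @ v
        v[np.argsort(np.abs(v))[:-k]] = 0.0      # keep the k largest entries
        v /= np.linalg.norm(v)
    return v

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
X[:, :3] += 3.0 * rng.normal(size=(200, 1))      # shared factor on features 0-2
v = sparse_leading_pc(X, k=3)
print(np.nonzero(v)[0])                          # support of the sparse loading
```

The hard truncation keeps the loading interpretable (exactly k variables), which is the "right" high-dimensional behavior the review contrasts with dense regular PCA loadings.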
    Sparse matrix-vector multiplication (SpMV) is a widely used kernel in scientific applications as well as data analytics. Many GPU implementations of SpMV have been proposed, each targeting different sparse matrix representations. However, no sparse matrix representation is consistently superior, and the best representation varies for sparse matrices with different sparsity patterns. In this paper we study four popular sparse representations implemented in the NVIDIA cuSPARSE library: CSR, ELL, COO and a hybrid ELL-COO scheme. We analyze statistical features of a dataset of 27 matrices, covering a wide spectrum of sparsity features, and attempt to correlate the SpMV performance of each representation with simple aggregate metrics of the matrices. We present some insights on the correlation between matrix features and the best choice of sparse matrix representation.
    Kernel (computing)
    Representation
    Matrix (mathematics)
    Matrix representation
    Citations (10)
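The CSR format mentioned above can be made concrete with a small numpy sketch: CSR stores only the nonzero values (`data`), their column indices (`indices`), and per-row offsets into those arrays (`indptr`), and SpMV walks each row's slice:

```python
import numpy as np

# A dense matrix and its CSR (compressed sparse row) representation,
# one of the cuSPARSE formats compared in the paper above.
A = np.array([[0., 2., 0.],
              [1., 0., 3.],
              [0., 0., 4.]])
data, indices, indptr = [], [], [0]
for row in A:
    nz = np.nonzero(row)[0]
    indices.extend(nz)           # column index of each nonzero
    data.extend(row[nz])         # value of each nonzero
    indptr.append(len(indices))  # row i occupies [indptr[i], indptr[i+1])

def csr_matvec(data, indices, indptr, x):
    """SpMV over the CSR arrays: y[i] = sum of row i's nonzeros times x."""
    y = np.zeros(len(indptr) - 1)
    for i in range(len(y)):
        for j in range(indptr[i], indptr[i + 1]):
            y[i] += data[j] * x[indices[j]]
    return y

x = np.array([1., 1., 1.])
print(csr_matvec(data, indices, indptr, x))  # matches A @ x → [2. 4. 4.]
```

ELL and COO trade this layout differently: ELL pads every row to the same length (good for uniform rows on GPUs), while COO stores an explicit (row, column, value) triple per nonzero.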
    The CUR decomposition provides an approximation of a matrix $X$ that has low reconstruction error and that is sparse in the sense that the resulting approximation lies in the span of only a few columns of $X$. In this regard, it appears to be similar to many sparse PCA methods. However, CUR takes a randomized algorithmic approach, whereas most sparse PCA methods are framed as convex optimization problems. In this paper, we try to understand CUR from a sparse optimization viewpoint. We show that CUR is implicitly optimizing a sparse regression objective and, furthermore, cannot be directly cast as a sparse PCA method. We also observe that the sparsity attained by CUR possesses an interesting structure, which leads us to formulate a sparse PCA method that achieves a CUR-like sparsity.
    Sparse PCA
    K-SVD
    Matrix (mathematics)
    Citations (1)
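The core CUR idea described above — an approximation lying in the span of a few actual columns of X — can be sketched with randomized column sampling. This is a generic norm-based sampling sketch, not the specific CUR algorithm or the paper's CUR-like sparse PCA formulation:

```python
import numpy as np

def cur_approx(X, c, rng):
    """CUR-style sketch: sample c columns with probability proportional to
    their squared norms, then project X onto the span of those columns."""
    p = (X ** 2).sum(axis=0)
    p /= p.sum()
    cols = rng.choice(X.shape[1], size=c, replace=False, p=p)
    C = X[:, cols]                           # actual columns of X
    return C @ np.linalg.pinv(C) @ X, cols   # best approximation in span(C)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 10))  # rank-2 matrix
X_hat, cols = cur_approx(X, c=3, rng=rng)
print(np.linalg.norm(X - X_hat))  # near zero: 3 columns span the rank-2 space
```

The "interesting structure" the paper notes is visible here: sparsity lives in *which columns* are used, not in individual loading entries, which is exactly what distinguishes CUR from entrywise sparse PCA penalties.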
    Sparse matrix-vector multiplication (SpMV) is a core kernel in numerous applications, ranging from physics simulation and large-scale solvers to data analytics. Many GPU implementations of SpMV have been proposed, targeting several sparse representations and aiming at maximizing overall performance. No single sparse matrix representation is uniformly superior, and the best performing representation varies for sparse matrices with different sparsity patterns.
    Kernel (computing)
    Representation
    Matrix representation
    Matrix (mathematics)
    Implementation
    Citations (118)
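Why no single representation wins can be seen from storage alone: ELL pads every row to the maximum row length, so it wastes space exactly when row lengths vary. A toy heuristic (my own illustration, not a rule from the paper) that picks a format from the padding overhead:

```python
import numpy as np

def choose_format(A):
    """Toy heuristic: prefer ELL only when its padded storage
    (rows x max nonzeros per row) stays close to the true nonzero count;
    otherwise fall back to CSR. The 1.5x threshold is arbitrary."""
    nnz_per_row = (A != 0).sum(axis=1)
    nnz = nnz_per_row.sum()
    ell_storage = A.shape[0] * nnz_per_row.max()
    return "ELL" if ell_storage <= 1.5 * nnz else "CSR"

uniform = np.eye(4)                        # exactly 1 nonzero in every row
skewed = np.zeros((4, 4)); skewed[0] = 1   # all nonzeros in a single row
print(choose_format(uniform), choose_format(skewed))  # ELL CSR
```

Real selection is harder — the paper's point is precisely that simple aggregate metrics like these correlate only partially with measured SpMV performance.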
    Sparse computations, such as sparse matrix-dense vector multiplication, are notoriously hard to optimize due to their irregularity and memory-boundedness. Solutions to improve the performance of sparse computations have been proposed, ranging from hardware-based ones, such as gather/scatter instructions, to software ones, such as generalized and dedicated sparse formats used together with specialized executor programs for different hardware targets. These sparse computations are often performed on read-only sparse structures: while the data themselves are variable, the sparsity structure itself does not change. Indeed, sparse formats such as CSR typically have a high cost for inserting or removing nonzero elements in the representation. The typical use case is therefore not to modify the sparsity during possibly repeated computations on the same sparse structure.
    Code (computing)
    Executor
    Citations (3)
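The high insertion cost the paragraph above mentions is easy to see in the CSR arrays themselves: adding one nonzero means shifting every later entry of `data` and `indices` and bumping every subsequent row pointer. A minimal numpy sketch:

```python
import numpy as np

def csr_insert(data, indices, indptr, i, j, v):
    """Insert nonzero v at (i, j) into CSR arrays. np.insert copies the
    whole data/indices arrays and all later row pointers must be bumped:
    O(nnz) per insertion, which is why CSR suits read-only structures.
    (Column order within the row is not maintained in this sketch.)"""
    pos = indptr[i + 1]                      # append at the end of row i
    data = np.insert(data, pos, v)
    indices = np.insert(indices, pos, j)
    indptr = indptr.copy()
    indptr[i + 1:] += 1
    return data, indices, indptr

# CSR form of [[1, 0], [0, 2]]
data = np.array([1.0, 2.0])
indices = np.array([0, 1])
indptr = np.array([0, 1, 2])
data, indices, indptr = csr_insert(data, indices, indptr, 0, 1, 5.0)
print(data, indices, indptr)  # [1. 5. 2.] [0 1 1] [0 2 3]
```

Formats built for mutation (e.g. list-of-lists layouts) avoid this full-array copy at the cost of slower traversal, which is the trade-off behind the read-only assumption.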
    This paper considers recovery of two-dimensional (2D) sparse signals from incomplete measurements. The 2D sparse signals can be reconstructed by solving a sparse representation problem for Multiple Measurement Vectors (MMV). However, extending sparse recovery algorithms to the MMV case may be inefficient if the vectors do not share the same sparsity profile. In this paper, a sequential sparse recovery (SSR) algorithm is proposed to reconstruct the 2D sparse matrix. The sparsity of the matrix is much reduced after down-sampled observation, and the sparse matrix can be reconstructed after sequential observations and reconstructions. Simulation results verify the effectiveness of the proposed method for 2D sparse signal reconstruction.
    Matrix (mathematics)
    Signal reconstruction
    Signal Recovery
    Representation
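The single-vector building block underlying MMV recovery is a greedy routine such as orthogonal matching pursuit (OMP); MMV variants extend it by forcing one shared support across all measurement vectors. A hedged sketch of plain OMP (illustrative only, not the paper's SSR algorithm):

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: greedily add the column most
    correlated with the residual, then least-squares refit on the
    selected support, k times."""
    residual, support = y.copy(), []
    coef = np.zeros(0)
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 60))          # 40 measurements of a length-60 signal
x_true = np.zeros(60)
x_true[[5, 17, 40]] = [1.0, -2.0, 1.5]  # 3-sparse ground truth
x_hat = omp(A, A @ x_true, k=3)
print(sorted(np.nonzero(x_hat)[0]))     # recovered support
```

When the measurement vectors share a sparsity profile, running the support selection jointly over all vectors is what makes the MMV extension efficient; when they do not, per-vector or sequential schemes like the paper's SSR become preferable.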