Sparse regularized low-rank tensor regression with applications in genomic data analysis

2020 
Abstract Many applications in biomedical informatics deal with data in tensor form. Traditional regression methods that take vectors as covariates may encounter difficulties in handling tensors because of their ultrahigh dimensionality and complex structure. In this paper, we introduce a novel sparse regularized Tucker tensor regression model that exploits the structure of tensor covariates and performs feature selection on tensor data. Using a Tucker decomposition of the regression coefficient tensor, we reduce the ultrahigh dimensionality to a manageable level. To make the model identifiable, we impose an orthonormality constraint on the factor matrices. Unlike previous tensor regression models, which impose a sparsity penalty on the factor matrices of the coefficient tensor, our model imposes the sparsity penalty directly on the coefficient tensor to select the relevant features in the tensor data. An efficient optimization algorithm based on the alternating direction method of multipliers (ADMM) is designed to solve the proposed model. The performance of the model is evaluated on both synthetic and real genomic data. Experimental results on synthetic data demonstrate that our model identifies the true signals more accurately than other state-of-the-art regression models. The analysis of melanoma genomic data demonstrates that our model achieves better prediction performance and identifies markers with important implications. Our model and the associated studies can provide useful insights into disease and pathogenesis mechanisms, and will benefit further studies in variable selection.
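The ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the three-way shapes, the squared-error loss, the function names, and the penalty weight `lam` are all assumptions made for illustration. The sketch builds a coefficient tensor from a Tucker core and orthonormal factor matrices, evaluates an L1-penalized least-squares objective on the full coefficient tensor, and shows the soft-thresholding operator that ADMM schemes commonly use for an L1 term.

```python
import numpy as np


def orthonormal_factor(rng, dim, rank):
    # Reduced QR of a random Gaussian matrix yields a (dim x rank)
    # matrix with orthonormal columns (assumes dim >= rank).
    q, _ = np.linalg.qr(rng.standard_normal((dim, rank)))
    return q


def tucker_to_tensor(core, factors):
    # Three-way Tucker reconstruction: B = core x_1 U1 x_2 U2 x_3 U3.
    u1, u2, u3 = factors
    return np.einsum("abc,ia,jb,kc->ijk", core, u1, u2, u3)


def soft_threshold(t, tau):
    # Proximal operator of tau * ||.||_1, applied elementwise.
    return np.sign(t) * np.maximum(np.abs(t) - tau, 0.0)


def objective(X, y, B, lam):
    # Assumed loss: 0.5 * sum_i (y_i - <X_i, B>)^2 + lam * ||B||_1,
    # with the L1 penalty on the full coefficient tensor B.
    preds = np.tensordot(X, B, axes=3)
    return 0.5 * np.sum((y - preds) ** 2) + lam * np.abs(B).sum()


rng = np.random.default_rng(0)
dims, ranks = (6, 5, 4), (2, 2, 2)  # hypothetical tensor and core sizes

factors = [orthonormal_factor(rng, d, r) for d, r in zip(dims, ranks)]
core = rng.standard_normal(ranks)
B = tucker_to_tensor(core, factors)

# Noise-free synthetic responses from 20 random tensor covariates.
X = rng.standard_normal((20, *dims))
y = np.tensordot(X, B, axes=3)

print("parameters in full tensor:", B.size)
print("parameters in Tucker form:", core.size + sum(u.size for u in factors))
print("objective at true B, lam=0.1:", objective(X, y, B, 0.1))
```

With these hypothetical sizes the full coefficient tensor has 120 entries, while the Tucker form stores 8 core entries plus 30 factor entries, which is the dimensionality reduction the abstract refers to.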