The Cluster Elastic Net for High-Dimensional Regression With Unknown Variable Grouping.

2014 
In the high-dimensional regression setting, the elastic net produces a parsimonious model by shrinking all coefficients toward the origin. However, in certain settings, this behavior might not be desirable: if some features are highly correlated with each other and associated with the response, then we might wish to perform less shrinkage on the coefficients corresponding to that subset of features. We propose the cluster elastic net, which selectively shrinks the coefficients for such variables toward each other, rather than toward the origin. Instead of assuming that the clusters are known a priori, the cluster elastic net infers clusters of features from the data, on the basis of correlation among the variables as well as association with the response. These clusters are then used to perform regression more accurately. We demonstrate the theoretical advantages of our proposed approach and explore its performance in a simulation study and in an application to HIV drug resistance data. Supplementary ma...
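To make the idea of shrinking coefficients "toward each other, rather than toward the origin" concrete, here is a minimal Python sketch of one objective of that kind. It is an illustration under stated assumptions, not the formulation from the paper: the abstract gives no formula, so the penalty form, the function names `cluster_penalty` and `objective`, and the weights `l1_weight` and `cluster_weight` are all hypothetical. The sketch also takes the cluster labels as given, whereas the abstract says the method infers them from the data.

```python
import numpy as np


def cluster_penalty(X, beta, labels):
    """Within-cluster spread of the per-feature contributions X[:, j] * beta[j].

    Assumed form (not taken from the abstract): for each cluster, sum the
    squared distance between every feature's contribution and the mean
    contribution of its cluster. It is zero when all features in a cluster
    contribute identically, so it pulls them toward each other, not to zero.
    """
    total = 0.0
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        contrib = X[:, idx] * beta[idx]  # shape (n_samples, cluster_size)
        centered = contrib - contrib.mean(axis=1, keepdims=True)
        total += float(np.sum(centered ** 2))
    return total


def objective(X, y, beta, labels, l1_weight, cluster_weight):
    """Squared-error loss + lasso penalty + hypothetical cluster penalty."""
    resid = y - X @ beta
    loss = 0.5 * float(resid @ resid)
    l1 = l1_weight * float(np.abs(beta).sum())
    return loss + l1 + 0.5 * cluster_weight * cluster_penalty(X, beta, labels)


# Toy usage: features 0 and 1 are identical and share a cluster.
X = np.array([[1.0, 1.0, 0.0],
              [2.0, 2.0, 1.0]])
labels = np.array([0, 0, 1])
y = X @ np.array([1.0, 1.0, 2.0])

agree = np.array([1.0, 1.0, 2.0])      # equal coefficients within cluster 0
disagree = np.array([1.0, -1.0, 2.0])  # opposite coefficients within cluster 0
print(cluster_penalty(X, agree, labels), cluster_penalty(X, disagree, labels))
```

In this toy setup the penalty vanishes when the two correlated features carry equal coefficients and grows when they disagree, which is the qualitative behavior the abstract describes. A full method would also have to search over cluster assignments, which this sketch does not attempt.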