Second-order Stein: SURE for SURE and other applications in high-dimensional inference

2021 
Stein’s formula states that a random variable of the form z⊤f(z) − div f(z) is mean-zero for all functions f with integrable gradient, where div f is the divergence of f and z is a standard normal vector. This paper proposes a second-order Stein formula that characterizes the variance of such random variables for all functions f with square-integrable gradient, and demonstrates the usefulness of this second-order formula in a range of applications.

In the Gaussian sequence model, a remarkable consequence of Stein’s formula is Stein’s Unbiased Risk Estimate (SURE), an unbiased estimate of the mean squared risk of almost any given estimator μ̂ of the unknown mean vector. A first application of the second-order Stein formula is an unbiased risk estimate for SURE itself (SURE for SURE): an unbiased estimate providing information about the squared distance between SURE and the squared estimation error of μ̂. SURE for SURE has a simple form as a function of the data and is applicable to any μ̂ with square-integrable gradient, for example the Lasso and the Elastic Net.

Beyond SURE for SURE, the following statistical applications are developed: (1) upper bounds on the risk of SURE when the estimation target is the mean squared error; (2) confidence regions based on SURE and the second-order Stein formula; (3) oracle inequalities satisfied by SURE-tuned estimates under a mild Lipschitz assumption; (4) an upper bound on the variance of the size of the model selected by the Lasso, and more generally on the variance of the empirical degrees of freedom of convex penalized estimators; (5) explicit expressions of SURE for SURE for the Lasso and the Elastic Net; (6) in the linear model, a general semiparametric scheme to de-bias a differentiable initial estimator for inference on a low-dimensional projection of the unknown regression coefficient vector, with a characterization of the variance after de-biasing; and (7) an accuracy analysis of a Gaussian Monte Carlo scheme for approximating the divergence of functions f: R^n → R^n.
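For concreteness, the two first-order facts the abstract builds on can be written out explicitly. The display below is a standard restatement (not taken from the paper) of Stein’s identity and of SURE in the Gaussian sequence model y = μ + z with z ~ N(0, I_n), i.e. noise variance one, matching the standard normal setting above.

```latex
% Stein's identity: z ~ N(0, I_n), f : R^n -> R^n weakly differentiable
% with integrable partial derivatives.
\[
  \mathbb{E}\bigl[z^\top f(z) - \operatorname{div} f(z)\bigr] = 0,
  \qquad
  \operatorname{div} f(z) = \sum_{i=1}^{n} \frac{\partial f_i}{\partial z_i}(z).
\]
% SURE in the Gaussian sequence model y = \mu + z (noise variance 1):
% an unbiased estimate of the squared estimation error of \hat\mu.
\[
  \mathrm{SURE}(\hat\mu)
  = \|y - \hat\mu(y)\|^2 - n + 2\,\operatorname{div}\hat\mu(y),
  \qquad
  \mathbb{E}\,\mathrm{SURE}(\hat\mu) = \mathbb{E}\,\|\hat\mu(y) - \mu\|^2 .
\]
```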
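The unbiasedness E[SURE] = E‖μ̂ − μ‖² can also be checked numerically. The sketch below is a minimal illustration, not code from the paper: it uses soft-thresholding (the Lasso in the sequence model) with an illustrative threshold lam, dimension n, and sparse mean mu, together with the fact that the divergence of soft-thresholding equals the number of coordinates above the threshold, i.e. the selected model size mentioned in application (4).

```python
# Minimal Monte Carlo sanity check (not from the paper): SURE is unbiased for the
# squared error of soft-thresholding, i.e. the Lasso in the sequence model
# y = mu + z with z ~ N(0, I_n).  lam, n, and mu below are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, lam, n_rep = 500, 1.5, 2000
mu = np.concatenate([np.full(25, 3.0), np.zeros(n - 25)])  # sparse mean vector

sure_vals, err_vals = [], []
for _ in range(n_rep):
    y = mu + rng.standard_normal(n)
    mu_hat = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)   # soft-thresholding
    div = np.count_nonzero(np.abs(y) > lam)                  # divergence = model size
    sure_vals.append(np.sum((y - mu_hat) ** 2) - n + 2 * div)
    err_vals.append(np.sum((mu_hat - mu) ** 2))

print("mean SURE :", np.mean(sure_vals))   # the two averages should be close,
print("mean error:", np.mean(err_vals))    # reflecting E[SURE] = E||mu_hat - mu||^2
```

The per-repetition gap between SURE and the squared error is precisely the fluctuation whose squared size the paper’s SURE for SURE estimates from a single data vector.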