Prior-preconditioned conjugate gradient method for accelerated Gibbs sampling in "large $n$ & large $p$" sparse Bayesian regression.

2018 
In a modern observational study based on healthcare databases, the number of observations typically ranges in the order of 10^5 ~ 10^6 and that of the predictors in the order of 10^4 ~ 10^5. Despite the large sample size, data rarely provide sufficient information to reliably estimate such a large number of parameters. Sparse regression provides a potential solution. Bayesian approaches based on shrinkage priors possess many desirable theoretical properties and, under linear and logistic models, yield posterior distributions amenable to Gibbs sampling. A major computational bottleneck arises in the "large n & large p" setting, however, from the need to sample from a high-dimensional Gaussian distribution at each iteration; despite the availability of a closed-form expression for the precision matrix $Φ$, computing and factorizing such a large matrix is computationally expensive nonetheless. In this article, we present a novel algorithm to speed up this bottleneck based on the following observation: we can cheaply generate a random vector $b$ such that the solution to the linear system $Φβ= b$ has the desired Gaussian distribution. We can then solve the linear system by the conjugate gradient (CG) algorithm through the matrix-vector multiplications by $Φ$, without ever explicitly inverting $Φ$. As practical performance of CG depends critically on appropriate preconditioning of the linear system, we develop a theory of prior-preconditioning to turn CG into a highly effective algorithm for sparse Bayesian regression. We apply our algorithm to a clinically relevant large-scale observational study with n = 72,489 and p = 22,175, designed to assess the relative risk of intracranial hemorrhage from two alternative blood anti-coagulants. Our algorithm demonstrates an order of magnitude speed-up in the posterior computation.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    74
    References
    11
    Citations
    NaN
    KQI
    []