Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, Chris J. Maddison

2021

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. Our approach uses gradients of the likelihood function with respect to its discrete inputs to propose updates in a Metropolis-Hastings sampler. We show empirically that this approach outperforms generic samplers in several difficult settings, including Ising models, Potts models, restricted Boltzmann machines, and factorial hidden Markov models. We also demonstrate the use of our improved sampler for training deep energy-based models on high-dimensional discrete data. This approach outperforms variational auto-encoders and existing energy-based models. Finally, we give bounds showing that our approach is near-optimal in the class of samplers that propose local updates.
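The idea of using gradients to propose updates in a Metropolis-Hastings sampler can be illustrated with a small sketch. The abstract does not spell out the form of the proposal, so everything below is an assumption made for illustration: a binary model with unnormalized log-probability f(x) = x^T W x + b^T x (W symmetric with zero diagonal), a proposal that flips a single bit chosen by a softmax over half the first-order estimate of the change in f, and the hypothetical function names `log_prob`, `grad_log_prob`, `flip_log_probs`, and `mh_step`.

```python
import math
import random


def log_prob(x, W, b):
    """Unnormalized log-probability f(x) = x^T W x + b^T x for binary x."""
    n = len(x)
    quad = sum(x[i] * W[i][j] * x[j] for i in range(n) for j in range(n))
    return quad + sum(b[i] * x[i] for i in range(n))


def grad_log_prob(x, W, b):
    """Gradient of f with x treated as continuous (assumes W is symmetric)."""
    n = len(x)
    return [2.0 * sum(W[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]


def flip_log_probs(x, W, b):
    """Log-probabilities of proposing a flip at each position.

    Assumed proposal: softmax over d_i / 2, where d_i = (1 - 2 x_i) * grad_i
    is the first-order estimate of f(x with bit i flipped) - f(x).
    """
    g = grad_log_prob(x, W, b)
    d = [(1 - 2 * x[i]) * g[i] / 2.0 for i in range(len(x))]
    m = max(d)
    log_z = m + math.log(sum(math.exp(v - m) for v in d))
    return [v - log_z for v in d]


def mh_step(x, W, b, rng):
    """One Metropolis-Hastings step with the gradient-informed flip proposal."""
    log_q_fwd = flip_log_probs(x, W, b)
    weights = [math.exp(v) for v in log_q_fwd]
    i = rng.choices(range(len(x)), weights=weights)[0]

    x_new = list(x)
    x_new[i] = 1 - x_new[i]

    # The reverse move flips the same bit back, so it needs q(i | x_new).
    log_q_rev = flip_log_probs(x_new, W, b)
    log_accept = (
        log_prob(x_new, W, b) - log_prob(x, W, b) + log_q_rev[i] - log_q_fwd[i]
    )
    if rng.random() < math.exp(min(0.0, log_accept)):
        return x_new
    return x


if __name__ == "__main__":
    # Toy three-variable model; the weights are arbitrary illustrative values.
    W = [[0.0, 0.5, 0.0], [0.5, 0.0, -0.3], [0.0, -0.3, 0.0]]
    b = [0.1, -0.2, 0.3]
    rng = random.Random(0)
    x = [0, 0, 0]
    for _ in range(1000):
        x = mh_step(x, W, b, rng)
    print(x)
```

For this quadratic toy model the first-order estimate of each flip happens to equal the exact change in f, so the proposal is unusually well informed; for a deep energy-based model the gradient would only approximate those changes, and the accept/reject step is what keeps the chain targeting the intended distribution.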

Keywords:

Scalability
Sampling (statistics)
Likelihood function
Probabilistic logic
Boltzmann machine
Energy (signal processing)
Computer science
Ising model
Class (biology)
Algorithm
