Analyzing the Variance of Policy Gradient Estimators for the Linear-Quadratic Regulator

James A. Preiss, Sébastien M. R. Arnold, Chen Yu-Wei, Marius Kloft

2019

We study the variance of the REINFORCE policy gradient estimator in environments with continuous state and action spaces, linear dynamics, quadratic cost, and Gaussian noise. These simple environments allow us to derive bounds on the estimator variance in terms of the environment and noise parameters. We compare the predictions of our bounds to the empirical variance in simulation experiments.
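To make the setting concrete, here is a minimal sketch of measuring the empirical variance of a REINFORCE gradient estimate on a scalar linear-quadratic problem. It is an illustration under stated assumptions, not the authors' experimental setup: the dynamics x' = a*x + b*u, the cost q*x^2 + r*u^2, the linear-Gaussian policy u = k*x + sigma*eps, the function name `reinforce_gradient_samples`, and every default parameter value are hypothetical choices made for this example.

```python
import random
import statistics


def reinforce_gradient_samples(k, a=0.9, b=0.5, q=1.0, r=0.1,
                               sigma=0.5, horizon=10, n_rollouts=2000, seed=0):
    """Return one REINFORCE gradient sample (w.r.t. the gain k) per rollout.

    Assumed setup (not taken from the paper):
      dynamics  x_{t+1} = a * x_t + b * u_t
      cost      c_t     = q * x_t**2 + r * u_t**2
      policy    u_t     = k * x_t + sigma * eps_t,  eps_t ~ N(0, 1)
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_rollouts):
        x = rng.gauss(0.0, 1.0)   # random initial state
        total_cost = 0.0
        score = 0.0               # sum over t of d/dk log pi(u_t | x_t)
        for _t in range(horizon):
            eps = rng.gauss(0.0, 1.0)
            u = k * x + sigma * eps
            total_cost += q * x * x + r * u * u
            # d/dk log N(u; k*x, sigma^2) = (u - k*x) * x / sigma^2 = eps * x / sigma
            score += eps * x / sigma
            x = a * x + b * u
        # REINFORCE: total trajectory cost times the summed score function
        samples.append(total_cost * score)
    return samples


samples = reinforce_gradient_samples(k=-0.5)
grad_mean = statistics.fmean(samples)     # Monte Carlo gradient estimate
grad_var = statistics.variance(samples)   # empirical single-rollout variance
print(f"gradient estimate: {grad_mean:.3f}, per-sample variance: {grad_var:.3f}")
```

Re-running the function with different values of `sigma`, `horizon`, or the system parameters shows how the empirical variance responds to the environment and noise settings, which is the kind of dependence the bounds in the paper aim to describe.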

Keywords:

Gaussian noise
Linear-quadratic regulator
Estimator
Mathematical optimization
Mathematics
Quadratic equation
Quadratic cost
