Improved Step-Size Schedules for Noisy Gradient Methods
2021
Noise is inherent in many optimization methods, such as stochastic gradient methods, zeroth-order methods, and compressed gradient methods. For such methods to converge toward a global optimum, it is intuitive to use large step-sizes in the initial iterations, when the noise is typically small compared to the algorithm's steps, and to reduce the step-sizes as the algorithm progresses. This intuition has been confirmed in theory and practice for stochastic gradient methods, but similar results are lacking for other methods using approximate gradients. This paper shows that diminishing step-size strategies can indeed be applied to a broad class of noisy gradient methods. Unlike previous works, our analysis framework shows that such step-size schedules enable these methods to enjoy an optimal $\mathcal{O}(1/k)$ rate. We exemplify our results on zeroth-order methods and stochastic compression methods. Our experiments validate the fast convergence of these methods with step decay schedules.
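To make the idea of a step decay schedule concrete, here is a minimal sketch of gradient descent with additive gradient noise and a step-size that is halved at fixed intervals. This is only an illustration of the general technique the abstract describes, not the paper's algorithm; the function names, the decay factor, and the noise model are all hypothetical choices for the example.

```python
import numpy as np

def step_decay(eta0, k, decay=0.5, period=100):
    """Step-decay schedule (hypothetical constants): shrink the
    step size by `decay` every `period` iterations."""
    return eta0 * decay ** (k // period)

def noisy_gradient_descent(grad, x0, iters=1000, eta0=1.0, sigma=0.1, seed=0):
    """Gradient descent where each gradient is perturbed by Gaussian
    noise, standing in for stochastic, zeroth-order, or compressed
    gradient estimates."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for k in range(iters):
        g = grad(x) + sigma * rng.standard_normal(x.shape)  # noisy gradient
        x = x - step_decay(eta0, k) * g
    return x

# Example: minimize f(x) = ||x||^2 / 2, whose gradient is x.
x_final = noisy_gradient_descent(lambda x: x, x0=np.ones(5))
print(x_final)  # close to the optimum at the origin
```

Early on, the large step-size makes fast progress because the deterministic gradient dominates the noise; as the schedule decays, the effective noise injected per step shrinks as well, which is what allows schedules of this kind to reach rates like $\mathcal{O}(1/k)$ under suitable assumptions.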