Understanding the loss landscape of one-hidden-layer ReLU networks

2021 
Abstract: In this paper, it is proved that for one-hidden-layer ReLU networks, all differentiable local minima are global within each differentiable region. Necessary and sufficient conditions are given for the existence of differentiable local minima, saddle points, and non-differentiable local minima, together with their locations when they exist. Building on this theory, a linear-programming-based algorithm is designed to determine whether differentiable local minima exist, and it is used to predict whether spurious local minima arise on the MNIST and CIFAR-10 datasets. Experimental results show that no spurious local minima exist for most typical weight vectors. These theoretical predictions are verified by demonstrating their consistency with the results of gradient descent search.
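The key structural fact behind the abstract is that, once the ReLU activation pattern is fixed, the network output is linear in a reparametrization of the weights, so the squared loss restricted to that differentiable region is a convex quadratic. The sketch below is a hypothetical illustration, not the paper's algorithm: for a one-hidden-layer network f(x) = Σ_k v_k relu(w_k·x), it absorbs |v_k| into w_k (so u_k = v_k w_k, with the second-layer signs fixed), solves the unconstrained least-squares problem for the region's loss, and then checks whether the minimizer's pre-activation signs are consistent with the assumed pattern. If they are, the region contains a differentiable local minimum; the names `region_has_differentiable_min`, `pattern`, and `signs` are my own.

```python
import numpy as np

def region_has_differentiable_min(X, y, pattern, signs, tol=1e-9):
    """Test whether the differentiable region fixed by a ReLU activation
    `pattern` (n x K, 0/1) contains a stationary point of the squared loss.

    Hypothetical simplification: second-layer signs are fixed to `signs`
    (+/-1 per hidden unit) and |v_k| is absorbed into w_k, so on this
    region f(x_i) = sum_k pattern[i,k] * signs[k] * (x_i . u_k) is linear
    in u, making the restricted loss a convex quadratic.
    """
    n, d = X.shape
    K = pattern.shape[1]
    # Region-restricted feature map: column block k carries unit k's
    # contribution pattern[i,k] * signs[k] * x_i.
    Phi = np.hstack([(pattern[:, k:k + 1] * signs[k]) * X for k in range(K)])
    # Unconstrained minimizer of the region's convex quadratic loss.
    u, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    U = u.reshape(K, d)
    # signs[k] * (x_i . u_k) has the same sign as w_k . x_i, so it must be
    # strictly positive where the pattern says "active" and strictly
    # negative where it says "inactive" for the minimizer to lie inside
    # the open region.
    pre = (X @ U.T) * signs          # shape (n, K)
    on = pattern.astype(bool)
    return bool(np.all(pre[on] > tol) and np.all(pre[~on] < -tol))
```

The paper's LP-based test can be seen as the general form of this check: the stationarity conditions are affine equalities and the activation-pattern constraints are strict linear inequalities, so deciding whether they are jointly satisfiable is a linear-programming feasibility problem.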