Surrogate dropout: Learning optimal drop rate through proxy

2020 
Abstract

Dropout is commonly used in deep neural networks to alleviate overfitting. Conventionally, the neurons in a layer indiscriminately share a fixed drop probability, which makes it difficult to determine an appropriate value for different tasks. Moreover, this static strategy seriously degrades performance when conventional dropout is applied extensively to both shallow and deep layers. A natural question is whether selectively dropping neurons would yield a better regularization effect. This paper proposes a simple and effective surrogate dropout method in which neurons are dropped according to their importance. The method has two main stages. The first stage trains a surrogate module, jointly optimized with the neural network, to evaluate the importance of each neuron. In the second stage, the output of the surrogate module serves as a guidance signal for dropping certain neurons, approximating the optimal per-neuron drop rate as the network converges. Various convolutional neural network architectures and multiple datasets, including CIFAR-10, CIFAR-100, SVHN, Tiny ImageNet, and two medical image datasets, are used to evaluate surrogate dropout. The experimental results demonstrate that the proposed method achieves a better regularization effect than the baseline methods.
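To make the two-stage idea concrete, the following is a minimal PyTorch sketch of one plausible realization: a small "surrogate" head that maps per-channel activation statistics to learned keep probabilities, trained jointly with the host network via a straight-through estimator. The class name, the linear scoring head, the pooling statistic, and the straight-through trick are all illustrative assumptions; the paper's actual module design is not specified in the abstract.

```python
# Hypothetical sketch of surrogate dropout, NOT the authors' implementation.
import torch
import torch.nn as nn


class SurrogateDropout(nn.Module):
    """Drops feature channels according to learned importance scores.

    A small surrogate head maps global-average-pooled activations to a
    per-channel keep probability; sampling uses a straight-through
    estimator so the surrogate is optimized jointly with the network.
    """

    def __init__(self, num_channels: int, min_keep: float = 0.5):
        super().__init__()
        self.score = nn.Linear(num_channels, num_channels)  # surrogate head (assumption)
        self.min_keep = min_keep  # floor so no channel is dropped with certainty

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training:
            return x  # inference: identity, as with standard dropout
        # Per-channel summary statistic via global average pooling: (N, C).
        stats = x.mean(dim=(2, 3))
        keep_p = torch.sigmoid(self.score(stats))           # learned keep probs in (0, 1)
        keep_p = self.min_keep + (1.0 - self.min_keep) * keep_p
        hard = torch.bernoulli(keep_p)                      # sampled 0/1 channel mask
        # Straight-through: forward pass uses the hard mask, backward pass
        # sees keep_p, so gradients reach the surrogate head.
        mask = hard + keep_p - keep_p.detach()
        # Inverted-dropout rescaling keeps the expected activation magnitude stable.
        return x * (mask / keep_p).unsqueeze(-1).unsqueeze(-1)
```

In use, such a layer would be placed after a convolutional block (e.g., `SurrogateDropout(64)` following a 64-channel feature map), replacing a fixed-rate `nn.Dropout2d`; because the mask probabilities are learned per neuron, shallow and deep layers can settle on different effective drop rates rather than sharing one hand-tuned value.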