Probability Guided Maxout

2021 
In this paper, we propose an original CNN training strategy that brings together ideas from both dropout-like regularization methods and solutions that learn discriminative features. We propose a dropping criterion that, unlike dropout and its variants, is deterministic rather than random. It is grounded in the empirical evidence that feature descriptors with a larger L2-norm and highly active nodes are strongly correlated with confident class predictions. Our criterion therefore drops a percentage of the most active nodes of each descriptor, in proportion to the estimated class probability. We simultaneously train a per-sample scaling factor to balance the expected output across training and inference. This further allows us to keep the descriptor's L2-norm high, which we show enforces confident predictions. The combination of these two strategies results in our “Probability Guided Maxout” solution, which acts as a training regularizer. We demonstrate the above behaviors by reporting extensive image classification results on the CIFAR10, CIFAR100, and Caltech256 datasets. Code is available at https://github.com/clferrari/probability-guided-maxout.
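
As a rough illustration of the dropping criterion described in the abstract, the sketch below (PyTorch-style) zeroes the most active dimensions of each feature descriptor, with the dropped fraction growing with the predicted class probability. The function name, the max_drop_ratio parameter, and the fixed inverted-dropout-style rescaling (standing in for the paper's trained per-sample scaling factor) are illustrative assumptions, not the authors' implementation; see the linked repository for the official code.

    import torch

    def probability_guided_drop(features, class_prob, max_drop_ratio=0.5):
        # features: (batch, dim) descriptors; class_prob: (batch,) estimated
        # probability of the predicted class for each sample.
        batch_size, dim = features.shape

        # Deterministic criterion: the number of units dropped per sample
        # grows with the predicted class probability (assumed linear here).
        num_drop = (max_drop_ratio * class_prob * dim).long()

        # Rank activations per sample, most active first.
        order = features.abs().argsort(dim=1, descending=True)

        mask = torch.ones_like(features)
        for i in range(batch_size):
            mask[i, order[i, :num_drop[i]]] = 0.0

        # Simple per-sample rescaling to keep the expected output magnitude
        # comparable between training (with dropping) and inference (without);
        # the paper instead trains this scaling factor.
        kept = mask.sum(dim=1, keepdim=True).clamp(min=1.0)
        scale = dim / kept
        return features * mask * scale
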