Hexpo: A vanishing-proof activation function

2017 
This article proposes “Hexpo”, an activation function that can scale the gradient and hence mitigate the vanishing gradient problem. Unlike the rectified linear unit family, which applies an identity mapping to positive inputs, Hexpo has scalable limits on both positive and negative values. Through parametrization, both the active domain of Hexpo and the output range it maps to are flexible. It can therefore alleviate the vanishing gradient problem from both the gradient-flow and the local-gradient perspective, while preserving upper and lower bounds on the output. Parametrization also allows Hexpo to produce outputs close to zero. In experiments on the MNIST handwritten digit recognition dataset, Hexpo outperforms the rectified linear unit family (the rectified linear unit and the exponential linear unit) in both accuracy and learning speed. In experiments on the CIFAR-10 tiny image recognition dataset with convolutional layers, Hexpo outperforms the rectified linear unit and performs similarly to the exponential linear unit.
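
As a rough sketch of the behavior the abstract describes, the code below implements a Hexpo-style activation in NumPy: a piecewise exponential that saturates toward a parameter-controlled bound on each side of zero, so the parameters also scale the local gradient. The specific parameterization (names a, b, c, d and the form -a(e^{-x/b} - 1) for x >= 0, c(e^{x/d} - 1) for x < 0) is an assumption made for illustration, not a quotation from the paper.

    import numpy as np

    def hexpo(x, a=1.0, b=1.0, c=1.0, d=1.0):
        # Assumed Hexpo-style form: bounded exponentials on both sides.
        x = np.asarray(x, dtype=float)
        pos = -a * (np.exp(-x / b) - 1.0)   # saturates toward +a as x -> +inf
        neg = c * (np.exp(x / d) - 1.0)     # saturates toward -c as x -> -inf
        return np.where(x >= 0, pos, neg)

    def hexpo_grad(x, a=1.0, b=1.0, c=1.0, d=1.0):
        # Derivative of the sketch above; a/b and c/d scale the local gradient,
        # which is how the parameters can counteract vanishing gradients.
        x = np.asarray(x, dtype=float)
        pos = (a / b) * np.exp(-x / b)
        neg = (c / d) * np.exp(x / d)
        return np.where(x >= 0, pos, neg)

Under this assumed form the function is continuous at zero (both branches evaluate to 0), its output stays within (-c, a), and the one-sided gradients at zero are a/b and c/d, which is one way the "scalable limits" and gradient scaling mentioned in the abstract can be realized.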