Hexpo: A vanishing-proof activation function

2017 
This article proposes “Hexpo”, an activation function that can scale the gradient and hence mitigate the vanishing gradient problem. Unlike the rectified linear unit family, which applies an identity mapping to positive inputs, Hexpo has scalable limits on both positive and negative values. Through parametrization, both the active domain of Hexpo and the output range it maps to are flexible. It can therefore alleviate the vanishing gradient problem from both the gradient-flow and the local-gradient perspective, while preserving upper and lower bounds on the output. Parametrization also allows Hexpo to produce outputs close to zero. In experiments on the MNIST handwritten digit recognition dataset, Hexpo outperforms the rectified linear unit family (the rectified linear unit and the exponential linear unit) in both accuracy and learning speed. In experiments on the CIFAR-10 tiny image recognition dataset with convolutional layers, Hexpo outperforms the rectified linear unit and performs similarly to the exponential linear unit.
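
As a rough sketch of the behavior the abstract describes, the code below implements a Hexpo-style activation in NumPy: a piecewise exponential that saturates toward a parameter-controlled bound on each side of zero, so the parameters also scale the local gradient. The specific parameterization (names a, b, c, d and the form -a(e^{-x/b} - 1) for x >= 0, c(e^{x/d} - 1) for x < 0) is an assumption made for illustration, not a quotation from the paper.

    import numpy as np

    def hexpo(x, a=1.0, b=1.0, c=1.0, d=1.0):
        # Assumed Hexpo-style form: bounded exponentials on both sides.
        x = np.asarray(x, dtype=float)
        pos = -a * (np.exp(-x / b) - 1.0)   # saturates toward +a as x -> +inf
        neg = c * (np.exp(x / d) - 1.0)     # saturates toward -c as x -> -inf
        return np.where(x >= 0, pos, neg)

    def hexpo_grad(x, a=1.0, b=1.0, c=1.0, d=1.0):
        # Derivative of the sketch above; a/b and c/d scale the local gradient,
        # which is how the parameters can counteract vanishing gradients.
        x = np.asarray(x, dtype=float)
        pos = (a / b) * np.exp(-x / b)
        neg = (c / d) * np.exp(x / d)
        return np.where(x >= 0, pos, neg)

Under this assumed form the function is continuous at zero (both branches evaluate to 0), its output stays within (-c, a), and the one-sided gradients at zero are a/b and c/d, which is one way the "scalable limits" and gradient scaling mentioned in the abstract can be realized.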