Redundant feature pruning for accelerated inference in deep neural networks

2019 
Abstract This paper presents an efficient technique to reduce the inference cost of deep and/or wide convolutional neural network models by pruning redundant features (or filters). Previous studies have shown that over-sized deep neural network models tend to produce many redundant features that are either shifted versions of one another or are very similar and show little or no variation, resulting in filtering redundancy. We propose to prune these redundant features, along with their related feature maps, according to their relative cosine distances in the feature space, leading to smaller networks with reduced post-training inference costs and competitive performance. We show empirically on selected models (VGG-16, ResNet-56, ResNet-110, and ResNet-34) and datasets (MNIST handwritten digits, CIFAR-10, and ImageNet) that inference costs (in FLOPs) can be significantly reduced while overall performance remains competitive with the state of the art.
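The idea of pruning filters by relative cosine distance can be illustrated with a minimal sketch. It is not the paper's implementation: the greedy grouping rule, the function names `redundant_filter_indices` and `prune_layer`, and the `threshold` value are all assumptions made for illustration, and the abstract does not say how similar filters are grouped or which cutoff is used.

```python
import numpy as np


def redundant_filter_indices(weights, threshold=0.3):
    """Split the filters of one conv layer into kept and redundant indices.

    weights:   array of shape (n_filters, in_channels, k_h, k_w).
    threshold: hypothetical cosine-distance cutoff (an assumption, not a
               value from the paper). A filter closer than this to an
               already-kept filter is treated as redundant.
    """
    # Flatten each filter into a vector and scale it to unit length.
    flat = weights.reshape(weights.shape[0], -1).astype(np.float64)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)

    # Pairwise cosine distance: 1 - cosine similarity.
    cos_dist = 1.0 - unit @ unit.T

    # Greedy grouping (an assumed stand-in for the paper's grouping step):
    # keep a filter only if it is far enough from every filter kept so far.
    keep, prune = [], []
    for i in range(flat.shape[0]):
        if any(cos_dist[i, j] < threshold for j in keep):
            prune.append(i)
        else:
            keep.append(i)
    return keep, prune


def prune_layer(weights, next_weights, threshold=0.3):
    """Drop redundant filters and the matching input channels downstream.

    next_weights: weights of the following conv layer, with shape
                  (n_next_filters, n_filters, k_h, k_w).
    """
    keep, _ = redundant_filter_indices(weights, threshold)
    # Removing a filter removes its feature map, so the next layer
    # loses the corresponding input channel as well.
    return weights[keep], next_weights[:, keep]
```

Calling `prune_layer` on a layer with three filters, one of which is a scaled copy of another, would leave two filters and two input channels in the following layer. In practice the pruned network would normally be fine-tuned afterwards to recover accuracy.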