Accelerating Winograd Convolutions using Symbolic Computation and Meta-programming

2020 
Convolution operations are essential constituents of convolutional neural networks. Their efficient and performance-portable implementation demands tremendous programming effort and fine-tuning. Winograd's minimal filtering algorithm is a well-known method to reduce the computational complexity of convolution operations. Unfortunately, existing implementations of this algorithm are either vendor-specific or hard-coded to support a small subset of convolutions, thus limiting their versatility and performance portability. In this paper, we propose a novel method to optimize Winograd convolutions based on symbolic computation. Taking advantage meta-programming and auto-tuning, we further introduce a system to automate the generation of efficient and portable Winograd convolution code for various GPUs. We show that our optimization technique can effectively exploit repetitive patterns, enabling us to reduce the number of arithmetic operations by up to 62% without compromising numerical stability. Moreover, we demonstrate in experiments that we can generate efficient kernels with runtimes close to deep-learning libraries, requiring only a minimum of programming effort, which confirms the performance portability of our approach.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    33
    References
    1
    Citations
    NaN
    KQI
    []