Approximate LSTM Computing for Energy-Efficient Speech Recognition

2020 
This paper presents an approximate computing method for long short-term memory (LSTM) operations, targeting energy-efficient end-to-end speech recognition. We introduce a similarity score that measures how similar the inputs of two adjacent LSTM cells are. LSTM operations whose inputs are highly similar are disabled, and the prior results are transferred directly, reducing the computational cost of speech recognition. We additionally define a pseudo-LSTM operation that performs the approximate computation at reduced processing resolution, which further relaxes the processing overhead without degrading accuracy. To verify the proposed idea, we design an approximate LSTM accelerator in a 65 nm CMOS process. The accelerator uses a number of approximate processing elements (PEs) to support the proposed skipped-LSTM and pseudo-LSTM operations without degrading energy efficiency. Moreover, sparsity-aware scheduling is introduced through a small on-chip SRAM buffer. As a result, the proposed work provides an energy-efficient yet accurate speech recognition system that consumes 2.19 times less energy than the baseline architecture.
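The skip / pseudo / full decision described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual definition of the similarity score or any threshold values, so everything below is an assumption made for illustration: the score is taken to be one minus a normalized L1 distance between adjacent input vectors, and `similarity_score`, `schedule`, `skip_thr`, and `pseudo_thr` are hypothetical names and values, not the paper's.

```python
def similarity_score(x_prev, x_curr):
    """Hypothetical similarity score in [0, 1] (NOT the paper's definition).

    Computed as 1 minus the L1 distance between the two input vectors,
    normalized by the sum of their L1 norms. Identical inputs give 1.0.
    """
    diff = sum(abs(a - b) for a, b in zip(x_prev, x_curr))
    scale = sum(abs(a) + abs(b) for a, b in zip(x_prev, x_curr))
    if scale == 0:
        return 1.0  # both vectors are all zeros, so treat them as identical
    return 1.0 - diff / scale


def schedule(inputs, skip_thr=0.95, pseudo_thr=0.80):
    """Pick an operating mode for each time step (thresholds are assumed).

    "skip"   : inputs are highly similar, reuse the prior LSTM result
    "pseudo" : moderately similar, compute at reduced resolution
    "full"   : dissimilar, run the normal LSTM operation
    """
    if not inputs:
        return []
    modes = ["full"]  # the first step has no predecessor to compare against
    for prev, curr in zip(inputs, inputs[1:]):
        s = similarity_score(prev, curr)
        if s >= skip_thr:
            modes.append("skip")
        elif s >= pseudo_thr:
            modes.append("pseudo")
        else:
            modes.append("full")
    return modes


# Four toy input frames: a repeat, a small change, then a large change.
frames = [[1.0, 2.0], [1.0, 2.0], [1.0, 1.5], [-1.0, 0.0]]
print(schedule(frames))
```

In this sketch the fraction of steps marked "skip" or "pseudo" is what would translate into saved computation; how the real accelerator maps those modes onto its approximate PEs is not specified in the abstract.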