Iterative Compression of End-to-End ASR Model using AutoML

Abhinav Mehrotra, Lukasz Dudziak, Jinsu Yeo, Young-yoon Lee, Ravichander Vipperla, Mohamed S. Abdelfattah, Sourav Bhattacharya, Samin Ishtiaq, Alberto Gil C. P. Ramos, SangJeong Lee, Dae-Hyun Kim, Nicholas D. Lane

2020

Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interest in developing automatic model compression techniques. Past research has shown that an AutoML-based Low Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7x, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques only work up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5x compression without degrading the WER, thereby advancing the state-of-the-art in ASR compression.
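
To make the rank-selection trade-off concrete, the following minimal sketch shows the parameter-count arithmetic behind low rank factorization of a single weight matrix: an m x n matrix is replaced by the product of an m x r and an r x n matrix. It is illustrative only and is not the search procedure from the paper; the function names and the 1024 x 1024 layer size are assumptions, and the ratio is per layer rather than for a whole model.

```python
from typing import Tuple


def lrf_param_counts(m: int, n: int, rank: int) -> Tuple[int, int]:
    """Return (original, factorized) parameter counts for an m x n weight
    matrix replaced by an (m x rank) matrix times a (rank x n) matrix."""
    if not 1 <= rank <= min(m, n):
        raise ValueError("rank must be between 1 and min(m, n)")
    return m * n, rank * (m + n)


def compression_ratio(m: int, n: int, rank: int) -> float:
    """Per-layer compression ratio: original parameters / factorized parameters."""
    original, factorized = lrf_param_counts(m, n, rank)
    return original / factorized


def max_rank_for_ratio(m: int, n: int, target_ratio: float) -> int:
    """Largest rank whose factorization compresses the layer by at least
    target_ratio, clamped to the valid range [1, min(m, n)]."""
    rank = int((m * n) / (target_ratio * (m + n)))
    return max(1, min(rank, min(m, n)))


if __name__ == "__main__":
    # Hypothetical 1024 x 1024 layer, chosen only for illustration.
    m, n = 1024, 1024
    r = max_rank_for_ratio(m, n, target_ratio=5.0)
    print(r, round(compression_ratio(m, n, r), 2))
```

A lower rank gives more compression but a coarser approximation of the original weights, which is why the choice of rank per layer affects WER and is the quantity a search has to select.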

Keywords:

Compression (physics)
Rank factorization
End-to-end principle
model compression
Computer science
Iterative compression
Speedup
Speech recognition
