Hybrid-Regressive Neural Machine Translation

2021 
Although non-autoregressive translation models based on iterative refinement have achieved performance comparable to their autoregressive counterparts with faster decoding, we empirically find that such aggressive iteration makes the acceleration depend heavily on a small batch size (e.g., 1) and the computing device (e.g., GPU). Through synthetic experiments, we show that the number of iterations can be significantly reduced when a good (partial) target context is provided. Inspired by this, we propose a two-stage translation prototype, Hybrid-Regressive Translation (HRT). HRT first autoregressively generates a discontinuous sequence, predicting every k-th token (k > 1). Then, with the help of this partially deterministic target context, HRT fills in all previously skipped tokens in a single non-autoregressive iteration. Experimental results on WMT'16 En-Ro and WMT'14 En-De show that our model outperforms state-of-the-art non-autoregressive models that use multiple iterations, and even autoregressive models. Moreover, compared with autoregressive models, HRT achieves a consistent 1.5x speedup regardless of batch size and device.
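To make the two-stage procedure concrete, below is a minimal, hypothetical sketch of HRT-style decoding in PyTorch. The model interfaces (`skip_ar_model`, `nar_model`), the special token ids, and greedy argmax decoding are all assumptions for illustration, not the authors' actual implementation.

```python
import torch

def hrt_decode(skip_ar_model, nar_model, src, k=2, max_len=64,
               bos_id=0, eos_id=2, mask_id=3):
    # Stage 1: "skip" autoregression -- autoregressively predict the
    # discontinuous subsequence of every k-th target token, conditioning
    # each step on the sparse tokens generated so far.
    # (skip_ar_model is a placeholder returning logits [1, t, vocab].)
    skeleton = [bos_id]
    while len(skeleton) * k < max_len:
        prev = torch.tensor([skeleton])
        next_tok = skip_ar_model(src, prev).argmax(-1)[0, -1].item()
        skeleton.append(next_tok)
        if next_tok == eos_id:
            break

    # Stage 2: scatter the skeleton into a full-length template so that
    # the AR tokens sit at every k-th position, with mask tokens in the
    # k-1 skipped slots between them.
    template = [skeleton[0]]                     # BOS
    for tok in skeleton[1:]:
        template.extend([mask_id] * (k - 1))     # skipped positions
        template.append(tok)                     # every k-th token
    template = torch.tensor([template])

    # Fill all masked positions with one non-autoregressive forward pass.
    # (nar_model is a placeholder returning logits [1, L, vocab].)
    filled = nar_model(src, template).argmax(-1)

    # Keep the deterministic skeleton tokens; take NAR predictions only
    # at the masked (skipped) positions.
    out = torch.where(template == mask_id, filled, template)
    return out[0].tolist()
```

Because stage 1 runs only 1/k as many sequential steps as full autoregression and stage 2 is a single parallel pass, the sequential cost is roughly cut by a factor of k, which is consistent with the device- and batch-size-independent speedup reported above.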