Improving Neural Machine Translation Model with Deep Encoding Information

2021 
The availability of massive computational power, together with advances in deep neural network (DNN) technology, has enabled rapid progress in machine translation. The strong representational capacity of DNNs allows neural machine translation (NMT) to exploit large-scale bilingual parallel corpora and this computing power to produce highly effective translation models. Nevertheless, existing NMT models typically use only the output of the top encoder layer, while the information captured by the deeper encoding layers is ignored, which significantly constrains translation performance. To address this issue, we propose a novel NMT model that fully exploits the deep encoding information. The core idea is to aggregate the information from different encoder layers in different ways, for which we design three aggregation strategies: parallel-layer, multi-layer, and dynamic-layer encoding information aggregation. The three corresponding translation models are trained and compared with a baseline Transformer on a Chinese-to-English translation task. The experimental results show that the proposed model improves the BLEU-4 score by 0.89 over the baseline, demonstrating the effectiveness of the method.
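To make the core idea concrete, the following is a minimal sketch of one possible dynamic-layer aggregation, not the paper's actual implementation: all encoder layer outputs are combined with learned, softmax-normalized weights before being passed to the decoder. It assumes a PyTorch-style Transformer encoder with six layers; the names `DynamicLayerAggregation` and `layer_logits` are hypothetical.

```python
import torch
import torch.nn as nn

class DynamicLayerAggregation(nn.Module):
    """Hypothetical sketch: combine the outputs of all encoder layers
    with learned, softmax-normalized scalar weights (one per layer),
    instead of feeding only the top layer to the decoder."""

    def __init__(self, num_layers: int):
        super().__init__()
        # One learnable scalar logit per encoder layer.
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, layer_outputs):
        # layer_outputs: list of tensors, each (batch, seq_len, d_model)
        stacked = torch.stack(layer_outputs, dim=0)        # (L, B, T, D)
        weights = torch.softmax(self.layer_logits, dim=0)  # (L,)
        # Weighted sum over the layer dimension.
        return (weights.view(-1, 1, 1, 1) * stacked).sum(dim=0)

# Usage: feed the states of all six encoder layers, not only the top one.
agg = DynamicLayerAggregation(num_layers=6)
layers = [torch.randn(2, 10, 512) for _ in range(6)]
memory = agg(layers)   # (2, 10, 512), passed to the decoder as usual
```

Because the layer weights are trained jointly with the rest of the network, the model can learn how much each encoding depth should contribute, rather than discarding everything below the top layer.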