Improving Adversarial Neural Machine Translation for Morphologically Rich Language

2020 
Generative adversarial networks (GANs) have achieved great success in natural language processing (NLP) and neural machine translation (NMT). However, the existing discriminator in GANs for NMT only combines two words as one query to train the translation models, which limits how informative the discriminator can be and fails to exploit rich monolingual information. Moreover, recent studies consider only a single reference translation during model training, which limits the GAN model's ability to learn sufficient information about the representation of the source sentence. These problems are even worse when the languages are morphologically rich. In this article, an extended GAN model for neural machine translation is proposed to improve translation performance for morphologically rich languages. In particular, we use morphological word embeddings instead of word embeddings as input to the GAN model, to enrich the representation of words and overcome the data sparsity problem during model training. In addition, multiple references are integrated into the discriminator so that the model considers more context information and adapts to the diversity of different languages. Experimental results on German $\leftrightarrow$ English, French $\leftrightarrow$ English, Czech $\leftrightarrow$ English, Finnish $\leftrightarrow$ English, Turkish $\leftrightarrow$ English, Chinese $\leftrightarrow$ English, Finnish $\leftrightarrow$ Turkish and Turkish $\leftrightarrow$ Czech translation tasks demonstrate that our method achieves significant improvements over baseline systems.