Improved Spoken Uyghur Segmentation for Neural Machine Translation

Chenggang Mi, Yating Yang, Xi Zhou, Lei Wang, Tonghai Jiang

2018

To increase vocabulary overlap in spoken Uyghur neural machine translation (NMT), we propose a novel method that enhances the commonly used subword-unit-based segmentation method. In particular, we use a log-linear model as the main framework and integrate several features into it: subword units, morphological information, bilingual word alignment, and a monolingual language model. Experimental results show that segmenting spoken Uyghur with the proposed method significantly improves spoken Uyghur-Chinese NMT performance, yielding gains of up to 1.52 BLEU.
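
As a rough illustration of how a log-linear model can rank candidate segmentations, the sketch below scores each candidate as a weighted sum of feature values and picks the highest-scoring one. It is not the authors' implementation: the feature functions, the toy vocabulary and suffix list, the weights, and the example word are all hypothetical, and only two of the four feature types named in the abstract are stubbed in. An alignment or language-model feature would plug in the same way, as another entry in the feature dictionary.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Segmentation = Tuple[str, ...]
Feature = Callable[[str, Segmentation], float]

# Hypothetical toy resources, for illustration only.
TOY_SUBWORD_VOCAB = {"kitab", "lar", "kit"}
TOY_SUFFIXES = {"lar"}


def subword_feature(word: str, seg: Segmentation) -> float:
    """Fraction of pieces that appear in the toy subword vocabulary."""
    return sum(piece in TOY_SUBWORD_VOCAB for piece in seg) / len(seg)


def morph_feature(word: str, seg: Segmentation) -> float:
    """1.0 if the word is split and its last piece is a known toy suffix."""
    return 1.0 if len(seg) > 1 and seg[-1] in TOY_SUFFIXES else 0.0


def log_linear_score(word: str, seg: Segmentation,
                     features: Dict[str, Feature],
                     weights: Dict[str, float]) -> float:
    """Unnormalised log-linear score: sum_i weight_i * h_i(word, seg)."""
    return sum(weights[name] * f(word, seg) for name, f in features.items())


def posterior(word: str, candidates: Sequence[Segmentation],
              features: Dict[str, Feature],
              weights: Dict[str, float]) -> List[float]:
    """Softmax over candidate scores (max subtracted for stability)."""
    scores = [log_linear_score(word, c, features, weights) for c in candidates]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def best_segmentation(word: str, candidates: Sequence[Segmentation],
                      features: Dict[str, Feature],
                      weights: Dict[str, float]) -> Segmentation:
    """Return the candidate with the highest log-linear score."""
    return max(candidates,
               key=lambda c: log_linear_score(word, c, features, weights))


# Assumed example inputs; the weights would normally be tuned on held-out data.
WORD = "kitablar"
CANDIDATES: List[Segmentation] = [("kitablar",), ("kitab", "lar"), ("kit", "ablar")]
FEATURES: Dict[str, Feature] = {"subword": subword_feature, "morph": morph_feature}
WEIGHTS: Dict[str, float] = {"subword": 1.0, "morph": 1.0}

if __name__ == "__main__":
    print(best_segmentation(WORD, CANDIDATES, FEATURES, WEIGHTS))
    print(posterior(WORD, CANDIDATES, FEATURES, WEIGHTS))
```

Because the model is a weighted sum over arbitrary feature functions, new evidence sources can be added without changing the scoring code; only the feature dictionary and the weights change.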

Keywords:

Artificial intelligence
Machine translation
BLEU
Task analysis
Segmentation
Pattern recognition
Feature extraction
Computer science
Language model
Vocabulary
Machine learning
Natural language processing
