Segmentation and Alignment of Chinese and Khmer Bilingual Names Based on Hierarchical Dirichlet Process
2018
Transliteration is an important foundation of cross-language natural language processing. This paper proposes a method for Chinese-Khmer name transliteration based on the hierarchical Dirichlet process. The hierarchical Dirichlet process model allows many-to-many alignment, addresses overfitting, accounts for the influence of the previous syllable alignment on the alignment of the next syllable, and effectively mines latent information in natural language. The paper first builds a Chinese-Khmer name alignment model based on the hierarchical Dirichlet process, then uses the aligned Chinese and Khmer syllables as a training corpus to train a Chinese-Khmer transliteration model with Moses, and finally evaluates the alignment model through the performance of the resulting transliteration model. The results show that first aligning Chinese and Khmer names with the hierarchical Dirichlet process alignment model and then transliterating them yields better performance.