Learning Bilingual Lexicon for Low-Resource Language Pairs

ShaoLin Zhu, Xiao Li, Yating Yang, Lei Wang, Chenggang Mi

2017

Learning a bilingual lexicon from monolingual data is a novel idea in natural language processing that can benefit many low-resource language pairs. In this paper, we present an approach for obtaining a bilingual lexicon from monolingual data. Our method requires only a small seed bilingual lexicon: we use Canonical Correlation Analysis (CCA) to construct a shared latent space in which the two monolingual embedding spaces are linked. Experimental results show that a bilingual lexicon of considerable size and precision can be learned from Chinese-Uyghur and Chinese-Kazakh monolingual data.
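The idea of linking two monolingual embedding spaces through CCA can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fit_cca` and `translate`, the regularization constant `reg`, and the use of cosine nearest-neighbor retrieval in the shared space are all assumptions made for illustration. The sketch assumes `X_seed` and `Y_seed` hold the embeddings of the seed lexicon pairs, row-aligned so that row i of each matrix is a translation pair.

```python
import numpy as np


def _inv_sqrt(C):
    # Inverse matrix square root of a symmetric positive-definite matrix.
    w, V = np.linalg.eigh(C)
    return V @ np.diag(w ** -0.5) @ V.T


def fit_cca(X_seed, Y_seed, k, reg=1e-6):
    """Fit CCA on row-aligned seed embeddings; return means and projections."""
    n = X_seed.shape[0]
    mx, my = X_seed.mean(axis=0), Y_seed.mean(axis=0)
    Xc, Yc = X_seed - mx, Y_seed - my
    # Covariance estimates; a small ridge term (an assumption) keeps them invertible.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(Xc.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Yc.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    Wx, Wy = _inv_sqrt(Cxx), _inv_sqrt(Cyy)
    # Singular vectors of the whitened cross-covariance give the canonical directions.
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    A = Wx @ U[:, :k]      # source-side projection
    B = Wy @ Vt.T[:, :k]   # target-side projection
    return mx, my, A, B


def translate(model, X_src, Y_tgt):
    """For each source vector, return the index of the closest target vector."""
    mx, my, A, B = model
    P = (X_src - mx) @ A
    Q = (Y_tgt - my) @ B
    # Cosine similarity in the shared latent space.
    P = P / np.linalg.norm(P, axis=1, keepdims=True)
    Q = Q / np.linalg.norm(Q, axis=1, keepdims=True)
    return np.argmax(P @ Q.T, axis=1)
```

In use, `fit_cca` would be called once on the seed pairs, and `translate` would then be applied to the embeddings of words outside the seed lexicon to propose new lexicon entries. A confidence threshold on the similarity score is a natural extension for trading lexicon size against precision.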

Keywords:

Natural language processing
Bilingual lexicon
Natural language
Artificial intelligence
Computer science
Low resource
Canonical correlation
