Fully unsupervised word translation from cross-lingual word embeddings especially for healthcare professionals

2021 
Unsupervised word-to-word translation without parallel corpora has attracted much research interest in recent years. Recent techniques trained with adversarial learning have achieved high accuracy, but they suffer from the typical drawbacks of generative adversarial models: sensitivity to hyper-parameters, long training time, and lack of interpretability. In this paper, we propose a method for generating cross-lingual word embeddings for English and the morphologically rich Hindi language, aimed especially at healthcare professionals, because it can help remove the communication barrier with patients regardless of their language. The method requires no document-aligned or sentence-aligned corpus and no bilingual dictionary, because it is fully unsupervised. We rely on the intra-lingual similarity distribution assumption: the embeddings of the two languages are isometric, so the similarity distributions of the most common terms are identical across the language pair. Performance is analyzed using different word retrieval methods and compared for English-Hindi cross-lingual word embeddings trained in both a fully unsupervised way and a semi-supervised way, the latter by supplying a seed dictionary. We also provide a comparative analysis of the results of adversarial training and the robust self-learning method for English and Hindi.
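The intra-lingual similarity distribution assumption can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy data, and the plain nearest-neighbour matching are all assumptions made for illustration. The idea shown is that, if two embedding spaces are isometric, each word's sorted list of similarities to the other words in its own language is the same in both spaces, so words can be paired without any dictionary.

```python
import numpy as np


def normalize_rows(matrix):
    """Scale every row to unit Euclidean length."""
    return matrix / np.linalg.norm(matrix, axis=1, keepdims=True)


def sorted_similarity_profile(embeddings):
    """Return, for each word, its sorted cosine similarities to all words
    in the same language. Sorting removes the dependence on word order."""
    unit = normalize_rows(embeddings)
    similarities = unit @ unit.T
    return np.sort(similarities, axis=1)


def induce_seed_dictionary(src_embeddings, tgt_embeddings):
    """Pair each source word with the target word whose similarity
    profile is closest (hypothetical helper, nearest-neighbour matching)."""
    src_profile = normalize_rows(sorted_similarity_profile(src_embeddings))
    tgt_profile = normalize_rows(sorted_similarity_profile(tgt_embeddings))
    scores = src_profile @ tgt_profile.T
    return scores.argmax(axis=1)


# Toy data (assumption): the "target" space is an exact rotation of the
# "source" space with the vocabulary order shuffled, i.e. perfectly isometric.
rng = np.random.default_rng(0)
n_words, dim = 50, 20
src = rng.normal(size=(n_words, dim))
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
perm = rng.permutation(n_words)
tgt = (src @ rotation)[perm]  # tgt[j] is the rotated vector of src[perm[j]]

matches = induce_seed_dictionary(src, tgt)
expected = np.argsort(perm)  # index in tgt of each source word
print("all pairs recovered:", bool(np.array_equal(matches, expected)))
```

Real embeddings are only approximately isometric, so a pairing obtained this way would normally serve as a noisy starting dictionary that a self-learning procedure then refines, rather than as a final translation table.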