Learning Multilingual Topics with Neural Variational Inference

2020 
Multilingual topic models are among the most popular methods for revealing the common latent semantics of cross-lingual documents. However, the traditional approximation methods adopted by existing probabilistic models do not always yield high-quality multilingual topics. Moreover, as the generative processes of these models become more expressive, fast and accurate inference over their parameters grows increasingly difficult. In this paper, we address these issues by proposing a new multilingual topic model that can be trained by backpropagation within the framework of neural variational inference. Topic distributions are inferred via a shared inference network that captures common word semantics, while an incorporating module integrates the topic-word distribution from the other language through a novel transformation method. The networks of the cross-lingual corpora are thereby coupled together. By jointly training the coupled networks, our model infers more interpretable multilingual topics and more discriminative topic distributions. Experimental results on real-world datasets show the superiority of our model in terms of both topic quality and text classification performance.
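The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea it describes: a neural variational topic model in which two languages share one inference network, each language has its own topic-word matrix, and an "incorporating" term couples the two topic-word distributions through a cross-lingual transformation. It assumes a ProdLDA/NVDM-style Gaussian latent with the reparameterization trick; all names (CoupledNVTM, trans_cn2en, the dictionary-based translation matrix) are hypothetical, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoupledNVTM(nn.Module):
    """Sketch: two bag-of-words decoders (one per language) coupled
    through a shared inference network over a common hidden space."""
    def __init__(self, vocab_en, vocab_cn, n_topics=50, hidden=256):
        super().__init__()
        # Language-specific projections into a shared hidden space.
        self.proj = nn.ModuleDict({
            "en": nn.Linear(vocab_en, hidden),
            "cn": nn.Linear(vocab_cn, hidden),
        })
        # Shared inference network: captures common word semantics.
        self.mu = nn.Linear(hidden, n_topics)
        self.logvar = nn.Linear(hidden, n_topics)
        # Per-language topic-word matrices (decoder weights).
        self.beta = nn.ParameterDict({
            "en": nn.Parameter(0.01 * torch.randn(n_topics, vocab_en)),
            "cn": nn.Parameter(0.01 * torch.randn(n_topics, vocab_cn)),
        })

    def forward(self, bow, lang):
        h = F.softplus(self.proj[lang](bow))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: makes sampling differentiable,
        # so the whole model trains by backpropagation.
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        theta = F.softmax(z, dim=-1)  # document-topic proportions
        log_probs = F.log_softmax(theta @ self.beta[lang], dim=-1)
        recon = -(bow * log_probs).sum(-1)  # negative log-likelihood
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1)
        return (recon + kl).mean()  # ELBO loss for one language

def alignment_loss(model, trans_cn2en):
    """Incorporating module, sketched as a regularizer: map the Chinese
    topic-word distribution into the English vocabulary through a
    row-stochastic translation matrix trans_cn2en (vocab_cn x vocab_en,
    e.g. built from a bilingual dictionary) and pull the two
    topic-word distributions together."""
    beta_en = F.softmax(model.beta["en"], dim=-1)
    beta_cn = F.softmax(model.beta["cn"], dim=-1)
    mapped = beta_cn @ trans_cn2en  # topics over the English vocabulary
    return F.mse_loss(beta_en, mapped)
```

A joint training step would then sum the two per-language ELBO losses with the alignment term, e.g. `loss = model(bows_en, "en") + model(bows_cn, "cn") + alignment_loss(model, trans_cn2en)`, where `bows_*` are batches of word-count vectors.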