Momentum online LDA for large-scale datasets

2014 
Modeling large-scale document collections is an important direction in machine learning research. Online LDA uses stochastic gradient optimization to speed up convergence; however, the high noise of stochastic gradients leads to slower convergence and worse performance. In this paper, we employ a momentum term to smooth out the noise of the stochastic gradients and propose an extension of Online LDA, namely Momentum Online LDA (MOLDA). We collect a large-scale corpus consisting of 2M documents to evaluate our model. Experimental results indicate that MOLDA achieves faster convergence and better performance than the state of the art.
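To make the idea concrete, below is a minimal sketch of a momentum-smoothed stochastic update of the kind the abstract describes, applied to the topic-word variational parameter lambda of online LDA. The momentum coefficient mu, the Robbins-Monro step-size schedule, and all names here are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def momentum_online_lda_step(lam, lam_hat, velocity, t,
                             mu=0.9, tau0=1.0, kappa=0.7):
    """One minibatch update of lambda with a momentum term (hypothetical sketch).

    lam      -- current topic-word variational parameters, shape (K, V)
    lam_hat  -- noisy per-minibatch estimate of lambda, as in online LDA
    velocity -- running momentum buffer, same shape as lam
    t        -- update counter, used for the decaying step size
    """
    rho_t = (tau0 + t) ** (-kappa)            # Robbins-Monro learning rate
    grad = lam_hat - lam                      # stochastic natural-gradient direction
    velocity = mu * velocity + rho_t * grad   # momentum averages out gradient noise
    lam = lam + velocity                      # take the smoothed step
    return lam, velocity

# Toy usage: K=2 topics, V=5 vocabulary terms, one simulated minibatch.
K, V = 2, 5
lam = np.ones((K, V))
velocity = np.zeros((K, V))
lam_hat = lam + np.random.randn(K, V) * 0.1   # stand-in for a minibatch estimate
lam, velocity = momentum_online_lda_step(lam, lam_hat, velocity, t=1)
```

Accumulating past gradient directions in `velocity` is what damps the minibatch noise: directions that fluctuate from batch to batch cancel, while the consistent component is reinforced, which is the mechanism behind the faster convergence the abstract claims.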