Topic change point detection using a mixed Bayesian model

2021 
Dynamic text documents, including news articles, user reviews, and blogs, are now commonly encountered in many fields. Accordingly, the topics underlying text streams also change over time. To grasp the topic changes in the increasing accumulation of text documents, there is a great need to develop automatic text analysis models to find the key changes in topics. To this end, this study proposes a topic change point detection (Topic-CD) model. Different from previous studies, we define the change point of topics from the perspective of hyperparameters associated with topic-word distributions. This allows the model to detect change points underlying the whole topic set. Under this definition, the topic modeling and change point detection are combined in a unified framework and then performed simultaneously using a Markov chain Monte Carlo algorithm. In addition, the Topic-CD model is free from setting the number of change points in advance, which makes it more convenient for practical use. We investigate the performance of the Topic-CD model numerically using synthetic data and three real datasets. The results show that the Topic-CD model can well identify the change points in topics when compared with several state-of-the-art methods.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    41
    References
    0
    Citations
    NaN
    KQI
    []