A novel sentence embedding based topic detection method for micro-blog

2020 
Topic detection is a difficult challenging task, especially when the exact number of topics is unknown. In this article, we present a novel topic detection approach based on neural computing to detect topics in a microblogging dataset. We use an unsupervised neural sentence embedding model to map blogs to an embedding space. The proposed model is a weighted power mean sentence embedding model in which weights are calculated by a targeted attention mechanism. The experimental results show that our embedding model performs better than baseline in sentence clustering. In addition, we propose a clustering algorithm, referred to as Relationship-Aware DBSCAN (RADBSCAN), to discover topics from a microblogging dataset in which the number of topics is automatically determined by the characteristics of the dataset. Moreover, to provide parameter insensibility, we use the forwarding relationship in the blogs as a bridge of two independent clusters. Finally, we validate the proposed method on a dataset from the Sina microblog. The results show that our approach can detect all topics successfully and can extract the keywords of each topic.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    56
    References
    1
    Citations
    NaN
    KQI
    []