Topic-Level Clustering on Web Resources

2016 
The rapid development of the Internet, social media, and news portals has made a large amount of information available on a wide range of subjects. Given such abundant resources, effective clustering approaches are valuable. However, traditional clustering models perform poorly on web resources because of the high dimensionality of the data. In this paper, we propose a clustering model based on a topic model and density peaks. Our model combines the biterm topic model with clustering by fast search of density peaks: it first extracts a set of features from the co-occurrence of two words in the original documents, and then performs clustering analysis on the resulting topical features. Web resources are thereby translated from raw data into clusters, and an evaluation of the clustering results of the center part verifies the effectiveness of the proposed method.
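The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `extract_biterms` and `density_peaks`, the cutoff distance `d_c`, and the fixed `n_clusters` are assumptions made here for illustration, and the topic inference step of the biterm topic model is omitted entirely. The input `points` stands in for per-document topic vectors that such a model would produce.

```python
from itertools import combinations
from math import dist


def extract_biterms(tokens):
    """Return unordered pairs of distinct words co-occurring in one short document."""
    return [tuple(sorted(p)) for p in combinations(tokens, 2) if p[0] != p[1]]


def density_peaks(points, d_c, n_clusters):
    """Cluster points by density peaks (illustrative sketch, O(n^2) distances).

    points: sequence of equal-length numeric tuples (assumed topic vectors).
    d_c: cutoff distance for the local density count (assumed parameter).
    n_clusters: number of centers to pick (assumed; must be >= 1).
    """
    n = len(points)
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]

    # Local density rho: number of other points closer than the cutoff.
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < d_c) for i in range(n)]

    # Indices sorted by decreasing density (stable, so ties keep index order).
    order = sorted(range(n), key=lambda i: -rho[i])

    # delta: distance to the nearest point ranked earlier (higher density);
    # parent: that nearest point, used later to propagate labels.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(d[i])  # densest point gets the largest distance
            continue
        j = min(order[:rank], key=lambda k: d[i][k])
        delta[i] = d[i][j]
        parent[i] = j

    # Centers: points with the largest rho * delta. Sorting `order` keeps the
    # densest point first among ties, so it is always selected as a center.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:n_clusters]

    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Visit points from dense to sparse so each parent is labeled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

In this sketch every remaining point inherits the label of its nearest neighbor of higher density, which is why the points are visited in order of decreasing density. Choosing `d_c` and the number of centers is left to the caller; the abstract does not say how the paper selects them.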