Text stream clustering based on Squeezer algorithm

2012 
To address the problems of "chain data" and "high-dimensional, multi-topic, large-scale text streams" in data stream clustering, a modified Squeezer clustering algorithm is proposed. It incorporates the idea of projected clustering and redefines the class centroid, radius, and judging distance. A preprocessing stage is introduced to improve performance significantly, and a projected clustering stage attaches semantics to the clusters so that they are easier to interpret. Experiments on an Internet corpus show that the clustering results improve significantly at the cost of a small decrease in speed, and that the proposed algorithm outperforms the original Squeezer algorithm.
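
For orientation, the following is a minimal sketch of the baseline one-pass Squeezer procedure as it is commonly described for categorical records. It is not the modified algorithm proposed here: the abstract does not specify the redefined centroid, radius, or judging distance, so none of them are reproduced. The function name `squeezer`, the `threshold` parameter, and the tuple-based record format are assumptions made for illustration only.

```python
from collections import Counter


def squeezer(records, threshold):
    """One-pass, Squeezer-style clustering of categorical records (sketch).

    records:   iterable of equal-length tuples of hashable attribute values.
    threshold: minimum similarity a record needs to join an existing cluster.
    Returns a list with one cluster label per record, in input order.
    """
    summaries = []  # per cluster: one Counter of value frequencies per attribute
    sizes = []      # per cluster: number of records assigned so far
    labels = []

    for record in records:
        best_idx, best_sim = -1, -1.0
        for idx, summary in enumerate(summaries):
            # Similarity: for each attribute, the fraction of the cluster's
            # records that share this record's value, summed over attributes.
            sim = sum(
                counter[value] / sizes[idx]
                for counter, value in zip(summary, record)
            )
            if sim > best_sim:
                best_idx, best_sim = idx, sim

        if best_idx >= 0 and best_sim >= threshold:
            # Join the most similar cluster and update its summary.
            for counter, value in zip(summaries[best_idx], record):
                counter[value] += 1
            sizes[best_idx] += 1
            labels.append(best_idx)
        else:
            # No cluster is similar enough: start a new one.
            summaries.append([Counter([value]) for value in record])
            sizes.append(1)
            labels.append(len(summaries) - 1)

    return labels


if __name__ == "__main__":
    data = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y"), ("a", "z")]
    print(squeezer(data, threshold=1.0))
```

Because each record is compared only against per-cluster summaries and is never revisited, the procedure makes a single pass over the input, which is the property that makes this family of algorithms a natural starting point for stream settings. A higher `threshold` yields more, tighter clusters.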