A Distributed Framework for Online Stream Data Clustering

2020 
The prevalence of positioning sensors and mobile devices generates massive amounts of spatial-temporal data from moving objects in real time. Clustering is one of the fundamental operations in analyzing such data and supports applications such as event detection and travel pattern extraction. However, most existing work targets the offline scenario and is ill-suited to online, time-sensitive applications because of its low efficiency and its neglect of temporal features. In this paper, we propose a distributed streaming framework for spatial-temporal data clustering that accommodates various clustering algorithms while ensuring low resource consumption and correct results. The framework includes a dynamic partitioning strategy for continuous load balancing and a cluster-merging algorithm based on convex hulls [10], which guarantees the correctness of the merged result. Extensive experiments on real-world data demonstrate the effectiveness of the proposed framework and its advantage over existing solutions.
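
The abstract does not spell out the cluster-merging step, so the following is only a minimal sketch of one way convex hulls could be used to merge clusters produced independently on different partitions. It assumes 2D points, treats two local clusters as one whenever their hulls overlap or touch, and uses hypothetical names (`convex_hull`, `hulls_overlap`, `merge_clusters`) that do not come from the paper. The actual merge criterion of the framework may differ.

```python
from itertools import combinations

def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hulls_overlap(h1, h2):
    """Separating-axis test for two convex hulls; touching counts as overlap."""
    # Coordinate axes plus each edge's normal and direction, so that
    # degenerate hulls (single points, segments) are handled as well.
    axes = [(1, 0), (0, 1)]
    for hull in (h1, h2):
        n = len(hull)
        for i in range(n):
            (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % n]
            dx, dy = x2 - x1, y2 - y1
            if dx or dy:
                axes.append((-dy, dx))  # edge normal
                axes.append((dx, dy))   # edge direction
    for ax, ay in axes:
        p1 = [ax * x + ay * y for x, y in h1]
        p2 = [ax * x + ay * y for x, y in h2]
        if max(p1) < min(p2) or max(p2) < min(p1):
            return False  # found a separating axis
    return True

def merge_clusters(clusters):
    """Union local clusters whose hulls overlap (assumed merge criterion)."""
    hulls = [convex_hull(c) for c in clusters]
    parent = list(range(len(clusters)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(clusters)), 2):
        if hulls_overlap(hulls[i], hulls[j]):
            parent[find(i)] = find(j)

    merged = {}
    for i, c in enumerate(clusters):
        merged.setdefault(find(i), []).extend(c)
    return list(merged.values())
```

In a distributed setting, each worker would only need to ship the hull of a local cluster rather than all of its points, which is one plausible reason for a hull-based merge; the pairwise loop here is quadratic in the number of local clusters and is meant for illustration only.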