Density peaks clustering with gap-based automatic center detection

2020 
Abstract: Clustering is a task used to group data from varied sources, including Big Data, the Internet of Things, and social media. Density peaks clustering (DPC) has become a popular clustering technique for its simplicity and quality. However, DPC requires a proper subset of the input data points to be selected as centers using a plot called the “decision graph”. This manual specification adds subjectivity and instability, and it interrupts the otherwise continuous flow of the algorithm. Automatic center detection approaches struggle to obtain good results without adding parameters and complexity to the algorithm. We propose an approach that automatically determines cluster centers by detecting gaps between data points in a one-dimensional version of the decision graph; we detect these gaps heuristically by comparing the distance (difference) between pairs of consecutive points in terms of their gamma score. We tested our approach on synthetic and UCI data sets. Results, evaluated with F-score and Adjusted Rand Index, show that the number of clusters is predicted accurately in comparison with other state-of-the-art methods.
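The gap idea described in the abstract can be illustrated with a minimal sketch. It assumes the standard DPC quantities: a local density rho (here a Gaussian kernel with a cutoff distance `dc`), the distance delta to the nearest point of higher density, and the gamma score rho * delta. The abstract does not spell out the paper's gap heuristic, so the rule below (cut at the single largest difference between consecutive sorted gamma scores) is only an assumed stand-in, and the function names `dpc_gamma` and `centers_by_largest_gap` are hypothetical.

```python
import math

def dpc_gamma(points, dc):
    """Return the gamma score (rho * delta) of every point.

    rho uses a Gaussian kernel with cutoff distance dc (an assumed choice);
    delta is the distance to the nearest point with strictly higher rho.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(math.exp(-(dist[i][j] / dc) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    gamma = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # By convention, a point with no denser neighbor takes its largest distance.
        delta = min(higher) if higher else max(dist[i])
        gamma.append(rho[i] * delta)
    return gamma

def centers_by_largest_gap(gamma):
    """Sort gamma in descending order (the one-dimensional decision graph)
    and keep every point before the largest consecutive gap.

    Assumed stand-in rule, not the heuristic from the paper.
    """
    order = sorted(range(len(gamma)), key=lambda i: gamma[i], reverse=True)
    gaps = [gamma[order[k]] - gamma[order[k + 1]] for k in range(len(order) - 1)]
    cut = max(range(len(gaps)), key=lambda k: gaps[k])
    return order[:cut + 1]

# Toy data: two well-separated groups, each with one central point (indices 4 and 9).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
centers = centers_by_largest_gap(dpc_gamma(points, dc=1.0))
print(sorted(centers))  # indices of the detected centers
```

On this toy data the two central points have gamma scores far above the rest, so the largest gap falls right after them and the number of clusters is read off as the number of points kept. Real data sets usually call for a more careful gap criterion than a single maximum.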