Adaptive DBSCAN Algorithm Based on Sample Density Gradient

2019 
DBSCAN is a classic and widely used density-based clustering algorithm, but its clustering accuracy depends on the choice of two input parameters: the neighborhood radius (Eps) and the minimum number of points (MinPts). This paper presents a new algorithm for adaptive parameter determination in DBSCAN. The algorithm assumes that regions with a larger sample density gradient usually correspond to the edge areas of clusters. The main idea is to generate some pre-clusters and determine the parameter values from their statistical information. We first randomly pick pairs of points in the sample space to form circles, which are called "disks". We then estimate the density information of each disk by random sampling and define a disk quality criterion to select the disks with larger sample density gradient. Finally, we obtain suitable DBSCAN parameters from the distributions of the radii and point counts of the extracted disks, using Gaussian kernel density estimation. Experimental results show that the new algorithm improves the accuracy of DBSCAN and, in some cases, performs better than classic algorithms such as k-means and BIRCH.
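The pipeline described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not give the disk quality criterion, so the `contrast` score below (inner density versus the density of a surrounding ring) is a hypothetical stand-in. The function names, the `keep_frac` cutoff, the exact point counting used in place of random sampling, and the Silverman bandwidth are likewise assumptions made for illustration. It assumes 2D points and Python 3.8 or later.

```python
import math
import random
import statistics


def kde_mode(values, grid_size=200):
    """Return the grid location where a Gaussian KDE of `values` peaks."""
    sd = statistics.pstdev(values)
    if sd == 0:
        return values[0]
    # Silverman's rule-of-thumb bandwidth (an assumption, not from the paper).
    h = 1.06 * sd * len(values) ** (-1 / 5)
    lo, hi = min(values), max(values)
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]

    def density(x):
        # Unnormalized KDE; the constant factor does not move the peak.
        return sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values)

    return max(grid, key=density)


def estimate_dbscan_params(points, n_disks=300, keep_frac=0.3, seed=0):
    """Suggest (eps, min_pts) from randomly generated disks (hypothetical helper)."""
    rng = random.Random(seed)
    disks = []
    for _ in range(n_disks):
        # A random pair of points defines a disk: the pair is its diameter.
        a, b = rng.sample(points, 2)
        center = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        radius = math.dist(a, b) / 2
        if radius == 0:
            continue
        # Exact counts here; the paper estimates density by random sampling.
        inner = sum(1 for p in points if math.dist(p, center) <= radius)
        outer = sum(1 for p in points if math.dist(p, center) <= 2 * radius)
        # In 2D the ring between r and 2r has three times the disk's area.
        inner_density = inner / radius ** 2
        ring_density = (outer - inner) / (3 * radius ** 2)
        # Hypothetical quality score: relative density drop across the edge.
        contrast = (inner_density - ring_density) / (inner_density + ring_density)
        disks.append((contrast, radius, inner))

    # Keep the disks with the sharpest density drop at their boundary.
    disks.sort(reverse=True)
    kept = disks[: max(2, int(keep_frac * len(disks)))]

    eps = kde_mode([r for _, r, _ in kept])
    min_pts = max(1, round(kde_mode([float(c) for _, _, c in kept])))
    return eps, min_pts
```

The returned pair could then be passed to any DBSCAN implementation as its radius and minimum-point parameters. Whether the values are useful depends heavily on the quality criterion, which is only approximated here.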