A novel clustering algorithm combining niche genetic algorithm with canopy and K-means

2018 
Given a dataset, it is difficult to determine the number of natural clusters, and the clustering result is sensitive to the selection of the initial seeds. This sensitivity can cause many clustering algorithms to converge to local optima. This paper proposes NClust, a novel niching genetic algorithm (NNGA) combined with K-means, which automatically finds a better number of clusters and identifies suitable genes/chromosomes through a novel initial-population approach based on an improved canopy and K-means++. This initialization may accelerate convergence and enhance the global search ability, leading to a more efficient result. Furthermore, adaptive probabilities of crossover and mutation are employed to prevent NClust from converging to a local optimum. To evaluate the performance of the algorithm, existing cluster indices, including SSE, DBI, PBM, and COSEC, are employed as fitness functions. Using real-world data sets, this paper compares the performance of NClust with other GA-based clustering methods (GAK and GenClust). Experimental results indicate that NClust achieves high performance.
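To make the role of the cluster indices concrete, here is a minimal sketch of how two of them, SSE and DBI, could score one candidate solution. It assumes a chromosome is encoded as a list of centroids and that lower values are better for both indices. The function names (`assign`, `sse`, `davies_bouldin`) and the toy data are illustrative assumptions, not the paper's implementation, and the abstract does not say how NClust turns these indices into fitness values. PBM and COSEC are omitted.

```python
import math


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def sse(points, centroids):
    """Sum of squared Euclidean distances from each point to its nearest centroid."""
    labels = assign(points, centroids)
    return sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))


def davies_bouldin(points, centroids):
    """Davies-Bouldin index computed with the chromosome's centroids.

    Assumption: the encoded centroids are used directly instead of
    recomputed cluster means. Needs at least two centroids; returns
    infinity for degenerate candidates (empty cluster, duplicate centroids).
    """
    k = len(centroids)
    if k < 2:
        raise ValueError("DBI needs at least two clusters")
    labels = assign(points, centroids)

    # Average distance of each cluster's members to its centroid.
    scatter = []
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        if not members:
            return math.inf
        scatter.append(sum(math.dist(p, centroids[j]) for p in members) / len(members))

    # For each cluster, take the worst (largest) similarity ratio to any other cluster.
    total = 0.0
    for i in range(k):
        worst = 0.0
        for j in range(k):
            if i == j:
                continue
            separation = math.dist(centroids[i], centroids[j])
            if separation == 0:
                return math.inf
            worst = max(worst, (scatter[i] + scatter[j]) / separation)
        total += worst
    return total / k


# Toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
two_clusters = [(0.0, 1.0), (10.0, 1.0)]
one_cluster = [(5.0, 1.0)]

print(sse(points, two_clusters))             # candidate with K = 2
print(sse(points, one_cluster))              # candidate with K = 1
print(davies_bouldin(points, two_clusters))
```

On this toy data the two-centroid candidate has a much lower SSE than the single-centroid one. SSE alone generally decreases as K grows, which is one reason to pair it with an index such as DBI that also accounts for separation between clusters.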