Method for Determining the Optimal Number of Clusters Based on Agglomerative Hierarchical Clustering
2017
Determining the optimal number of clusters is crucial to clustering quality in cluster analysis. From the standpoint of sample geometry, two concepts are defined, namely the sample clustering dispersion degree and the sample clustering synthesis degree, and a new clustering validity index is designed. A method for determining the optimal number of clusters based on an agglomerative hierarchical clustering (AHC) algorithm is also proposed. The new index and the method can evaluate the clustering results produced by AHC and determine the optimal number of clusters for multiple types of datasets, such as those with linear, manifold, annular, and convex structures. Theoretical analysis and experimental results indicate the validity and good performance of the proposed index and method.
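The general workflow the abstract describes (run AHC, score each candidate partition with a validity index, keep the best-scoring number of clusters) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the abstract does not define the dispersion-degree and synthesis-degree index, so the well-known mean silhouette width is swapped in as a stand-in, and single linkage is chosen arbitrarily. The function names `agglomerate`, `silhouette`, and `best_k` and the toy data are hypothetical.

```python
import math

def agglomerate(points, k):
    """Naive single-linkage AHC: merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return clusters

def silhouette(points, clusters):
    """Mean silhouette width. A stand-in validity index, NOT the paper's index."""
    total = 0.0
    for ci, members in enumerate(clusters):
        for i in members:
            if len(members) == 1:
                continue  # a singleton contributes a silhouette of 0
            # a: mean distance to the other members of the same cluster.
            a = sum(math.dist(points[i], points[j])
                    for j in members if j != i) / (len(members) - 1)
            # b: smallest mean distance to the members of any other cluster.
            b = min(sum(math.dist(points[i], points[j]) for j in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            total += (b - a) / max(a, b)
    return total / len(points)

def best_k(points, k_min=2, k_max=None):
    """Score every candidate k and return the best one with all scores."""
    k_max = k_max or len(points) - 1
    scores = {k: silhouette(points, agglomerate(points, k))
              for k in range(k_min, k_max + 1)}
    return max(scores, key=scores.get), scores

# Toy data (hypothetical): three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
k, scores = best_k(points)
print(k, round(scores[k], 3))
```

Re-running the merge from scratch for every k keeps the sketch short; a practical version would build the merge hierarchy once and cut it at each level. The silhouette index tends to favor convex clusters, which is one reason a paper targeting manifold and annular structures would need a different index.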
Keywords:
- Machine learning
- Single-linkage clustering
- Correlation clustering
- Artificial intelligence
- Hierarchical clustering
- k-medians clustering
- Complete-linkage clustering
- Determining the number of clusters in a data set
- Pattern recognition
- Brown clustering
- CURE data clustering algorithm
- Mathematics
- Data mining
- Cluster analysis
- Fuzzy clustering
- Computer science
References: 29
Citations: 80