Hierarchical clustering algorithms with automatic estimation of the number of clusters
2017
Estimating an appropriate number of clusters has been a central and difficult problem in clustering research. Several methods exist for this in non-hierarchical clustering; a typical approach is to run the clustering for different numbers of clusters and compare the results using a measure that estimates the number of clusters. In contrast, there is no such method for automatically estimating the number of clusters in agglomerative hierarchical clustering (AHC), since AHC produces a whole family of clusterings with different numbers of clusters at once, in the form of a dendrogram. An exception is the Newman method in network clustering, but this method does not produce a useful dendrogram. The aim of the present paper is to propose new methods for automatically estimating the number of clusters in AHC. We present two approaches for this purpose: one uses a variation of a cluster validity measure, and the other uses a statistical model selection criterion such as BIC.
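The general idea of scoring every level of a dendrogram with a validity measure can be illustrated with a minimal sketch. This is not the method proposed in the paper: single linkage, the average silhouette width as the validity measure, and the names `ahc_levels`, `silhouette`, and `estimate_k` are all assumptions chosen for illustration, and the sample points are made up.

```python
import math

def ahc_levels(points):
    """Naive single-linkage AHC (an assumed linkage, for illustration).
    Returns {k: clustering with k clusters}, each cluster a list of indices."""
    clusters = [[i] for i in range(len(points))]
    levels = {len(clusters): [c[:] for c in clusters]}
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        levels[len(clusters)] = [c[:] for c in clusters]
    return levels

def silhouette(points, clusters):
    """Average silhouette width; singleton clusters contribute 0."""
    total = 0.0
    for ci, c in enumerate(clusters):
        if len(c) == 1:
            continue
        for p in c:
            # a: mean distance to the other members of the same cluster.
            a = sum(math.dist(points[p], points[q]) for q in c if q != p) / (len(c) - 1)
            # b: smallest mean distance to the members of any other cluster.
            b = min(sum(math.dist(points[p], points[q]) for q in o) / len(o)
                    for oi, o in enumerate(clusters) if oi != ci)
            if max(a, b) > 0:
                total += (b - a) / max(a, b)
    return total / len(points)

def estimate_k(points):
    """Score each dendrogram level with 2 <= k < n and return the best k."""
    levels = ahc_levels(points)
    scores = {k: silhouette(points, cl)
              for k, cl in levels.items() if 2 <= k < len(points)}
    return max(scores, key=scores.get), scores

# Hypothetical toy data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
best_k, scores = estimate_k(points)
print(best_k)
```

Because AHC yields all levels in a single run, the validity measure is evaluated once per level rather than once per separate clustering run. The pairwise search here is cubic or worse in the number of points and is meant only for small examples.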
Keywords:
- Correlation clustering
- Hierarchical clustering
- Machine learning
- Cluster analysis
- k-medians clustering
- Artificial intelligence
- Complete-linkage clustering
- Single-linkage clustering
- Hierarchical clustering of networks
- Mathematics
- Brown clustering
- Data mining
- Computer science
- Determining the number of clusters in a data set