Ensemble clustering based on evidence theory
Citations: 3
References: 32
Related papers: 10
Abstract:
Ensemble clustering consists in combining multiple clustering solutions into a single one, called the consensus, which can produce a more accurate and robust clustering of the data. In this paper, we attempt to implement ensemble clustering using Dempster-Shafer evidence theory. Individual clustering solutions are obtained using evidence theory and a novel diversity measure is proposed using the distance of evidence for selecting complementary individual solutions. After establishing the correspondence among different clustering solutions' labels, the consensus clustering solution can be obtained using evidence combination. Experimental results and related analyses show that our proposed approach can effectively implement the ensemble clustering.
Keywords:
Consensus clustering
Single-linkage clustering
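The evidence combination step mentioned in the abstract can be illustrated with Dempster's rule of combination. The following is a minimal sketch, not the paper's implementation: the function name `dempster_combine`, the two-cluster frame, and the mass values are hypothetical, and it assumes the cluster labels of the two solutions have already been put in correspondence.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses (a focal element)
    to its mass; the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of the two sets.
            combined[inter] = combined.get(inter, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {focal: mass / norm for focal, mass in combined.items()}


# Hypothetical evidence from two base clusterings about one object,
# over the frame {c1, c2}. Mass on {c1, c2} expresses ignorance.
m_a = {frozenset({"c1"}): 0.6, frozenset({"c1", "c2"}): 0.4}
m_b = {
    frozenset({"c1"}): 0.5,
    frozenset({"c2"}): 0.3,
    frozenset({"c1", "c2"}): 0.2,
}
fused = dempster_combine(m_a, m_b)
for focal, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(focal), round(mass, 4))
```

In this example the conflicting mass is 0.6 × 0.3 = 0.18, so the remaining products are renormalized by 0.82, and the fused evidence favors cluster c1.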
Abstract: Fuzzy C-Means (FCM) clustering is a widely known technique for data clustering tasks such as image segmentation. This study tests the Normalized Cross Correlation (NCC) method within the FCM algorithm to determine the initial fuzzy pseudo-partition matrix, which is otherwise generated by a random process. Clustering is a process of grouping data and belongs to unsupervised learning. Data mining generally uses two kinds of clustering techniques: hierarchical clustering and partitional clustering. The FCM algorithm groups data by adding up the level of similarity between pairs of data groups. NCC is the method applied here to measure data similarity based on correlation values. The methodology consists of the steps taken to measure clustering performance when NCC is used to determine the initial fuzzy pseudo-partition matrix in FCM. Clustering with NCC in the FCM algorithm gave better results than the ordinary FCM algorithm: the proposed method improved accuracy by 4.27%, the Rand index by 4.73%, and the F-measure by 8.26%.
Single-linkage clustering
k-medians clustering
FLAME clustering
Data stream clustering
Clustering high-dimensional data
Citations (0)
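The ordinary FCM baseline referred to in the abstract above, with a randomly generated initial partition matrix, can be sketched as follows. This is an illustrative pure-Python version under stated assumptions: the function name `fcm`, the fuzzifier `m=2.0`, the stopping tolerance, and the toy data are hypothetical, and the NCC-based initialization proposed in the study is not reproduced here.

```python
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means on a list of equal-length numeric tuples.

    Returns (memberships, centers). memberships[i][j] is the degree to
    which point i belongs to cluster j; each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial partition matrix, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ))
        # Membership update: u_ij proportional to d_ij ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dists = [
                max(sum((points[i][d] - centers[j][d]) ** 2
                        for d in range(dim)) ** 0.5, 1e-12)
                for j in range(c)
            ]
            inv = [dist ** (-2.0 / (m - 1.0)) for dist in dists]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift,
                        max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return u, centers


# Toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
memberships, centers = fcm(points, c=2)
# Harden the fuzzy result with the maximum membership principle.
labels = [row.index(max(row)) for row in memberships]
print(labels)
```

Replacing the random initialization block with a data-driven one is where an approach such as the NCC-based initialization would plug in.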
Data stream clustering
Single-linkage clustering
k-medians clustering
Clustering high-dimensional data
Citations (11)
As data sets grow, improving the efficiency of the K-modes and fuzzy K-modes clustering algorithms is becoming a critical issue. To improve efficiency, a clustering method based on divide and conquer is proposed. Instead of clustering all the data at once, the method divides the data set into several subsets and clusters the subsets simultaneously; the fused results of the subset clusterings form the final clustering. The results show that, in most cases, clustering efficiency is greatly increased compared with the traditional clustering method.
Single-linkage clustering
Data stream clustering
Clustering high-dimensional data
Categorical variable
k-medians clustering
Citations (0)
Abstract: The classic Fuzzy C-means (FCM) algorithm has limited clustering performance and is prone to misclassifying border points. This study offers a bi-directional FCM clustering ensemble approach that takes local information into account (LI_BIFCM) to overcome these challenges and increase clustering quality. First, various membership matrices are created by running FCM multiple times with randomized initial cluster centers, and a vertical ensemble is performed using the maximum membership principle. Second, after each execution of FCM, multiple local membership matrices of the sample points are created using multiple K-nearest neighbors, and a horizontal ensemble is performed. Multiple horizontal ensembles can be created using multiple FCM clusterings. Finally, the final clustering results are obtained by combining the vertical and horizontal clustering ensembles. Twelve data sets were chosen for testing from both synthetic and real data sources. In the experiments, LI_BIFCM outperformed four traditional clustering algorithms and three clustering ensemble algorithms. Furthermore, the final clustering results have a weak correlation with the bi-directional cluster ensemble parameters, indicating that the suggested technique is robust.
Single-linkage clustering
k-medians clustering
Data stream clustering
Citations (4)
After analyzing the problems with the present FCM clustering algorithm, an improved version of the FCM clustering algorithm is proposed in this paper. The new algorithm uses the nearest neighbor clustering algorithm to initialize the number of clusters and the cluster centers. Simulation results indicate that the improved algorithm not only improves clustering precision but also effectively avoids local optima.
Data stream clustering
Single-linkage clustering
FLAME clustering
Nearest-neighbor chain algorithm
Citations (9)
Clustering is the process of dividing the observed samples into several subclasses through a clustering model according to feature similarity. This chapter focuses on the basic principles of mainstream clustering algorithms and their applications to image data. The simplest, and a fairly effective, clustering algorithm is the k-means algorithm. Fuzzy c-means clustering (FCM) is one of the most widely used clustering algorithms; it evolved from the k-means algorithm. Hierarchical clustering (HC) is a kind of classification that uses a series of nested tree structures generated from the samples. According to the partitioning strategy, it can be divided into top-down divisive HC and bottom-up agglomerative HC. Spectral clustering is a clustering algorithm based on spectral graph theory, whose essence is to transform the clustering task into a graph partition task. A Gaussian mixture model is a linear combination of multiple Gaussian distribution functions.
Single-linkage clustering
Data stream clustering
Hierarchical clustering
Clustering high-dimensional data
Citations (0)
Clustering is one of the primary tools in unsupervised learning. Clustering means creating groups of objects based on their features in such a way that objects belonging to the same group are similar and those belonging to different groups are dissimilar. K-means is one of the most widely used clustering algorithms because of its simplicity and performance. The initial centroids for k-means clustering are generated randomly. In this paper, we present a method for effectively selecting initial cluster centers. This method identifies high-density neighborhoods (NSS) in the data and then selects the centroids of these neighborhoods as initial centers. The Agglomerative Fuzzy k-means (Ak-means) clustering algorithm is then used to further merge these initial centers to reach the preferred number of clusters and produce better clustering results. A merging method is employed to produce more consistent clustering results from different sets of initial cluster centers. Experimental observations on several data sets show that the proposed clustering approach is effective at automatically identifying the true number of clusters and at providing correct clustering results.
Single-linkage clustering
Complete-linkage clustering
Hierarchical clustering
FLAME clustering
k-medians clustering
Centroid
Brown clustering
Clustering high-dimensional data
Citations (0)
Post-clustering algorithms, which cluster the results of Web search engines, have several requirements that differ from those of conventional clustering algorithms. In this paper, we propose a new post-clustering algorithm that satisfies as many of those requirements as possible. The proposed Concept ART combines the concept vector, which has several advantages in document clustering, with Fuzzy ART, which is known as a real-time clustering algorithm. Moreover, we show that it is applicable to general-purpose clustering as well as post-clustering.
Data stream clustering
Single-linkage clustering
Conceptual clustering
Constrained clustering
Brown clustering
Document Clustering
Clustering high-dimensional data
Citations (0)
In this report, we review the most used clustering methods in the literature. First, we introduce clustering methods, how they work, and their main challenges. Second, we present the clustering methods with some comparisons, covering mainly the classical partitioning methods such as the well-known k-means algorithms, Gaussian Mixture Models and their variants, the classical hierarchical methods such as the agglomerative algorithm, fuzzy clustering methods, and Big Data clustering methods. We present some examples of clustering algorithm comparisons. Finally, we present our ideas for building a scalable and noise-insensitive clustering system based on type-2 fuzzy clustering methods.
Single-linkage clustering
Brown clustering
Biclustering
Consensus clustering
Data stream clustering
Clustering high-dimensional data
Hierarchical clustering
FLAME clustering
Citations (70)