A Bi-directional Fuzzy C-Means Clustering Ensemble Algorithm Considering Local Information

International Journal of Computational Intelligence Systems (2021)

Abstract:

The classic Fuzzy C-means (FCM) algorithm has limited clustering performance and is prone to misclassifying border points. This study proposes a bi-directional FCM clustering ensemble approach that takes local information into account (LI_BIFCM) to overcome these problems and improve clustering quality. First, FCM is run multiple times with randomized initial cluster centers, producing several membership matrices, and a vertical ensemble is performed using the maximum membership principle. Second, after each execution of FCM, multiple local membership matrices of the sample points are created using multiple K-nearest neighbors, and a horizontal ensemble is performed. Running FCM multiple times therefore yields multiple horizontal ensembles. Finally, the final clustering results are obtained by combining the vertical and horizontal clustering ensembles. Twelve data sets, drawn from both synthetic and real data sources, were chosen for testing. In the experiments, LI_BIFCM outperformed four traditional clustering algorithms and three clustering ensemble algorithms. Furthermore, the final clustering results have only a weak correlation with the bi-directional cluster ensemble parameters, indicating that the suggested technique is robust.
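To make the two building blocks named in the abstract concrete, here is a minimal sketch of the standard FCM membership update and of the maximum membership principle. It is not the LI_BIFCM ensemble itself, whose details the abstract does not give. The function names, the fuzzifier default `m=2.0`, and the use of Euclidean distance are assumptions made for illustration.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Return the FCM membership matrix u, where u[i][j] is the degree to
    which points[i] belongs to the cluster centered at centers[j]."""
    exponent = 2.0 / (m - 1.0)
    u = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if min(dists) == 0.0:
            # The point coincides with one or more centers: split full
            # membership among them to avoid dividing by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            u.append([h / sum(hits) for h in hits])
            continue
        inv = [d ** -exponent for d in dists]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u

def max_membership_labels(u):
    """Maximum membership principle: assign each point to the cluster in
    which its membership is largest."""
    return [max(range(len(row)), key=row.__getitem__) for row in u]

# Illustrative data (hypothetical): two fixed centers on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]
labels = max_membership_labels(fcm_memberships(points, centers))
```

Each row of the membership matrix sums to 1, so a border point shows up as a row with no clearly dominant entry. That is the case the abstract says plain FCM tends to misclassify.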

Keywords:

Single-linkage clustering

k-medians clustering

Data stream clustering

Topics:

Advanced Clustering Algorithms Research

Face and Expression Recognition

Data Management and Algorithms

10.1007/s44196-021-00014-z

K-maxmins clustering algorithm

Jisuanji gongcheng yu sheji (2004)

Yu Wang

Building on an analysis of the k-means and k-medians clustering algorithms, cluster analysis is performed on a set of data objects using the Chebyshev distance (i.e., the ∞-norm), yielding the novel result that the cluster center is simply the average of the maximum and minimum values of the data objects. On this basis, a new clustering algorithm, the k-maxmins clustering algorithm, is presented. Finally, computational results for the k-maxmins, k-means, and k-medians clustering algorithms are given.
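The result stated in this abstract can be sketched in a few lines: under the Chebyshev distance, the per-coordinate midrange (the average of the minimum and maximum) minimizes the largest distance from the center to any cluster member. The loop below is a Lloyd-style iteration built on that update. It is an assumed reading of the abstract, not the paper's exact procedure, and all function names are hypothetical.

```python
def chebyshev(a, b):
    """Chebyshev (infinity-norm) distance between two points."""
    return max(abs(x - y) for x, y in zip(a, b))

def midrange_center(cluster):
    """Per-coordinate average of the minimum and maximum values."""
    return tuple((min(col) + max(col)) / 2 for col in zip(*cluster))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: chebyshev(p, centers[j]))
            for p in points]

def k_maxmins(points, centers, iterations=10):
    """Alternate nearest-center assignment and midrange center updates."""
    centers = list(centers)
    for _ in range(iterations):
        labels = assign(points, centers)
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = midrange_center(members)
    return centers, assign(points, centers)

# Illustrative data (hypothetical): two well-separated groups.
points = [(0, 0), (2, 0), (1, 4), (10, 10), (12, 10), (11, 14)]
centers, labels = k_maxmins(points, [(0, 0), (10, 10)])
```

Because the midrange depends only on the extreme values in each coordinate, a single outlier can move a center substantially, in contrast to the mean used by k-means or the median used by k-medians.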

k-medians clustering

Single-linkage clustering

Data stream clustering

Citations (1)

A robust fuzzy approach for gene expression data clustering

Soft Computing (2021)

Meskat Jahan Mahmudul Hasan

Data stream clustering

Single-linkage clustering

k-medians clustering

Clustering high-dimensional data

10.1007/s00500-021-06397-7

Citations (11)

Two-Phase Clustering Algorithm for Complex Distributed Data

Maoguo Gong Shuang Wang Meng Ma Cao Yu Licheng Jiao

In this paper, a Two-Phase Clustering (TPC) algorithm for data sets with complex distributions is proposed. TPC contains two phases. First, the data set is partitioned into sub-clusters with spherical distribution, and each clustering center represents all the members of its corresponding cluster. Then, by exploiting the strong clustering performance of Manifold Evolutionary Clustering (MEC) on complex distributed data, the clustering centers obtained in the first phase are themselves clustered. Finally, the final results are obtained from these two clustering results. The algorithm is based on an improved K-means and on MEC. Manifold distance is introduced into evolutionary clustering to make the algorithm capable of clustering complex data sets. At the same time, the novel method reduces the computational cost incurred by the manifold distance. Experimental results on seven artificial data sets and seven UCI data sets with different structures show that, compared with genetic algorithm-based clustering, the K-means algorithm, and manifold evolutionary clustering, the novel algorithm can efficiently identify clusters with simple or complex, convex or non-convex distributions. Furthermore, TPC clearly outperforms MEC in terms of computational time.

Single-linkage clustering

Data stream clustering

k-medians clustering

Clustering high-dimensional data

Constrained clustering

Citations (6)

The best clustering algorithms in data mining

K M Archana Patel Prateek Thakral

In data mining, clustering is the most popular, powerful, and commonly used unsupervised learning technique. It is a way of grouping similar data objects into clusters based on some similarity measure. Clustering algorithms can be categorized into seven groups, namely hierarchical, density-based, partitioning, graph-based, grid-based, model-based, and combinational clustering algorithms. These clustering algorithms give different results depending on the conditions. Some clustering techniques are better suited to large data sets, and some give good results when finding clusters with arbitrary shapes. This paper studies and relates various data mining clustering algorithms. The algorithms under exploration are as follows: the K-Means algorithm, K-Medoids, the Distributed K-Means clustering algorithm, the hierarchical clustering algorithm, the grid-based algorithm, and the density-based clustering algorithm. This paper compares all these clustering algorithms according to many factors. After comparing these clustering algorithms, I describe which clustering algorithms should be used under different conditions to get the best result.

Data stream clustering

Single-linkage clustering

Hierarchical clustering

10.1109/iccsp.2016.7754534

Citations (59)

Improved accelerating large data K-means clustering algorithm

Jisuanji gongcheng yu sheji (2015)

Han Ya

To deal with large-scale data clustering problems, a speeding K-means parallel clustering method was presented which randomly sampled first and then used max-min distance means to carry out K-means parallel clustering. Sampling based method avoids the problem of clustering in local solutions and max-min distance based method makes the initial clustering centers tend to be optimum. Results of a large number of experiments show that the proposed method is affected less by the initial clustering center and improves the precision of clustering in both stand-alone environment and cluster environment. It also reduces the number of iterations of clustering and the clustering time.
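The two ideas this abstract combines, random sampling followed by max-min distance selection of initial centers, can be sketched as below. This is a generic max-min (farthest-first) initializer written under assumptions: Euclidean distance, the first sampled point as the first center, and hypothetical function names. The paper's parallel K-means stage is not shown.

```python
import math
import random

def maxmin_centers(points, k):
    """Pick k initial centers: start from the first point, then repeatedly
    add the point whose distance to its nearest chosen center is largest."""
    centers = [points[0]]
    while len(centers) < k:
        farthest = max(
            points,
            key=lambda p: min(math.dist(p, c) for c in centers),
        )
        centers.append(farthest)
    return centers

def sampled_maxmin_centers(points, k, sample_size, seed=0):
    """Draw a random sample first, then run max-min selection on the sample
    so the selection cost depends on the sample size, not the full data."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    return maxmin_centers(sample, k)

# Illustrative data (hypothetical): five points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (9.0, 0.0), (10.0, 0.0)]
initial = sampled_maxmin_centers(points, k=3, sample_size=5)
```

The resulting centers would then seed an ordinary K-means run. Spreading them far apart is the mechanism by which such a method reduces sensitivity to the initial clustering centers.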

Data stream clustering

Single-linkage clustering

k-medians clustering

Clustering high-dimensional data

Citations (0)

Correlation clustering based on genetic algorithm for documents clustering

Zhenya Zhang Hongmei Cheng Wanli Chen Shuguang Zhang Qiansheng Fang

The correlation clustering problem is NP-hard, and techniques for solving it can be used to cluster a given data set using a relation matrix over its data. In this paper, an approach to the correlation clustering problem based on a genetic algorithm, named GeneticCC, is presented. To estimate the performance of a clustering division, a data-correlation-based clustering precision is defined and its features are discussed. Experimental results show that, with clustering precision as the criterion, the clustering division constructed by GeneticCC for the UCI document data set performs better than the clustering divisions constructed by an SOM neural network.

Single-linkage clustering

Data stream clustering

Clustering high-dimensional data

k-medians clustering

10.1109/cec.2008.4631230

Citations (11)

K-means algorithm based Clustering for Big data

International journal of advance research and innovative ideas in education (2017)

Khevana Shah

Clustering is a data mining technique used to place data elements into related groups without advance knowledge of the group definitions. Clustering is a process of partitioning a set of data into a set of meaningful sub-classes, called clusters. In this paper, we review the most widely used clustering methods. First, we give an introduction to clustering methods, how they work, and their main challenges. Second, we present the clustering methods with some comparisons, mainly including the classical partitioning clustering methods, such as the well-known k-means algorithms, Gaussian Mixture Models and their variants, and the classical hierarchical clustering methods. Clustering algorithms can be categorized into partition-based, hierarchical-based, density-based, and grid-based algorithms. A partitioning clustering algorithm splits the data points into k partitions, where each partition represents a cluster. Hierarchical clustering is a technique that divides the data set by constructing a hierarchy of clusters. Density-based algorithms form clusters according to regions that grow with high density; they are one-scan algorithms. Grid-density-based algorithms use a multi-resolution grid data structure and use dense grids to form clusters; their main distinguishing feature is fast processing time. In this survey paper, an analysis of clustering and its different techniques in data mining is presented.

Single-linkage clustering

Data stream clustering

Hierarchical clustering

Complete-linkage clustering

Consensus clustering

Constrained clustering

Citations (0)

Comparing clustering algorithms performance using multiple-objective functions

International Journal of Statistics and Applied Mathematics (2020)

Avinash Navlani VB Gupta

Clustering is the grouping of data into sets of similar objects. Each group is known as a cluster; each object is similar to the other objects in its cluster and different from the objects in other clusters. In this paper, we carry out an experimental study comparing clustering algorithms using multiple objective functions. We investigated K-means (a partitioning-based clustering method), hierarchical clustering, spectral clustering, Gaussian Mixture Model clustering, and clustering using a Hidden Markov Model. The performance of these methods was compared using multiple objective functions, which have two core objectives: cluster homogeneity and separation. These multiple objective functions help to discover robust clusters more efficiently.

Single-linkage clustering

k-medians clustering

Complete-linkage clustering

Hierarchical clustering

Data stream clustering

Constrained clustering

Citations (0)