PCM clustering based on noise level

2017 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) (2017)

Peixin Hou Jiguang Yue Hao Deng Shuguang Liu

Abstract:

Possibilistic c-means (PCM) based clustering algorithms are widely used in the literature. In this paper, we develop a noise level based PCM (NPCM) clustering algorithm. The advantage of NPCM is that it does not require strong prior information about the dataset; it needs only two kinds of information that are intuitive to specify for the clustering task: the number of clusters and a property of the clusters. More specifically, NPCM has two parameters: one specifies the possibly over-specified cluster number, and the other characterizes the closeness of clusters in the clustering result. Neither parameter needs to be specified exactly. Furthermore, we find that the bandwidth update in adaptive PCM (APCM) is a positive feedback process, and that the adaptive bandwidth-uncertainty mechanism adopted in NPCM strengthens this positive feedback, which leads to a faster convergence rate. Experiments show that the clustering process can be effectively controlled by the parameters.
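For readers unfamiliar with PCM, the following is a minimal sketch of the classic PCM update on which such algorithms build. It is not the NPCM or APCM algorithm from the paper. The function names, the toy data, the fixed bandwidths `eta`, and the fuzzifier `m = 2` are assumptions chosen for illustration only.

```python
# Sketch of the classic PCM typicality and center updates.
# NOT the paper's NPCM: bandwidths (eta) are fixed here, whereas
# APCM/NPCM adapt them during clustering.

def pcm_typicalities(points, centers, eta, m=2.0):
    """Typicality of every point to every cluster: 1 / (1 + (d^2/eta)^(1/(m-1)))."""
    table = []
    for x in points:
        row = []
        for v, eta_j in zip(centers, eta):
            d2 = sum((a - b) ** 2 for a, b in zip(x, v))
            row.append(1.0 / (1.0 + (d2 / eta_j) ** (1.0 / (m - 1.0))))
        table.append(row)
    return table

def pcm_update_centers(points, typicalities, m=2.0):
    """Each center becomes the mean of the points weighted by typicality^m."""
    dim = len(points[0])
    centers = []
    for j in range(len(typicalities[0])):
        weights = [row[j] ** m for row in typicalities]
        total = sum(weights)
        centers.append(
            [sum(w * x[k] for w, x in zip(weights, points)) / total
             for k in range(dim)]
        )
    return centers

# Toy data (assumed): two tight groups plus one far-away outlier.
points = [[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0], [20.0, -20.0]]
centers = [[0.0, 0.0], [5.0, 5.0]]
eta = [1.0, 1.0]

for _ in range(10):
    t = pcm_typicalities(points, centers, eta)
    centers = pcm_update_centers(points, t)
```

Unlike fuzzy c-means memberships, the typicalities of a point are not forced to sum to one across clusters, so the outlier receives a low typicality for every cluster and barely moves the centers.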

Keywords:

Closeness

Topics:

Advanced Clustering Algorithms Research

Rough Sets and Fuzzy Logic

Text and Document Classification Technologies

10.1109/fuzz-ieee.2017.8015379

A Survey on Data Clustering Algorithms

CiiT international journal of data mining and knowledge engineering (2010)

N. Kamalraj V. Shobana

Clustering is a technique adopted in many real-world applications. Generally, clustering can be thought of as partitioning the data into groups or subsets that contain analogous objects. Many clustering techniques, such as the K-Means algorithm, the Fuzzy C-Means algorithm (FCM), and the spectral clustering algorithm, have been proposed in the literature. Recently, clustering algorithms have been used extensively on mixed data types to evaluate the performance of the clustering techniques. This paper presents a survey of various clustering algorithms proposed in the literature. Moreover, it provides insight into the advantages and limitations of some of those techniques. A comparison of various clustering techniques is provided. The future enhancement section of this paper provides a general idea for improving the existing clustering algorithms to achieve better clustering accuracy.

Data stream clustering

Biclustering

Single-linkage clustering

Clustering high-dimensional data

Citations (3)

Even-Sized Clustering with Noise Clustering Method

Kei Kitajima Yasunori Endo

Clustering is a method of data analysis without the use of supervised data. Clustering methods that focus on cluster size are expected to be useful for task distribution problems, and several such methods have been proposed. We proposed Fuzzy Even-sized Clustering Based on Optimization (FECBO) and COntrolled-sized Clustering Based on Optimization (COCBO) as methods focusing on cluster size. However, these methods are susceptible to noise. It is believed that this issue can be overcome by applying the noise clustering method, which classifies noise into a noise cluster. In this study, we extend FECBO and COCBO with noise clustering and verify the effectiveness of the extensions through numerical examples.
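To illustrate the noise clustering idea mentioned in this abstract, here is a minimal sketch of the membership rule in the classic fuzzy noise clustering formulation, in which a virtual noise cluster sits at a constant distance `delta` from every point. It does not implement FECBO or COCBO or their size constraints. The function name, the toy data, and the values of `delta` and `m` are assumptions for illustration.

```python
# Sketch of fuzzy memberships with one extra "noise cluster" at a fixed
# distance delta from every point (classic noise clustering idea).
# NOT the FECBO/COCBO extensions described in the abstract.

def noise_clustering_memberships(points, centers, delta, m=2.0):
    """Return, per point, (memberships to real clusters, membership to noise)."""
    power = 1.0 / (m - 1.0)
    noise_weight = (1.0 / delta ** 2) ** power
    result = []
    for x in points:
        weights = []
        for v in centers:
            d2 = sum((a - b) ** 2 for a, b in zip(x, v))
            # Guard against division by zero when a point sits on a center.
            weights.append((1.0 / max(d2, 1e-12)) ** power)
        total = sum(weights) + noise_weight
        result.append(([w / total for w in weights], noise_weight / total))
    return result

# Toy data (assumed): one point near a center and one far-away outlier.
points = [[0.1, 0.0], [20.0, -20.0]]
centers = [[0.0, 0.0], [5.0, 5.0]]
memberships = noise_clustering_memberships(points, centers, delta=2.0)

near_clusters, near_noise = memberships[0]
far_clusters, far_noise = memberships[1]
```

A point much farther than `delta` from every center is assigned mostly to the noise cluster, so it contributes little when the real cluster centers are updated.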

Clustering high-dimensional data

Data stream clustering

k-medians clustering

10.1109/scis-isis.2018.00138

Citations (1)

POCS-based Clustering Algorithm

arXiv (Cornell University) (2022)

Le-Anh Tran Henock M. Deberneh Truong-Dong Do Thanh-Dat Nguyen My-Ha Le

A novel clustering technique based on the projection onto convex sets (POCS) method, called the POCS-based clustering algorithm, is proposed in this paper. The proposed algorithm exploits a parallel projection method of POCS to find appropriate cluster prototypes in the feature space. The algorithm treats each data point as a convex set and projects the cluster prototypes in parallel onto their member data points. The projections are convexly combined to minimize the objective function for data clustering. The performance of the proposed algorithm is verified through experiments on various synthetic datasets. The experimental results show that the proposed algorithm is competitive and efficient in terms of clustering error and execution speed when compared with conventional clustering methods, including the Fuzzy C-Means (FCM) and K-Means clustering algorithms.

Data stream clustering

k-medians clustering

Single-linkage clustering

Clustering high-dimensional data

10.48550/arxiv.2208.08888

Citations (0)

POCS-based Clustering Algorithm

Le-Anh Tran Henock M. Deberneh Truong-Dong Do Thanh-Dat Nguyen My-Ha Le

Data stream clustering

Single-linkage clustering

Clustering high-dimensional data

k-medians clustering

10.1109/iwis56333.2022.9920762

Citations (6)

Clustering with side information: Further efforts to improve efficiency

Pattern Recognition Letters (2016)

Ahmad Ali Abin

Constrained clustering

Data stream clustering

Single-linkage clustering

Clustering high-dimensional data

10.1016/j.patrec.2016.10.013

Citations (11)

Clustering aggregation based on genetic algorithm for documents clustering

Zhenya Zhang Hongmei Cheng Zhang Shu-guang Wanli Chen Qiansheng Fang

The clustering aggregation problem is a formal description of the clustering ensemble problem. Techniques for solving it can be used to construct a clustering division with better clustering performance when the performance of each original clustering division is fluctuant or weak. In this paper, an approach to the clustering aggregation problem based on a genetic algorithm, named GeneticCA, is presented. To estimate the clustering performance of a clustering division, clustering precision is defined and its features are discussed. In our experiments on the clustering performance of GeneticCA for document clustering, a Hamming neural network is used to construct clustering divisions with fluctuant and weak clustering performance. Experimental results show that, with clustering precision as the criterion, the clustering division constructed by GeneticCA performs better than the original clustering divisions.

Single-linkage clustering

Data stream clustering

Clustering high-dimensional data

10.1109/cec.2008.4631225

Citations (31)

Automatic Aggregation Enhanced Affinity Propagation Clustering Based on Mutually Exclusive Exemplar Processing

Computers, Materials & Continua (2023)

Zhihong Ouyang Lei Xue Feng Ding Yongsheng Duan

Affinity propagation (AP) is a widely used exemplar-based clustering approach with superior efficiency and clustering quality. Nevertheless, a common issue with AP clustering is the presence of excessive exemplars, which limits its ability to perform effective aggregation. This research aims to enable AP to aggregate automatically to produce fewer and more compact clusters, without changing the similarity matrix or customizing preference parameters, as is done in existing enhanced approaches. An automatic aggregation enhanced affinity propagation (AAEAP) clustering algorithm is proposed, which combines a dependable partitioning clustering approach with AP to achieve this purpose. The partitioning clustering approach generates an additional set of findings with an equivalent number of clusters whenever the clustering stabilizes and the exemplars emerge. Based on these findings, mutually exclusive exemplar detection is conducted on the current AP exemplars, and a pair of exemplars unsuitable for coexistence is recommended. The recommendation is then mapped to a novel constraint, designated mutual exclusion and aggregation. To address this limitation, a modified AP clustering model is derived and the clustering is restarted, which can result in exemplar number reduction, exemplar selection adjustment, and redistribution of other data points. The clustering is ultimately completed, and a smaller number of clusters is obtained, by repeatedly performing automatic detection and clustering until no mutually exclusive exemplars are detected. Several standard classification data sets are used in experiments comparing AAEAP with other clustering algorithms, and many internal and external clustering evaluation indexes are used to measure the clustering performance. The findings show that the AAEAP clustering algorithm achieves a substantial automatic aggregation effect while maintaining good clustering quality.

Data stream clustering

Constrained clustering

Affinity propagation

Single-linkage clustering

Consensus clustering

Clustering high-dimensional data

10.32604/cmc.2023.042222

Citations (1)

Improved K-Means Algorithm for Optimizing Initial Centers

Smart innovation, systems and technologies (2020)

Jianming Liu Lili Xu Zhenna Zhang Xuemei Zhen

Data stream clustering

Clustering high-dimensional data

10.1007/978-981-15-3863-6_24

Citations (1)

A genetic approach to the automatic clustering problem

Pattern Recognition (2001)

Lin‐Yu Tseng Shiueng Bien Yang

Single-linkage clustering

Constrained clustering

Clustering high-dimensional data

Data stream clustering

k-medians clustering

10.1016/s0031-3203(00)00005-4

Citations (210)

Clustering by hybrid K-Means and black hole entropic fuzzy clustering algorithm for medical data

Advances in Complex Systems (2022)

A. Jaya Mabel Rani A. Pravin

Clustering-based machine learning algorithms are an important field in data mining today. Medical data clustering is one of the core applications of data mining, used to predict and identify the risk factors of a disease. At the same time, medical data clustering is an important and challenging task because of the complexity and high frequency of the data. To achieve proper data clustering, this paper proposes a hybrid data clustering algorithm that combines K-Means and Black Hole Entropic Fuzzy Clustering (BHEFC). K-Means is one of the first and most popular partition-based clustering algorithms, with a low computation cost. The hybrid clustering has two modules. First, some number of iterations are executed by the first module, which is K-Means clustering. After those iterations, the clustering solutions are passed to the second module, which is Entropic Fuzzy Clustering. The hybrid can thus gain the advantages of both algorithms. The K-Means clustering algorithm can produce a fast clustering solution because of its low computation cost, but it can converge prematurely. To overcome this problem, the second module uses BHEFC, which can handle large amounts of high-frequency medical data. The experiments were carried out with medical practitioners to predict the risk factors of heart disease patients, so that doctors can give suggestions based on those risk factors. Finally, the efficiency of the proposed hybrid K-Means and BHEFC algorithm is analyzed using three different performance measures.

Data stream clustering

Clustering high-dimensional data

10.1142/s179396232341012x

Citations (1)