Multiple Partitions Aligned Clustering
Abstract:
Multi-view clustering is an important yet challenging task due to the difficulty of integrating information from multiple representations. Most existing multi-view clustering methods explore the heterogeneous information in the space where the data points lie. This common practice can cause significant information loss because of unavoidable noise or inconsistency among views. Since different views admit the same cluster structure, the natural space to work in is the space of all partitions. Orthogonal to existing techniques, in this paper we propose to leverage the multi-view information by fusing partitions. Specifically, we align each partition to a consensus cluster indicator matrix through a distinct rotation matrix. Moreover, a weight is assigned to each view to account for differences in the clustering capacity of the views. Finally, the basic partitions, weights, and consensus clustering are jointly learned in a unified framework. We demonstrate the effectiveness of our approach on several real datasets, where significant improvement is found over other state-of-the-art multi-view clustering methods.
Keywords: Leverage (statistics), Consensus clustering, Constrained clustering
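The core alignment step can be illustrated with a small sketch. This is a hypothetical simplification of the idea described in the abstract, not the authors' exact optimization: each one-hot partition matrix is aligned to a shared consensus through an orthogonal rotation (solved here as an orthogonal Procrustes problem via the SVD), and the consensus is refreshed as the average of the aligned partitions. View weights are omitted for brevity.

```python
import numpy as np

def one_hot(labels, k):
    """Binary cluster indicator matrix (n x k)."""
    H = np.zeros((len(labels), k))
    H[np.arange(len(labels)), labels] = 1.0
    return H

def align(H, Y):
    """Best rotation R (orthogonal Procrustes) so that H @ R approximates Y."""
    U, _, Vt = np.linalg.svd(H.T @ Y)
    return U @ Vt

def consensus(partitions, k, iters=10):
    """Alternate between aligning each partition to the consensus and
    refreshing the consensus as the mean of the aligned partitions."""
    Hs = [one_hot(p, k) for p in partitions]
    Y = Hs[0].copy()                      # initial consensus
    for _ in range(iters):
        aligned = [H @ align(H, Y) for H in Hs]
        Y = np.mean(aligned, axis=0)      # fused soft indicator matrix
    return Y.argmax(axis=1)               # hard consensus labels

# three views agree on the grouping but use permuted label names
p1 = [0, 0, 1, 1, 2, 2]
p2 = [1, 1, 2, 2, 0, 0]
p3 = [2, 2, 0, 0, 1, 1]
labels = consensus([p1, p2, p3], k=3)
```

Because the rotation absorbs the arbitrary label permutation of each view, the three partitions above fuse into a single consistent grouping even though their raw label names disagree.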
The clustering ensemble aims to combine multiple clustering results into a better and more robust consensus clustering. This technique has shown its efficiency in finding unusual clusters, dealing with noise, and integrating clustering solutions from multiple distributed sources. Consensus clustering methods based on a voting mechanism are widely used in the literature. The idea behind majority voting is that the judgement of a group is superior to that of individuals. However, voting-based consensus methods suffer from the problem of assigning an appropriate cluster label to data objects that lack a majority vote. To deal with this ambiguity, as well as with clustering when datasets are too large or when new information can arrive dynamically at any time, we propose a new approach based on a two-stage clustering technique. In the first stage, a clustering ensemble method based on relabeling and voting is used to cluster the data objects: a new set of disjoint sub-clusters is generated by majority vote, where each data object votes for the cluster it belongs to and for its corresponding cluster in each of the other clustering results, and data objects without a majority vote are collected into a new dataset. In the second stage, this new dataset, together with the previously obtained sub-clusters, is processed by an incremental clustering algorithm, which is initialized with the sub-clusters and operates on the new dataset elements. The main advantage of incremental clustering methods is that the system can update its assumptions based on newly available data without re-examining old data. The proposed approach has been evaluated on different datasets, and the experimental results demonstrate its effectiveness and robustness.
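The first-stage voting step can be sketched as follows. This is an illustration under the assumption that the clusterings have already been relabeled to a common label space (the paper's relabeling step is omitted): objects with a strict majority label are assigned, and the rest are set aside for the incremental second stage.

```python
from collections import Counter

def majority_vote(clusterings):
    """Assign each object its strict-majority label across the (already
    relabeled) clusterings; objects with no majority are deferred."""
    n = len(clusterings[0])
    labels, ambiguous = [None] * n, []
    for i in range(n):
        votes = Counter(c[i] for c in clusterings)
        lab, count = votes.most_common(1)[0]
        if count > len(clusterings) / 2:
            labels[i] = lab
        else:
            ambiguous.append(i)   # no majority: handled by the second stage
    return labels, ambiguous

# three clusterings over five objects; object 4 gets three different votes
c1 = [0, 0, 1, 1, 0]
c2 = [0, 0, 1, 1, 1]
c3 = [0, 1, 1, 1, 2]
labels, ambiguous = majority_vote([c1, c2, c3])
# labels -> [0, 0, 1, 1, None]; ambiguous -> [4]
```

In the full method, the ambiguous pool together with the sub-clusters would then seed an incremental clustering algorithm.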
Partitioning a set of objects into homogeneous clusters is a fundamental operation in data mining, needed in a number of data mining tasks. Clustering, or data grouping, is a key technique of data mining: an unsupervised learning task in which one seeks to identify a finite set of categories, termed clusters, that describe the data. The grouping of data into clusters is based on the principle of maximizing intra-class similarity and minimizing inter-class similarity, and the goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. But how does one decide what constitutes a good clustering? This paper studies various clustering algorithms of data mining and focuses on the basics, requirements, classification, problems, and application areas of clustering algorithms.
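The intra-class/inter-class principle can be made concrete with a minimal k-means sketch, which alternates between assigning each point to its nearest centroid (tightening clusters internally) and moving each centroid to its cluster mean. This is a generic illustration, not a method from the surveyed paper.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Minimal k-means: assign points to the nearest centroid, then move
    each centroid to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        # keep a centroid in place if its cluster happens to be empty
        centroids = np.array([X[labels == j].mean(axis=0)
                              if np.any(labels == j) else centroids[j]
                              for j in range(k)])
    return labels

# two well-separated groups: a good clustering keeps each group together
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
labels = kmeans(X, k=2)
```

On this toy data the two tight groups end up in different clusters, which is exactly the "maximize intra-class similarity, minimize inter-class similarity" criterion in action.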
High-dimensional data, described by a very large number of features, introduces new issues for clustering. The so-called 'curse of dimensionality', a term originally coined to describe the general increase in time complexity of many computational problems, means that general-purpose clustering algorithms perform poorly in such settings. Accordingly, many works have focused on introducing new techniques and clustering algorithms for handling high-dimensional data. Common to all clustering algorithms is that they require some fundamental measure of similarity among data objects, yet the existing algorithms still leave open research issues. In this review, we summarize the effects of high-dimensional data spaces and their implications for various clustering algorithms. We also present a detailed overview of clustering algorithms of several types (subspace methods, model-based clustering, density-based methods, partition-based methods, etc.), including a more detailed description of recent work and of the advantages and disadvantages of each for the high-dimensionality problem. The scope of future work to extend the present clustering methods and algorithms is discussed at the end.
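One simple response to high dimensionality mentioned across such surveys is to cluster in a lower-dimensional subspace. The sketch below, an illustration rather than any specific surveyed method, projects onto the top principal components before clustering would take place; the informative structure survives the projection while most noisy dimensions are discarded.

```python
import numpy as np

def pca_reduce(X, d):
    """Project the data onto its top-d principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T

# 2 informative dimensions buried in 48 pure-noise dimensions
rng = np.random.default_rng(0)
signal = np.vstack([rng.normal(0, 0.1, (10, 2)),   # group one
                    rng.normal(3, 0.1, (10, 2))])  # group two
X = np.hstack([signal, rng.normal(0, 0.1, (20, 48))])
Z = pca_reduce(X, d=2)   # the two groups remain well separated in Z
```

Distance-based clustering applied to Z behaves far better than in the raw 50-dimensional space, where noisy coordinates dilute the similarity measure.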
Clustering is a well-known unsupervised machine learning technique. However, most clustering methods require setting several parameters, such as the number of clusters, the shape of the clusters, or other user- or problem-specific parameters and thresholds. In this paper, we propose a new clustering approach that is fully autonomous, in the sense that it does not require parameters to be pre-defined. The approach is based on data density automatically derived from the mutual distribution of the data in the data space, and is called ADD clustering (Autonomous Data Density based clustering). It is entirely based on the experimentally observable data and is free from restrictive prior assumptions. The new method exhibits highly accurate clustering performance, which is compared on benchmark datasets with competitive alternative approaches. Experimental results demonstrate that ADD clustering significantly outperforms other clustering methods while requiring no restrictive user- or problem-specific parameters or assumptions. The new clustering method is a solid basis for further applications in the field of data analytics.
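The general flavor of density-driven clustering can be sketched as follows. This is a generic illustration, not the ADD algorithm itself: each point's density is its neighbour count within a radius, points are processed densest-first, and a point either joins the cluster of its nearest already-processed neighbour within the radius or, if none exists, seeds a new cluster.

```python
import numpy as np

def density_cluster(X, radius=1.0):
    """Generic density-driven clustering sketch (not the ADD method):
    follow the nearest denser-or-equal neighbour within `radius`,
    or start a new cluster when there is none."""
    n = len(X)
    D = np.linalg.norm(X[:, None] - X[None], axis=2)   # pairwise distances
    density = (D < radius).sum(axis=1)                 # neighbour counts
    order = np.argsort(-density, kind="stable")        # densest first
    labels = -np.ones(n, dtype=int)
    next_label = 0
    for rank, i in enumerate(order):
        done = order[:rank]                            # already-labelled points
        near = done[D[i, done] < radius]
        if near.size == 0:
            labels[i] = next_label                     # local density peak
            next_label += 1
        else:
            labels[i] = labels[near[D[i, near].argmin()]]
    return labels

# two compact blobs far apart: each becomes one cluster
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels = density_cluster(X, radius=1.0)
# labels -> [0, 0, 0, 1, 1, 1]
```

Note that the radius here plays the role of a user-set parameter, which is exactly what ADD clustering is designed to avoid by deriving density thresholds from the data itself.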
Though subspace clustering, ensemble clustering, alternative clustering, and multi-view clustering are different approaches motivated by different problems and aiming at different goals, these fields share similar problems. Here we briefly survey these areas from the point of view of subspace clustering and, based on this survey, try to identify problems where the different research areas could learn from each other.
In the current world, there is a need to analyze and extract information from data. Clustering is one such analytical method; it involves partitioning data into groups of similar objects. Each group, known as a cluster, consists of objects that have affinity within the cluster and disparity with objects in other groups. This paper examines and evaluates various data clustering algorithms. The two major categories of clustering approaches are partitional and hierarchical clustering. The algorithms dealt with here are: the k-means clustering algorithm, the hierarchical clustering algorithm, the density-based clustering algorithm, the self-organizing map algorithm, and the expectation-maximization clustering algorithm. All of these algorithms are explained and analyzed based on factors such as the size and type of the dataset, the number of clusters created, quality, accuracy, and performance. The paper also provides information about the tools used to implement the clustering approaches; the purpose of discussing the various software tools is to help beginners and new researchers understand how they work, so that they can come up with new products and approaches.