    Fuzzy Clustering Ensemble Considering Cluster Dependability
    Abstract:
    Clustering ensembles have become increasingly popular in recent years; they combine several base clusterings into one that is likely to be better and more robust. Nonetheless, the dependability (durability) of fuzzy clusters has been overlooked in the majority of proposed clustering ensemble approaches, which leaves them vulnerable to low-quality fuzzy base clusters. Although a few efforts have been made to address this, they appear to consider each base clustering separately, without regard to its local diversity. In this paper, to compensate for this weakness, a new fuzzy clustering ensemble approach is proposed that uses a weighting strategy at the fuzzy cluster level. Each fuzzy cluster has a contribution weight computed from its reliability (dependability/durability). After the fuzzy cluster dependability has been computed, a dependability-based fuzzy cluster-wise weighted matrix (DFCWWM) is computed. Finally, the final clustering is obtained by applying the traditional FCM clustering algorithm to the DFCWWM. The time complexity of the proposed approach is linear in the number of data points. The proposed approach has been assessed on 15 standard datasets. The experimental evaluation indicates that the proposed method performs better than the state-of-the-art methods.
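The abstract names traditional FCM as the last step but does not say how the dependability weights or the DFCWWM are built, so neither is reproduced here. As a rough illustration only, the following is a minimal sketch of standard fuzzy c-means in plain Python. The function name `fcm`, its parameters (`m`, `iters`, `tol`, `seed`), and the random initialization are assumptions made for this example, not the authors' implementation.

```python
import math
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Hypothetical helper for illustration; returns (centers, memberships).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)))
        # Membership update: proportional to d_ik ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dist = [math.dist(points[i], ck) for ck in centers]
            if min(dist) == 0.0:
                # The point sits on a center: share membership among those centers.
                zeros = [k for k in range(c) if dist[k] == 0.0]
                new = [1.0 / len(zeros) if k in zeros else 0.0 for k in range(c)]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                s = sum(inv)
                new = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        if shift < tol:
            break
    return centers, u
```

In the approach described above, the input to this step would be the rows of the DFCWWM rather than the raw data points; here `points` is simply any list of numeric tuples.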
    Keywords:
    k-medians clustering
    Abstract: In the big data era, clustering is one of the most popular data mining methods. The majority of clustering algorithms suffer from complications such as difficulty determining the number of clusters automatically, poor clustering precision, inconsistent clustering across datasets, and parameter dependence. A new fuzzy autonomous clustering solution, the Meskat-Mahmudul (MM) clustering algorithm, is proposed to overcome the difficulty of parameter-free automatic cluster number determination and to improve clustering accuracy. The MM clustering algorithm finds the exact number of clusters based on the Average Silhouette method in multivariate mixed-attribute datasets, including real-time gene expression datasets, and deals with missing values, noise, and outliers. The MM Extended K-Means (MMK) clustering algorithm is an enhancement of the K-Means algorithm that provides automatic cluster discovery and runtime cluster placement. Several validation methods are used to evaluate the clusters and to certify optimum cluster partitioning. Several datasets are used to compare the performance of the proposed algorithms with other algorithms in terms of time complexity and clustering efficiency. Finally, the MM and MMK clustering algorithms are found to be superior to conventional algorithms.
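The abstract does not describe how the MM algorithm applies the Average Silhouette method, so the following is only a sketch of the standard average silhouette score in plain Python. The function name `average_silhouette` and the Euclidean distance are assumptions for this example; a cluster-number search would compute this score for each candidate labeling and keep the one with the highest value.

```python
import math


def average_silhouette(points, labels):
    """Mean silhouette score of a hard labeling (hypothetical helper)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # usual convention: a singleton cluster scores 0
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items() if lab != labels[i])
        denom = max(a, b)
        if denom > 0.0:
            total += (b - a) / denom
    return total / len(points)
```

Scores close to 1 indicate compact, well-separated clusters, while negative scores suggest that many points sit closer to another cluster than to their own.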
    Data stream clustering
    Single-linkage clustering
    k-medians clustering
    Clustering high-dimensional data
    Abstract: Fuzzy C-Means Clustering (FCM) is a widely known technique for data clustering, for example in image segmentation. This study trials the Normalized Cross Correlation (NCC) method in the FCM algorithm to determine the initial fuzzy pseudo-partition matrix, which was previously generated by a random process. Clustering is a process of grouping data and is a form of unsupervised learning. Data mining generally has two techniques for clustering: hierarchical clustering and partitional clustering. The FCM algorithm groups data by adding up the level of similarity between pairs of data groups. The similarity of the data is measured by a correlation value, the Normalized Cross Correlation (NCC). The methodology of this research consists of the steps taken to measure clustering performance when NCC is used to determine the initial fuzzy pseudo-partition matrix in the FCM algorithm. Data clustering using NCC with the FCM algorithm gave better results than the ordinary FCM algorithm. The proposed method improves accuracy by 4.27%, the Rand index by 4.73%, and the F-measure by 8.26%.
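The abstract does not state which NCC variant is used or how the correlation values are turned into an initial pseudo-partition matrix. As a hedged illustration of the similarity measure alone, the sketch below computes the zero-mean normalized cross correlation between two equal-length vectors. The function name `ncc`, the choice of the zero-mean variant, and the convention of returning 0.0 for a constant vector are all assumptions for this example.

```python
import math


def ncc(x, y):
    """Zero-mean normalized cross correlation of two sequences, in [-1, 1]."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if den == 0.0:
        # Assumed convention: a constant vector has no defined correlation.
        return 0.0
    return num / den
```

Values near 1 mean the two vectors vary together, values near -1 mean they vary in opposite directions, and values near 0 mean little linear relationship.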
    Single-linkage clustering
    k-medians clustering
    FLAME clustering
    Data stream clustering
    Clustering high-dimensional data
    A novel clustering technique based on the projection onto convex sets (POCS) method, called the POCS-based clustering algorithm, is proposed in this paper. The proposed algorithm exploits a parallel projection method of POCS to find appropriate cluster prototypes in the feature space. The algorithm treats each data point as a convex set and projects the cluster prototypes in parallel onto the member data points. The projections are convexly combined to minimize the objective function for data clustering. The performance of the proposed algorithm is verified through experiments on various synthetic datasets. The experimental results show that the proposed POCS-based clustering algorithm is competitive and efficient in terms of clustering error and execution speed when compared with other conventional clustering methods, including the Fuzzy C-Means (FCM) and K-Means clustering algorithms.
    Data stream clustering
    k-medians clustering
    Single-linkage clustering
    Clustering high-dimensional data
    Citations (0)
    Data stream clustering
    Single-linkage clustering
    k-medians clustering
    Clustering high-dimensional data
    Citations (11)
    Abstract: The classic Fuzzy C-means (FCM) algorithm has limited clustering performance and is prone to misclassifying border points. This study offers a bi-directional FCM clustering ensemble approach that takes local information into account (LI_BIFCM) to overcome these challenges and increase clustering quality. First, FCM is run multiple times with randomized initial cluster centers to create several membership matrices, and a vertical ensemble is performed using the maximum membership principle. Second, after each execution of FCM, multiple local membership matrices of the sample points are created using multiple K-nearest neighbors, and a horizontal ensemble is performed. Multiple horizontal ensembles can be created from multiple FCM runs. Finally, the final clustering results are obtained by combining the vertical and horizontal clustering ensembles. Twelve data sets were chosen for testing from both synthetic and real data sources. In the experiments, LI_BIFCM outperformed four traditional clustering algorithms and three clustering ensemble algorithms. Furthermore, the final clustering results have only a weak correlation with the bi-directional cluster ensemble parameters, indicating that the suggested technique is robust.
    Single-linkage clustering
    k-medians clustering
    Data stream clustering
    Spectral Clustering
    Clustering high-dimensional data
    To solve the problem that the density clustering algorithm is sensitive to neighborhood parameters, this article introduces a density-based fuzzy adaptive clustering algorithm. Without a predefined number of clusters or predefined neighborhood parameters, the algorithm adaptively determines the neighborhood radius to obtain the density of each sample and adds cluster centers based on that density. A new validity measure for fuzzy clustering is proposed to choose the best number of clusters, so that the sensitivity of density clustering is eliminated. UCI benchmark data sets are used to compare the proposed algorithm with the traditional density clustering algorithm. Experimental results demonstrate that the proposed algorithm effectively improves clustering accuracy and adaptability.
    DBSCAN
    FLAME clustering
    Data stream clustering
    k-medians clustering
    Single-linkage clustering
    Adaptability
    Citations (6)
    Clustering is a method of data analysis that does not use supervised data. Even-sized clustering based on optimization (ECBO) is a clustering algorithm that focuses on cluster size, with the constraint that all cluster sizes must be the same. However, this constraint makes ECBO inconvenient to apply in cases where a certain margin of cluster size is allowed. It is believed that this issue can be overcome by applying a fuzzy clustering method, since fuzzy clustering can represent the membership of data in clusters more flexibly. In this paper, we propose a new even-sized clustering algorithm based on fuzzy clustering and verify its effectiveness through numerical examples.
    Margin (machine learning)
    Single-linkage clustering
    k-medians clustering
    FLAME clustering
    Constrained clustering
    Data stream clustering
    Clustering high-dimensional data