Fuzzy Clustering Technique for Numerical and Categorical dataset

2010 
Data clustering is a common data analysis technique used in many fields, including machine learning, data mining, pattern recognition, image analysis, and bioinformatics. In fuzzy logic systems, fuzzy c-means (FCM) is a clustering method that allows one piece of data to belong to two or more clusters. Traditional fuzzy c-means type algorithms are limited to numeric data. We present a modified description of the cluster center that overcomes the numeric-only limitation of the fuzzy c-means algorithm and provides a better characterization of clusters. For categorical data, we consider the fuzzy k-modes algorithm. We propose a new cost function and a distance measure based on the co-occurrence of values. The measures also take into account the significance of an attribute to the clustering process. The fuzzy k-modes algorithm for clustering categorical data is extended by representing clusters of categorical data with fuzzy centroids. Using fuzzy centroids makes it possible to fully exploit the power of fuzzy sets in representing the uncertainty in the classification of categorical data. The new fuzzy k-modes algorithm is more effective than the other existing k-modes algorithms.
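The idea of a fuzzy centroid for categorical data can be illustrated with a minimal Python sketch. It assumes one common formulation from the fuzzy k-modes literature: each attribute of a centroid is a fuzzy set of category values weighted by the memberships of the records that carry them, and memberships follow the usual fuzzy c-means style update. The function names, the fuzzifier `m`, and the simple matching-based dissimilarity are illustrative assumptions; the abstract does not specify the proposed cost function or the co-occurrence-based distance, so they are not reproduced here.

```python
from collections import defaultdict


def fuzzy_centroid(records, memberships, m=2.0):
    """Build one cluster's fuzzy centroid (hypothetical helper).

    records     : list of tuples of categorical values
    memberships : membership degree of each record in this cluster
    Returns one dict per attribute mapping category value -> weight,
    with the weights of each attribute summing to 1.
    """
    centroid = []
    for a in range(len(records[0])):
        weights = defaultdict(float)
        for rec, u in zip(records, memberships):
            weights[rec[a]] += u ** m  # records with higher membership count more
        total = sum(weights.values()) or 1.0  # guard against an empty cluster
        centroid.append({v: w / total for v, w in weights.items()})
    return centroid


def dissimilarity(centroid, record):
    """Assumed measure: per attribute, 1 minus the weight of the record's value."""
    return sum(1.0 - attr.get(value, 0.0) for attr, value in zip(centroid, record))


def update_memberships(centroids, record, m=2.0, eps=1e-12):
    """Fuzzy c-means style membership update over all clusters."""
    d = [max(dissimilarity(c, record), eps) for c in centroids]
    exponent = 1.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in d) for d_i in d]


# Toy data: three records with two categorical attributes, two clusters.
records = [("red", "small"), ("red", "large"), ("blue", "large")]
centroids = [
    fuzzy_centroid(records, [0.9, 0.6, 0.1]),
    fuzzy_centroid(records, [0.1, 0.4, 0.9]),
]
print(update_memberships(centroids, ("red", "small")))
```

In this toy example the record `("red", "small")` receives a higher membership in the first cluster while still keeping a non-zero membership in the second, which is the partial belonging that the fuzzy approach is meant to capture.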