MAP PROBABILISTIC DENSITY BASED SUBSPACE CLUSTERING FOR DIMENSIONALITY REDUCTION OF BIG DATA ANALYTICS

2020 
Density-based subspace clustering algorithms focus on finding dense clusters of arbitrary shape and size. Most existing density-based subspace clustering algorithms in the literature are less effective and less accurate when given a big dataset as input. To overcome these limitations, a MAP Probabilistic Density based Subspace Clustering (MAPPD-SC) technique is introduced. The MAPPD-SC technique is designed for high-dimensional data and aims to improve both clustering accuracy and dimensionality reduction. First, the MAPPD-SC technique uses a MAP Probabilistic Density Based Subspace Clustering (MPDSC) algorithm to group similar data with higher accuracy and minimal time. During big data clustering, the MAPPD-SC technique applies the maximum a posteriori (MAP) calculation to cluster more closely related data together, thereby forming an optimal number of clusters with high accuracy. After clustering is complete, the MAPPD-SC technique builds a Fusion Tree Data Storage Structure (FTDSS) to store the clustered big data with reduced space complexity. Using fusion tree concepts, the FTDSS stores only the bit values of the clustered data in memory. These bit values take up a minimal amount of memory space. In this way, the proposed MAPPD-SC technique reduces the dimensionality of big data for effective big data analytics. Experimental evaluation of the MAPPD-SC technique is carried out on clustering accuracy, clustering time, false positive rate, and space complexity with respect to the number of climate data points, using the El Nino Data Set.
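The MAP step described above can be illustrated with a minimal sketch. The abstract does not specify the density model or the data layout, so everything here is an assumption made for illustration: each cluster is modeled as a diagonal Gaussian with a prior weight, and the names `log_gaussian`, `map_assign`, and the `(prior, mean, var)` tuple layout are hypothetical. This is not the MPDSC algorithm itself, only the generic MAP assignment rule it builds on.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal Gaussian at point x (assumed density model)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def map_assign(x, clusters):
    """Return the index of the cluster with the highest posterior for x.

    clusters: list of (prior, mean, var) tuples (hypothetical layout).
    The posterior is proportional to prior * likelihood, so comparing
    log(prior) + log(likelihood) across clusters is sufficient.
    """
    scores = [math.log(p) + log_gaussian(x, m, v) for p, m, v in clusters]
    return max(range(len(scores)), key=scores.__getitem__)

# Two illustrative clusters with equal priors and unit variance.
clusters = [
    (0.5, [0.0, 0.0], [1.0, 1.0]),
    (0.5, [5.0, 5.0], [1.0, 1.0]),
]
points = [[0.2, -0.1], [4.8, 5.3], [2.0, 2.0]]
labels = [map_assign(x, clusters) for x in points]
```

With equal priors the rule reduces to picking the cluster whose density is highest at the point; unequal priors shift the decision toward the more probable cluster.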