Mass data clustering method

2013 
The invention discloses a clustering method for massive data, comprising the following steps: first, analyzing a massive data set and constructing the corresponding discernibility matrix; next, sparsifying the discernibility matrix to obtain the attribute core and a minimal decision table; and finally, performing dynamic clustering with a MapReduce distributed parallel algorithm to obtain the optimal clustering of the massive data. The clustered data can then be used for various data mining analyses and applications. The method takes full advantage of the discernibility matrix's ability to simplify information: high-dimensional massive data is first processed efficiently and simply through the matrix (dimensionality reduction, sparsification, and the like). This solves the problem that the MapReduce computing framework can only process a single data set and can neither directly support the processing of multiple related data sets nor effectively process high-dimensional massive data.
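The steps in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: it assumes the matrix step is the rough-set discernibility matrix, that the attribute core is the union of its single-attribute entries, and that the clustering stage can be modelled as one k-means iteration written in map/reduce style (the abstract names no specific clustering algorithm). All function names are hypothetical, and the code runs in a single process rather than on a MapReduce cluster.

```python
from collections import defaultdict
from itertools import combinations


def discernibility_matrix(rows, decisions):
    """For each pair of objects with different decisions, record the
    set of condition-attribute indices on which the two objects differ."""
    matrix = {}
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j]:
            matrix[(i, j)] = frozenset(
                a for a, (x, y) in enumerate(zip(rows[i], rows[j])) if x != y
            )
    return matrix


def attribute_core(matrix):
    """Assumed definition: the core is the union of all singleton entries."""
    core = set()
    for attrs in matrix.values():
        if len(attrs) == 1:
            core |= attrs
    return core


def map_assign(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists)), point


def reduce_mean(points):
    """Reduce step: average the points that share one centroid key."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans_step(points, centroids):
    """One k-means iteration: map, group by key (shuffle), then reduce."""
    groups = defaultdict(list)
    for key, value in (map_assign(p, centroids) for p in points):
        groups[key].append(value)
    # A centroid that received no points is left unchanged.
    return [
        reduce_mean(groups[k]) if k in groups else centroids[k]
        for k in range(len(centroids))
    ]


# Toy decision table: three condition attributes, one decision per row.
rows = [(1, 0, 1), (1, 1, 1), (0, 0, 1), (1, 0, 0)]
decisions = [0, 1, 0, 1]

matrix = discernibility_matrix(rows, decisions)
core = attribute_core(matrix)

# Dimensionality reduction: keep only the core attributes before clustering.
reduced = [tuple(float(r[a]) for a in sorted(core)) for r in rows]
new_centroids = kmeans_step(reduced, [reduced[0], reduced[3]])
```

In this sketch the reduction happens once, before clustering, so each map call only sees the shorter attribute vectors; in an actual MapReduce job the map and reduce functions would be distributed across workers and the step repeated until the centroids stop moving.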