Parallelized support vector machine (SVM) solving method based on Hadoop

2012 
The invention discloses a parallelized SVM solving method based on Hadoop. The method includes the steps of: storing the data in a distributed cluster file system; executing a random sampling process on each data block according to the distribution of the data, and distributing the randomly selected samples one by one to form a plurality of data subsets; performing a local first method on each data subset; and fusing the results of the local first method on the data subsets by averaging, then outputting the averaged result. With this method, Pegasos solving can be applied to massive data without loss of accuracy, running time is greatly shortened, and the method scales well.
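The steps in the abstract can be illustrated with a minimal single-machine sketch. It is not the patented implementation: it assumes the "local first method" is a plain Pegasos solver for a linear SVM without a bias term, it replaces distribution-aware sampling with a simple shuffle of each block, and it simulates the map and reduce phases with ordinary Python loops instead of Hadoop jobs. All function names (`make_blocks`, `split_into_subsets`, `pegasos`, `average_weights`, `parallel_pegasos`, `accuracy`) and parameter values are hypothetical.

```python
import random


def make_blocks(n_blocks=3, per_block=40, seed=1):
    """Toy stand-in for data blocks in a distributed file system (assumption)."""
    rng = random.Random(seed)
    blocks = []
    for _ in range(n_blocks):
        block = []
        for _ in range(per_block):
            y = rng.choice([-1, 1])
            # Linearly separable: the sign of the first feature matches the label.
            x = [y * rng.uniform(1.0, 3.0), rng.uniform(-0.5, 0.5)]
            block.append((x, y))
        blocks.append(block)
    return blocks


def split_into_subsets(blocks, k, seed=0):
    """Shuffle each block, then deal its samples one by one into k subsets."""
    rng = random.Random(seed)
    subsets = [[] for _ in range(k)]
    i = 0
    for block in blocks:
        shuffled = block[:]
        rng.shuffle(shuffled)
        for sample in shuffled:
            subsets[i % k].append(sample)
            i += 1
    return subsets


def pegasos(data, lam=0.01, iters=300, seed=0):
    """Pegasos: stochastic sub-gradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    for t in range(1, iters + 1):
        x, y = data[rng.randrange(len(data))]
        eta = 1.0 / (lam * t)  # step size 1 / (lambda * t)
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        w = [(1.0 - eta * lam) * wi for wi in w]  # shrink (regularizer)
        if margin < 1.0:  # margin violated: add the hinge-loss sub-gradient step
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def average_weights(weights):
    """Fuse the local solutions by coordinate-wise averaging."""
    n = len(weights)
    return [sum(col) / n for col in zip(*weights)]


def parallel_pegasos(blocks, k=4, lam=0.01, iters=300):
    subsets = split_into_subsets(blocks, k)
    # "Map" phase: each subset is solved independently (a plain loop here).
    local = [pegasos(s, lam, iters, seed=j) for j, s in enumerate(subsets)]
    # "Reduce" phase: average the local weight vectors.
    return average_weights(local)


def accuracy(w, data):
    """Fraction of samples whose label matches the sign of w . x."""
    hits = 0
    for x, y in data:
        score = sum(wi * xi for wi, xi in zip(w, x))
        hits += (1 if score > 0 else -1) == y
    return hits / len(data)
```

A call such as `parallel_pegasos(make_blocks(), k=4)` returns one averaged weight vector. Because each local solve touches only its own subset, the local solves are independent and could run as separate map tasks, leaving only the small weight vectors to be combined in the reduce step.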