Clustering-Based Data Reduction Approach to Speed up SVM in Classification and Regression Tasks

2020 
Support Vector Machine (SVM) is a popular machine learning algorithm that can tackle non-linear problems through the use of appropriate kernels. However, SVM can become infeasible for many applications that involve relatively large datasets. Its application becomes even more prohibitive in environments where the hardware has strict memory and processing limitations. By appropriately reducing the size of the training set, we can in turn speed up both the learning and the diagnosis of SVM. In this work we implement a data reduction approach that uses clustering to build a smaller, representative training set. The approach is extended to both classification and regression problems. Results evaluated on both a standard PC and a low-resource edge device showed better performance, with only a small loss in diagnosis accuracy in most cases. In the cases where a high loss was observed, the reduction approach still allowed the accuracy to be regained through faster hyper-parameter optimization.
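As a rough illustration of the reduction idea for the classification case, the sketch below clusters each class separately and keeps only the cluster centroids as the new training set. The abstract does not name the clustering algorithm or say how representatives are chosen, so plain k-means and the use of centroids are assumptions here, and the names `kmeans`, `reduce_training_set` and `k_per_class` are hypothetical. The reduced set would then be passed to an SVM trainer in place of the full data.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct samples
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean distance).
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        # Recompute each centroid as the mean of its members;
        # keep the previous centroid if a cluster ended up empty.
        centroids = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members
            else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids

def reduce_training_set(X, y, k_per_class):
    """Assumed scheme: replace each class's samples with up to k_per_class centroids."""
    X_red, y_red = [], []
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        k = min(k_per_class, len(members))
        for centroid in kmeans(members, k):
            X_red.append(centroid)
            y_red.append(label)
    return X_red, y_red

# Synthetic two-class data, for illustration only.
rng = random.Random(1)
X = ([(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(200)]
     + [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(200)])
y = [0] * 200 + [1] * 200

X_red, y_red = reduce_training_set(X, y, k_per_class=10)
print(len(X), "->", len(X_red), "training points")
```

Clustering per class keeps every representative unambiguously labelled, which is one simple way to preserve the class structure; the regression case would need a different way to attach target values to representatives and is not sketched here.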