Two Stages Outlier Removal as Pre-processing Digitizer Data on Fine Motor Skills (FMS) Classification Using Covariance Estimator and Isolation Forest

2021 
Improving classification accuracy is an important problem in machine learning, especially for diverse datasets that contain outliers. Data streams and sensor readings produce large volumes of data, which leaves room for a lot of noise. This noise disrupts or even degrades the performance of a machine learning model. Data that is free of noise is therefore needed to obtain good accuracy and improve model performance. This research proposes a two-stage approach for detecting and removing outliers, using the covariance estimator and isolation forest methods as pre-processing for a classification process that determines fine motor skill (FMS). The dataset was generated by recording data directly during cursive writing on a digitizer. The data include the relative position of the stylus on the digitizer board. The x position, y position, z position, and pressure values are then used as features in the classification process. The observation and recording process generated a very large amount of data, some of which are outliers. In the experiments, the accuracy of FMS classification with the Random Forest classifier increases by 0.5-1% after outlier detection and removal using the covariance estimator and isolation forest. The highest accuracy achieved is 98.05%, compared with only about 97.3% without outlier removal.
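The two-stage removal described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes scikit-learn's `EllipticEnvelope` as the covariance estimator and `IsolationForest` as the second stage, and the helper name `two_stage_outlier_removal`, the `contamination` value, and the synthetic four-column data (standing in for x, y, z, and pressure) are all hypothetical.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest


def two_stage_outlier_removal(X, y, contamination=0.05, random_state=0):
    """Drop rows flagged by a covariance estimator, then by an isolation forest.

    Hypothetical helper; the contamination value is an assumption,
    not a setting reported in the abstract.
    """
    # Stage 1: robust covariance estimate. fit_predict returns +1 for
    # inliers and -1 for outliers.
    stage1 = EllipticEnvelope(contamination=contamination,
                              random_state=random_state)
    keep1 = stage1.fit_predict(X) == 1
    X1, y1 = X[keep1], y[keep1]

    # Stage 2: isolation forest fitted on the rows that survived stage 1.
    stage2 = IsolationForest(contamination=contamination,
                             random_state=random_state)
    keep2 = stage2.fit_predict(X1) == 1
    return X1[keep2], y1[keep2]


# Synthetic stand-in for digitizer features (x, y, z, pressure).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
X[:20] += 8.0  # inject a few gross outliers
y = rng.integers(0, 2, size=1000)  # placeholder class labels

X_clean, y_clean = two_stage_outlier_removal(X, y)
print(X.shape, "->", X_clean.shape)
```

The cleaned arrays would then be passed to a classifier such as scikit-learn's `RandomForestClassifier`; running the stages in sequence means the isolation forest only sees rows the covariance estimator already accepted.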