Depth Limitation and Splitting Criteria Optimization on Random Forest for Efficient Human Activity Classification

2019 
Random Forest (RF) is known as one of the best classifiers in many fields. RF ensembles are parallelizable, fast to train and to predict, robust to outliers, able to handle unbalanced data, and have low bias and moderate variance. Despite these advantages, there are still opportunities to increase RF efficiency. Because there is no recommendation regarding the number of trees in an RF ensemble, that number can become very large, which increases the computational complexity of RF. The common recommendation not to prune the decision trees further aggravates this condition. This research attempts to build an efficient RF ensemble while maintaining its accuracy, particularly for the activity classification problem. Data collection is performed using the accelerometer sensor of a smartphone. The data used in this research are collected from five people who perform 11 different activities. Each activity is carried out five times to enrich the data. This study uses two steps to improve the efficiency of activity classification: 1) selecting the optimal splitting criterion for activity classification, and 2) measured pruning to limit the tree depth in the RF ensemble. The first method can be applied to determine the splitting criterion most suitable for activity classification using Random Forest. In this case, the decision model built using the Gini Index produces the highest accuracy. The second method successfully builds less complex pruned trees without reducing classification accuracy. The results show that the method applied to Random Forest in this study produces a decision model that is simple yet accurate for classifying activity.
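The splitting criteria compared in the first step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy activity labels, and the candidate split are all assumptions made for illustration, and entropy is included only as a commonly used alternative to the Gini Index. The sketch computes how much a candidate split reduces label impurity under each criterion.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class proportions, in bits."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def impurity_decrease(parent, left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


# Hypothetical toy labels (not taken from the study's dataset).
parent = ["walk", "walk", "run", "run", "sit", "sit"]
left = ["walk", "walk", "run"]
right = ["run", "sit", "sit"]

gini_gain = impurity_decrease(parent, left, right, gini)
entropy_gain = impurity_decrease(parent, left, right, entropy)
print(f"Gini decrease: {gini_gain:.4f}")
print(f"Entropy decrease: {entropy_gain:.4f}")
```

A tree learner evaluates such a score for every candidate split at a node and keeps the best one, so the choice of criterion can change which splits are selected and therefore the shape of the resulting trees.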