Cross-validation Metrics for Evaluating Classification Performance on Imbalanced Data

2019 
Imbalanced data is a common issue in classification, because training on such data tends to fit the model too closely to the majority class. Ensemble techniques are one alternative for handling imbalanced data. This paper compares metrics for measuring classification performance on imbalanced data through an empirical study of cabbage image classification. The metrics compared are accuracy, F1 score, g-mean, MCC (Matthews correlation coefficient), Cohen's Kappa statistic, and AUC. Three ensemble methods are used: bagging, Breiman's boosting, and Freund's boosting. The results of the empirical study indicate that accuracy, F1 score, and g-mean produce values that do not reflect the actual confusion cases. Accuracy, F1 score, g-mean, MCC, and Kappa yield the same values under different confusion-matrix conditions, whereas AUC yields different values for different confusion matrices. Based on these results, AUC is the most robust metric for measuring performance under imbalanced conditions.
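As a minimal sketch of the comparison the abstract describes (not the paper's code), the snippet below computes all six metrics with scikit-learn on a synthetic 9:1 imbalanced dataset, using a bagging ensemble as a stand-in for one of the three ensembles; the dataset, classifier settings, and split are illustrative assumptions, since the paper's cabbage image data is not available here.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, f1_score, matthews_corrcoef,
                             cohen_kappa_score, roc_auc_score, recall_score)

# Synthetic 9:1 imbalanced binary data (illustrative stand-in for the
# paper's cabbage images).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Bagging ensemble (one of the three ensembles the paper compares).
clf = BaggingClassifier(random_state=0).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)
y_prob = clf.predict_proba(X_te)[:, 1]

# g-mean = sqrt(sensitivity * specificity); scikit-learn has no built-in,
# so compute it from the per-class recalls.
sens = recall_score(y_te, y_pred)                # recall on the minority class
spec = recall_score(y_te, y_pred, pos_label=0)   # recall on the majority class
g_mean = np.sqrt(sens * spec)

print(f"accuracy: {accuracy_score(y_te, y_pred):.3f}")
print(f"F1:       {f1_score(y_te, y_pred):.3f}")
print(f"g-mean:   {g_mean:.3f}")
print(f"MCC:      {matthews_corrcoef(y_te, y_pred):.3f}")
print(f"Kappa:    {cohen_kappa_score(y_te, y_pred):.3f}")
print(f"AUC:      {roc_auc_score(y_te, y_prob):.3f}")

Note that AUC is computed from the predicted scores (y_prob) rather than the hard labels, which is one reason it can distinguish classifiers that produce identical confusion matrices, consistent with the abstract's finding.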