Towards Deeper Insights into Deep Learning from Imbalanced Data
2017
Imbalanced performance usually happens to those classifiers (including deep neural networks) trained on imbalanced training data. These classifiers are more likely to make mistakes on minority class instances than on those majority class ones. Existing explanations attribute the imbalanced performance to the imbalanced training data. In this paper, using deep neural networks, we strive for deeper insights into the imbalanced performance. We find that imbalanced data is a neither sufficient nor necessary condition for imbalanced performance in deep neural networks, and another important factor for imbalanced performance is the distance between the majority class instances and the decision boundary. Based on our observations, we propose a new under-sampling method (named Moderate Negative Mining) which is easy to implement, state-of-the-art in performance and suitable for deep neural networks, to solve the imbalanced classification problem. Various experiments validate our insights and demonstrate the superiority of the proposed under-sampling method.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
21
References
1
Citations
NaN
KQI