Class Imbalance Oriented Logistic Regression

2014 
Class imbalance is common in real-world data, and traditional state-of-the-art classifiers often perform poorly on imbalanced data sets. In this paper, we apply the logistic regression model to the class-imbalance problem and propose a novel algorithm, CILR (Class Imbalance oriented Logistic Regression), to handle imbalanced data sets. Unlike traditional logistic regression, which optimizes the maximum likelihood estimation (MLE) objective, CILR optimizes a proposed objective function based on both MLE and the recall metric. This objective function makes full use of the characteristics of both the majority class and the minority class simultaneously, which ensures that CILR improves the classification performance of logistic regression on the rare class without decreasing overall accuracy. Experimental results on 16 data sets show that CILR performs significantly better than traditional logistic regression, under-sampled logistic regression, and over-sampled logistic regression.
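The abstract does not state the exact form of the CILR objective, so the following is only an illustrative sketch of the general idea of combining a likelihood term with a recall term, not the authors' formulation. It assumes a hypothetical objective of mean log-likelihood plus `lam` times a differentiable "soft recall" (the mean predicted probability over true positives). The names `fit`, `soft_recall`, and `lam`, the synthetic data, and the learning-rate settings are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    # Append a constant column so the last weight acts as the intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])

def soft_recall(p, y):
    # Assumed surrogate for recall: mean predicted probability on true positives.
    pos = y == 1
    return p[pos].sum() / max(int(pos.sum()), 1)

def fit(X, y, lam=0.0, lr=0.5, steps=2000):
    """Gradient ascent on mean log-likelihood + lam * soft recall (hypothetical objective)."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    pos = y == 1
    n_pos = max(int(pos.sum()), 1)
    for _ in range(steps):
        p = sigmoid(Xb @ w)
        grad_ll = Xb.T @ (y - p) / len(y)                          # likelihood term
        grad_rec = Xb[pos].T @ (p[pos] * (1.0 - p[pos])) / n_pos   # soft-recall term
        w += lr * (grad_ll + lam * grad_rec)
    return w

def predict(X, w):
    return (sigmoid(add_bias(X) @ w) >= 0.5).astype(int)

def recall(y_true, y_pred):
    pos = y_true == 1
    return float((y_pred[pos] == 1).sum()) / max(int(pos.sum()), 1)

# Synthetic imbalanced data (assumption): 950 majority vs. 50 minority points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (950, 2)), rng.normal(1.5, 1.0, (50, 2))])
y = np.concatenate([np.zeros(950), np.ones(50)])

w_plain = fit(X, y, lam=0.0)      # ordinary logistic regression (MLE only)
w_weighted = fit(X, y, lam=1.0)   # likelihood plus the assumed recall term

recall_plain = recall(y, predict(X, w_plain))
recall_weighted = recall(y, predict(X, w_weighted))
print(f"recall (MLE only): {recall_plain:.2f}")
print(f"recall (MLE + soft recall): {recall_weighted:.2f}")
```

Setting `lam=0.0` recovers plain maximum-likelihood logistic regression, which makes the two fits directly comparable; larger `lam` values trade some likelihood for higher predicted probabilities on the minority class.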