Balancing Exploration and Exploitation: A novel active learner for imbalanced data

2020 
Abstract Active learning has attracted great interest from researchers because it reduces the time, cost, and effort of labeling data in many applications. It aims to generate or select the smallest possible amount of training data that still ensures strong classification performance in the test phase. An active learner carries out two main steps: (i) selecting a set of promising queries from unlabeled data, and (ii) annotating the selected queries. Most active learners choose either the most informative or the most representative instances for annotation. In this paper, we combine these two criteria for query selection. First, in the exploration phase, the proposed algorithm explores the search space and tries in each iteration to visit new regions. This improves its ability to explore the space of minority classes in imbalanced data. Second, in the exploitation phase, the goal is to generate a new point in an uncertain region, which is expected to lie around the decision boundaries of the target functions. Some variants of the proposed algorithm do not require any labeled or unlabeled data in advance. Comparatively little existing work addresses this scenario. Experiments on synthetic and real datasets with different dimensions and imbalance ratios indicate that the proposed algorithm has significant advantages over various well-known active learners.
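The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names (`explore`, `exploit`, `active_learn`), the max-min distance rule for exploration, the nearest-neighbor margin used as an uncertainty score, the random candidate generation within `bounds`, and the strict alternation between phases are all assumptions made for illustration only.

```python
import math
import random


def explore(candidates, queried):
    """Exploration (assumed rule): pick the candidate farthest from all queried points."""
    if not queried:
        return candidates[0]
    return max(candidates, key=lambda c: min(math.dist(c, q) for q in queried))


def uncertainty(point, labeled):
    """Assumed uncertainty score: high when the two nearest classes are about equally close."""
    nearest = {}
    for x, y in labeled:
        d = math.dist(point, x)
        nearest[y] = min(d, nearest.get(y, math.inf))
    if len(nearest) < 2:
        return 0.0  # only one class seen so far, so no boundary can be estimated
    d1, d2 = sorted(nearest.values())[:2]
    return 1.0 / (1.0 + (d2 - d1))


def exploit(candidates, labeled):
    """Exploitation (assumed rule): pick the candidate in the most uncertain region."""
    return max(candidates, key=lambda c: uncertainty(c, labeled))


def active_learn(oracle, bounds, budget, n_candidates=200, seed=0):
    """Alternate exploration and exploitation, starting with no labeled or unlabeled data."""
    rng = random.Random(seed)
    labeled = []
    for t in range(budget):
        # Candidates are generated inside the search space rather than drawn from a pool.
        candidates = [
            tuple(rng.uniform(lo, hi) for lo, hi in bounds)
            for _ in range(n_candidates)
        ]
        classes_seen = {y for _, y in labeled}
        if t % 2 == 0 or len(classes_seen) < 2:
            x = explore(candidates, [p for p, _ in labeled])
        else:
            x = exploit(candidates, labeled)
        labeled.append((x, oracle(x)))  # the oracle annotates the selected query
    return labeled


if __name__ == "__main__":
    # Hypothetical imbalanced problem: the minority class is a small disc at the origin.
    oracle = lambda p: int(p[0] ** 2 + p[1] ** 2 < 0.1)
    data = active_learn(oracle, bounds=[(-1.0, 1.0), (-1.0, 1.0)], budget=30)
    print(sum(y for _, y in data), "minority-class queries out of", len(data))
```

In this sketch, exploration keeps running until at least two classes have been observed, since a boundary cannot be estimated from a single class; this mirrors the abstract's point that visiting new regions helps locate minority classes.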