Active Sample Selection Through Sparse Neighborhood for Imbalanced Datasets

2019 
For imbalanced datasets, a novel bias-based active sampling learning algorithm is proposed for the first time. The algorithm combines two important sampling factors, minority confidence and instance informativeness, within an active learning framework. The sampling strategy aims to account for the utility of the selected instances while avoiding the sampling of invalid majority instances. For this purpose, a novel label propagation algorithm through a sparse neighborhood, independent of the hyperparameter k, is proposed to calculate minority confidence. Unlike other semi-supervised learning methods, the algorithm learns the instances using sparse coding theory and adaptively constructs the sparse neighborhood and the sparse neighborhood graph. To calculate instance informativeness, we propose a measure based on the nearest boundary distance. It mainly uses direction vector features and a heuristic search strategy to construct an auxiliary decision boundary, against which the informativeness of each instance is then evaluated.
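The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no formulas, so every function name, the non-negative ISTA solver used for sparse coding, the clamped propagation rule, the exponential informativeness proxy, and the product used to combine the two factors are assumptions made for illustration. In particular, the auxiliary decision boundary is not constructed here; the sketch simply takes a precomputed distance to some boundary as input.

```python
import numpy as np

def sparse_neighborhood_weights(X, lam=0.05, n_iter=200):
    """Code each row of X as a sparse, non-negative combination of the other rows.

    Assumption: a non-negative lasso solved with ISTA stands in for the
    paper's sparse coding step. Nonzero weights define the neighborhood,
    so no neighbor count k has to be chosen.
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        others = np.arange(n) != i
        D = X[others].T                          # dictionary, shape (d, n-1)
        step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)
        w = np.zeros(n - 1)
        for _ in range(n_iter):
            grad = D.T @ (D @ w - X[i])
            # proximal step for the l1 penalty under a non-negativity constraint
            w = np.maximum(w - step * (grad + lam), 0.0)
        W[i, others] = w
    return W

def minority_confidence(W, labels, n_iter=50):
    """Propagate labels over the sparse graph (1 minority, 0 majority, -1 unlabeled)."""
    row_sums = W.sum(axis=1, keepdims=True)
    P = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    f = np.where(labels == 1, 1.0, np.where(labels == 0, 0.0, 0.5))
    clamp = f.copy()
    labeled = labels >= 0
    has_neighbors = row_sums.ravel() > 0
    for _ in range(n_iter):
        f = np.where(has_neighbors, P @ f, f)    # isolated nodes keep their value
        f[labeled] = clamp[labeled]              # known labels stay fixed
    return f

def select_queries(conf, boundary_dist, labels, batch=1):
    """Rank unlabeled instances by minority confidence times informativeness.

    Assumption: informativeness decays with distance to a decision boundary,
    and the two factors are combined by a simple product.
    """
    score = conf * np.exp(-np.abs(boundary_dist))
    score[labels >= 0] = -np.inf                 # never re-query labeled instances
    return np.argsort(-score)[:batch]

# Toy data: one labeled minority point, one labeled majority point, two unlabeled.
X = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]])
labels = np.array([1, 0, -1, -1])
W = sparse_neighborhood_weights(X)
conf = minority_confidence(W, labels)
picked = select_queries(conf, boundary_dist=np.full(4, 0.5), labels=labels)
```

In this toy setup the unlabeled point nearest the labeled minority instance receives the higher confidence and is queried first, which is the bias toward likely minority instances that the abstract describes.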