Weakly Supervised Deep Learning for Objects Detection from Images

2020 
Training a convolutional neural network for object detection requires a large number of images with pixel-level annotations. Weakly supervised learning, which remains an open challenge, uses image-level labels to circumvent the lack of semantic examples. This paper proposes a cascaded deep network architecture that leverages class activation mapping with global average pooling. The first stage of this architecture learns to infer object localization maps from image-level annotations and uses these maps to generate bounding boxes of objects in every image. The resulting image patches are adhesion areas in the original image. In the second stage, the image patches are used to train the detection network. Experiments are conducted on the PASCAL VOC 2012 dataset. The proposed method obtains a mean average precision of 87.2% and demonstrates classification performance that is competitive with state-of-the-art methods. In the evaluation of object localization, the recall of our method is improved by 9%.
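
The first-stage idea, class activation mapping with global average pooling followed by box extraction, can be illustrated with a minimal pure-Python sketch. This is not the paper's implementation: the function names, the toy 4x4 feature maps, the class weights, and the 20% relative threshold are all assumptions chosen for illustration, and a single box is drawn around every above-threshold cell instead of separating connected regions.

```python
def class_activation_map(features, weights):
    """Weighted sum of K feature maps (each H x W nested lists) for one class."""
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, w in zip(features, weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += w * fmap[y][x]
    return cam


def global_average_pool(features):
    """Spatial mean of each feature map, giving one value per channel."""
    return [sum(map(sum, fmap)) / (len(fmap) * len(fmap[0])) for fmap in features]


def class_score(features, weights):
    """Class logit (bias omitted): linear layer applied to the pooled features."""
    return sum(w * p for w, p in zip(weights, global_average_pool(features)))


def bounding_box(cam, rel_threshold=0.2):
    """Box (x_min, y_min, x_max, y_max) around cells >= rel_threshold * peak.

    The 0.2 default is an assumed value, not one reported in the abstract.
    """
    peak = max(max(row) for row in cam)
    if peak <= 0:
        return None  # no positive evidence for this class
    cutoff = rel_threshold * peak
    hits = [(x, y) for y, row in enumerate(cam)
            for x, value in enumerate(row) if value >= cutoff]
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy input: channel 0 fires on a central 2x2 object, channel 1 is uniform background.
toy_features = [
    [[0, 0, 0, 0], [0, 4, 4, 0], [0, 4, 4, 0], [0, 0, 0, 0]],
    [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]],
]
toy_weights = [1.0, -0.5]  # hypothetical weights of one class in the final linear layer

toy_cam = class_activation_map(toy_features, toy_weights)
toy_box = bounding_box(toy_cam)
toy_score = class_score(toy_features, toy_weights)
```

On this toy input the box covers the central 2x2 region, and the class score equals the spatial mean of the activation map. That identity is why a network trained only with image-level labels through global average pooling can still yield a localization map: the same weights that score the class also weight the spatial evidence.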