Improving object detection accuracy with region and regression based deep CNNs

2017 
Object detection has improved greatly with convolutional neural networks (CNNs), high-capacity visual models that yield hierarchies of discriminative features. CNN-based object detection generally falls into two streams: region based detection and regression based detection. In this paper, we aim to further advance object detection performance by properly utilizing the complementary results of these two streams. By investigating the errors of several previous state-of-the-art methods from both streams, we find that their detection results are complementary in object recognition and localization. Region based methods achieve high recall but struggle with localization, while regression based methods make fewer localization errors by iteratively regressing the object to the target location. Driven by these observations, we propose two fusion paradigms to combine the results of the two streams. The first is direct fusion, which pools the complementary results of the two streams and applies non-maximum suppression (NMS) and a voting operation to make full use of them. Because direct fusion may compromise the original performance of the object detectors, we also propose a second method, a modified voting operation that only refines the box coordinates without otherwise affecting the original detections, and that further boosts performance through an adding operation. Extensive experiments show that both ensemble paradigms improve on the state-of-the-art results on the Pascal VOC dataset.
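As an illustration of the direct fusion idea, the following is a minimal sketch of pooling the boxes from two detectors, running NMS, and then refining each kept box by score-weighted voting over its overlapping neighbours. It is not the paper's implementation: the function names (`iou`, `nms`, `fuse`), the `[x1, y1, x2, y2]` box layout, the score-weighted averaging rule, and the 0.5 thresholds are all assumptions made for this example.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = np.argsort(-scores, kind="stable")
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        if rest.size == 0:
            break
        # Drop every remaining box that overlaps the kept one too much.
        order = rest[iou(boxes[i], boxes[rest]) <= thresh]
    return keep

def fuse(boxes_a, scores_a, boxes_b, scores_b, nms_thresh=0.5, vote_thresh=0.5):
    """Pool two detectors' outputs (one class), apply NMS, then box voting.

    Assumes non-degenerate boxes and strictly positive scores.
    """
    boxes = np.vstack([boxes_a, boxes_b]).astype(float)
    scores = np.concatenate([scores_a, scores_b]).astype(float)
    keep = nms(boxes, scores, nms_thresh)
    fused = []
    for i in keep:
        # Voting (assumed rule): score-weighted mean of all pooled boxes
        # that overlap the kept box by at least vote_thresh.
        mask = iou(boxes[i], boxes) >= vote_thresh
        w = scores[mask]
        fused.append((w[:, None] * boxes[mask]).sum(axis=0) / w.sum())
    return np.array(fused), scores[keep]

# Hypothetical detections for a single class from the two streams.
region_boxes = np.array([[10, 10, 50, 50]])
region_scores = np.array([0.9])
regress_boxes = np.array([[12, 12, 52, 52], [100, 100, 140, 140]])
regress_scores = np.array([0.9, 0.8])

fused_boxes, fused_scores = fuse(region_boxes, region_scores,
                                 regress_boxes, regress_scores)
```

In this toy input the two overlapping boxes collapse into one detection whose coordinates are the weighted mean of both, while the isolated box passes through unchanged. A variant closer in spirit to the second paradigm would skip the pooled NMS and apply only the voting step to one detector's original boxes, but the abstract does not give enough detail to reproduce it.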