Vehicle Counting Network with Attention-based Mask Refinement and Spatial-awareness Block Loss

2021 
Vehicle counting aims to estimate the number of vehicles in congested traffic scenes. Although object detection and crowd counting have made tremendous progress with the development of deep learning, vehicle counting remains challenging due to scale variations, viewpoint changes, inconsistent location distributions, diverse visual appearances and severe occlusions. In this paper, a Vehicle Counting Network (VCNet) is proposed to alleviate the problems of scale variation and inconsistent spatial distribution in congested traffic scenes. Specifically, VCNet is composed of two major components: (i) To capture vehicles at multiple scales across different vehicle types and camera viewpoints, a multi-scale density map estimation structure is designed around an attention-based mask refinement module. A multi-branch structure with hybrid dilated convolution blocks assigns a different receptive field to each branch, so that each branch generates a density map at its own scale. To aggregate these multi-scale density maps efficiently, the attention-based mask refinement highlights vehicle regions, enabling each branch to suppress scale interference from the other branches. (ii) To capture the inconsistent spatial distributions, a spatial-awareness block loss (SBL) based on a region-weighted reward strategy is proposed: the density map is divided into different regions, including sparse, congested and occluded ones, and the loss of each region is computed independently. Extensive experiments conducted on three benchmark datasets, TRANCOS, VisDrone2019 Vehicle and CVCSet, demonstrate that the proposed VCNet outperforms state-of-the-art approaches in vehicle counting. Moreover, the proposed idea is also applicable to crowd counting, producing competitive results on the ShanghaiTech crowd counting dataset.
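The abstract does not include code, so the following is a minimal PyTorch sketch of component (i): a multi-branch head in which each branch uses hybrid dilated convolutions with its own dilation rate, predicts a branch density map plus a sigmoid attention mask, and the masked maps are summed. The layer widths, dilation rates and the sum-based aggregation are illustrative assumptions, not the paper's published configuration.

```python
# Illustrative sketch only: layer sizes, dilation rates and the aggregation
# rule are assumptions, not VCNet's published architecture.
import torch
import torch.nn as nn


class DilatedBranch(nn.Module):
    """One branch: dilated convolutions with a fixed dilation rate."""

    def __init__(self, in_ch: int, dilation: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=dilation, dilation=dilation),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=dilation, dilation=dilation),
            nn.ReLU(inplace=True),
        )
        self.density = nn.Conv2d(64, 1, 1)   # branch-level density map
        self.mask = nn.Sequential(           # attention mask in [0, 1]
            nn.Conv2d(64, 1, 1),
            nn.Sigmoid(),
        )

    def forward(self, feat):
        x = self.body(feat)
        return self.density(x), self.mask(x)


class MultiScaleHead(nn.Module):
    """Aggregates branch density maps weighted by their attention masks."""

    def __init__(self, in_ch: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            [DilatedBranch(in_ch, d) for d in dilations]
        )

    def forward(self, feat):
        refined = []
        for branch in self.branches:
            density, mask = branch(feat)
            refined.append(density * mask)   # suppress off-scale responses
        return torch.stack(refined, dim=0).sum(dim=0)  # fused density map
```

Similarly, a spatial-awareness block loss can be sketched as a grid-wise, region-weighted error: the density map is split into blocks, each block's loss is computed separately, and denser (congested) blocks receive a larger weight. The grid size, the density threshold and the weights below are assumptions for illustration; the paper's region-weighted reward strategy may differ.

```python
# Illustrative sketch of a spatial-awareness block loss. Grid size, threshold
# and weights are assumptions, not the paper's SBL formulation.
import torch
import torch.nn.functional as F


def spatial_block_loss(pred, gt, grid=4, congested_w=2.0, sparse_w=1.0):
    """pred, gt: (N, 1, H, W) density maps with H, W divisible by `grid`."""
    _, _, h, w = gt.shape
    bh, bw = h // grid, w // grid
    total = pred.new_zeros(())
    for i in range(grid):
        for j in range(grid):
            ys = slice(i * bh, (i + 1) * bh)
            xs = slice(j * bw, (j + 1) * bw)
            p_blk, g_blk = pred[:, :, ys, xs], gt[:, :, ys, xs]
            # Region-weighted reward: congested blocks (high ground-truth
            # count) get a larger weight than sparse ones.
            weight = congested_w if g_blk.sum() > 1.0 else sparse_w
            total = total + weight * F.mse_loss(p_blk, g_blk)
    return total / (grid * grid)
```

In practice such a block loss would be added to a standard pixel-wise density loss, so that errors in congested or occluded regions are not averaged away by large sparse areas.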