Crowd counting by using multi-level density-based spatial information: A Multi-scale CNN framework

2020 
Abstract
Crowd counting is an extremely challenging task due to occlusions, scale variations of people's heads, and non-uniform distributions of people. In this paper, we propose a scale-aware convolutional neural network (CNN), named MMNet, to generate density maps for crowd counting. In comparison with most existing scale-aware works, the proposed MMNet not only captures multi-scale features generated by filters of various sizes, but also integrates multi-scale features generated at different stages to handle scale variations of people's heads. Because the crowd density distribution carries critical information about people's head sizes, multi-level density-based spatial information is employed to supervise the fusion of multi-scale features in the proposed network. Specifically, two kinds of effective spatial distribution priors are introduced, using estimated density maps produced at intermediate stages to guide the integration of the two kinds of multi-scale features, respectively. Experimental results on four benchmark datasets and a real-world application demonstrate the effectiveness of the proposed MMNet in comparison with state-of-the-art methods.
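The core architectural idea described above is the extraction of multi-scale features with filters of different sizes, followed by fusion into a single-channel density map whose sum gives the estimated count. The following is a minimal, hypothetical PyTorch sketch of that idea; the module names, kernel sizes, and channel widths are illustrative assumptions and do not reproduce the authors' actual MMNet design or its density-based supervision.

```python
# Illustrative sketch only: parallel convolution branches with different
# kernel sizes capture heads of different apparent sizes; their outputs are
# concatenated and mapped to a density map. Not the authors' MMNet code.
import torch
import torch.nn as nn


class MultiScaleBlock(nn.Module):
    """Parallel 3x3 / 5x5 / 7x7 convolutions, concatenated along channels."""

    def __init__(self, in_channels: int, branch_channels: int = 16):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, branch_channels,
                          kernel_size=k, padding=k // 2),
                nn.ReLU(inplace=True),
            )
            for k in (3, 5, 7)  # illustrative filter sizes
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([branch(x) for branch in self.branches], dim=1)


class DensityHead(nn.Module):
    """Maps fused multi-scale features to a single-channel density map."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.predict = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.predict(features)


if __name__ == "__main__":
    image = torch.randn(1, 3, 384, 512)      # dummy RGB crowd image
    block = MultiScaleBlock(in_channels=3)
    head = DensityHead(in_channels=48)       # 3 branches x 16 channels
    density = head(block(image))
    print(density.shape)                     # torch.Size([1, 1, 384, 512])
    print(density.sum().item())              # estimated count = sum of the map
```

In density-map-based counting, the predicted count is the spatial sum of the density map, which is why the head outputs a single channel at (or upsampled to) the input resolution.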