Pyramid-dilated deep convolutional neural network for crowd counting

2021 
Statistics on crowds in crowded scenes can reflect the density level of crowds and provide safety warnings. This is a laborious task if conducted manually. In recent years, automated crowd counting has received extensive attention in the computer vision field. However, this task is still challenging mainly due to the serious occlusion in crowds and large appearance variations caused by the viewing angles of cameras. To overcome these difficulties, a pyramid-dilated deep convolutional neural network for accurate crowd counting called PDD-CNN is proposed. PDD-CNN is based on a VGG-16 network that is designed to generate dense attribute feature maps from an image with an arbitrary size or resolution. Then, two pyramid dilated modules are adopted, each consisting of four parallel dilated convolutional layers with different rates and a parallel average pooling layer to capture the multiscale features. Finally, three cascading dilated convolutions are used to regress the density map and perform accurate count estimation. In addition, a novel training loss, combining the Euclidean loss with the structural similarity loss, is employed to attenuate the blurry effects of density map estimation. The experimental results on three datasets (ShanghaiTech, UCF_CC_50, and UCF-QNRF) demonstrate that the proposed PDD-CNN produces high-quality density maps and achieves a good counting performance.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    40
    References
    0
    Citations
    NaN
    KQI
    []