Spatiotemporal consistency-enhanced network for video anomaly detection

2022 
Abstract: Video anomaly detection aims to detect abnormal segments in a video sequence and is a key problem in video surveillance. Building on deep prediction methods, we propose a spatiotemporal consistency-enhanced network that generates spatiotemporally consistent predictions. A 3D CNN-based encoder and a 2D CNN-based decoder form the main part of the model. A resampling strategy is applied to the latent space vector while the model is trained on normal data; as a result, the model can perform poorly when the input includes abnormal data. Moreover, we combine the input clip with a generated frame into a reformed video clip, which is fed into a 3D CNN-based discriminator that evaluates the consistency of the input clip. Owing to the adversarial training between the generator and the discriminator, the spatiotemporal consistency of the generated results is enhanced. During testing, abnormal data produce different appearance and motion changes, which impair the model's ability to predict spatiotemporally consistent future frames. The gap in prediction quality between normal and anomalous content is then used to infer whether an anomaly occurs. Extensive experiments confirm that the proposed method achieves state-of-the-art performance on three benchmark datasets: ShanghaiTech, CUHK Avenue, and UCSD Ped2.
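The abstract does not say how the prediction quality gap is measured. As a minimal sketch, assuming the common practice in prediction-based anomaly detection of scoring each predicted frame by PSNR against the real frame and min-max normalizing the scores within a video, the test-time decision could look like the following. The function names, the flat-list frame representation, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
import math

def psnr(pred, target, max_val=1.0):
    """PSNR (dB) between two frames given as equal-length flat lists of pixel values."""
    if len(pred) != len(target) or not pred:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    mse = max(mse, 1e-10)  # avoid division by zero for a perfect prediction
    return 10.0 * math.log10(max_val ** 2 / mse)

def regularity_scores(psnrs):
    """Min-max normalize per-frame PSNR over one video; low score = poorly predicted."""
    lo, hi = min(psnrs), max(psnrs)
    if hi == lo:
        return [1.0] * len(psnrs)  # no quality gap in this video
    return [(v - lo) / (hi - lo) for v in psnrs]

def flag_anomalies(scores, threshold=0.5):
    """Mark frames whose regularity score falls below the (assumed) threshold."""
    return [s < threshold for s in scores]
```

A frame that the generator predicts well receives a high PSNR and a score near 1, while a frame containing unexpected appearance or motion is predicted badly and scores near 0. In practice the threshold would be tuned on held-out data, or avoided altogether by reporting frame-level AUC.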