Graph-Based Knowledge Driven Approach for Violence Detection
Abstract:
Automatically identifying violence in videos is critical, and combining visual and audio cues, which provide complementary information, is often the most effective approach for violence detection. However, existing research on fusing these cues is computationally demanding and limited. To address this issue, we propose a novel fused vision-based graph neural network (FV-GNN) for violence detection using audiovisual information. This approach combines local and global features from both audio and video, leveraging a residual learning strategy to extract the most informative cues. Furthermore, FV-GNN uses dynamic graph filtering to analyze the inherent relationships between audio and video samples, enhancing violence recognition. The network consists of three branches: integrated, specialized, and scoring. The integrated branch captures long-range dependencies based on similarity, while the specialized branch focuses on local positional relationships. Finally, the scoring branch assesses the predicted violence likelihood against the ground truth. We extensively explored the use of graphs for modeling temporal context in videos and found FV-GNN to be particularly well suited for real-time violence detection. Our experiments demonstrate that FV-GNN outperforms current state-of-the-art methods on the XD-Violence dataset.

Keywords: Similarity (geometry), Robustness
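The abstract does not spell out the architecture, but the three-branch idea can be pictured with a minimal sketch. Everything below (module names, feature dimensions, the softmax similarity adjacency, the temporal convolution used for the specialized branch) is an assumption for illustration, not the paper's actual FV-GNN implementation.

```python
# Minimal sketch of the fused audiovisual graph idea described in the abstract.
# All names, layer sizes, and the fusion/scoring details are assumptions; the
# paper's actual FV-GNN architecture may differ substantially.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusedAVGraphSketch(nn.Module):
    def __init__(self, audio_dim=128, video_dim=1024, hidden=256):
        super().__init__()
        self.fuse = nn.Linear(audio_dim + video_dim, hidden)   # audio-video fusion
        self.global_proj = nn.Linear(hidden, hidden)            # "integrated" branch
        self.local_conv = nn.Conv1d(hidden, hidden, kernel_size=3, padding=1)  # "specialized" branch
        self.score = nn.Linear(hidden, 1)                       # "scoring" branch

    def forward(self, audio, video):
        # audio: (B, T, audio_dim), video: (B, T, video_dim) snippet-level features
        x = F.relu(self.fuse(torch.cat([audio, video], dim=-1)))         # (B, T, H)

        # Integrated branch: a similarity-driven graph over snippets captures
        # long-range dependencies (adjacency from pairwise feature similarity).
        adj = torch.softmax(torch.matmul(x, x.transpose(1, 2)), dim=-1)  # (B, T, T)
        g_global = torch.matmul(adj, self.global_proj(x))

        # Specialized branch: local positional relations via a temporal convolution.
        g_local = self.local_conv(x.transpose(1, 2)).transpose(1, 2)

        # Residual-style combination, then per-snippet violence scores.
        h = x + g_global + g_local
        return torch.sigmoid(self.score(h)).squeeze(-1)                  # (B, T)
```

In this reading, the integrated branch mixes every snippet with every other snippet according to feature similarity, while the specialized branch only mixes neighbouring snippets, mirroring the long-range versus local split described above.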
Related Papers:
In this article, we take one step toward understanding the learning behavior of deep residual networks and supporting the observation that deep residual networks behave like ensembles. We propose a new convolutional neural network architecture that builds upon the success of residual networks by explicitly exploiting the interpretation of very deep networks as an ensemble. The proposed multi-residual network increases the number of residual functions in the residual blocks. Our architecture generates models that are wider, rather than deeper, which significantly improves accuracy. We show that our model achieves error rates of 3.73% and 19.45% on CIFAR-10 and CIFAR-100, respectively, outperforming almost all existing models. We also demonstrate that our model outperforms very deep residual networks by 0.22% (top-1 error) on the full ImageNet 2012 classification dataset. Additionally, inspired by the parallel structure of multi-residual networks, we investigate a model-parallelism technique that distributes the computation of the residual blocks among processors, yielding up to a 15% improvement in computational complexity.
Residual neural network · Citations (36)
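The block-level change is described concretely (more residual functions per block, widening rather than deepening), so a minimal sketch may help. The conv-BN-ReLU form of each residual function and the default k=2 are assumptions for illustration.

```python
# Sketch of a multi-residual block: k parallel residual functions per block,
# summed with the identity shortcut. The conv-BN-ReLU pairs and k=2 are
# assumptions, not the paper's exact configuration.
import torch.nn as nn
import torch.nn.functional as F

class MultiResidualBlock(nn.Module):
    def __init__(self, channels, k=2):
        super().__init__()
        def residual_fn():
            return nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1, bias=False),
                nn.BatchNorm2d(channels),
            )
        # k residual functions in one block widen the model instead of deepening it.
        self.branches = nn.ModuleList([residual_fn() for _ in range(k)])

    def forward(self, x):
        # y = x + f_1(x) + ... + f_k(x)
        return F.relu(x + sum(branch(x) for branch in self.branches))
```

Keeping the shortcut as a plain identity means the block degenerates to a standard residual block when k = 1, which is what makes the widening a drop-in change.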
In this article, we take one step toward understanding the learning behavior of deep residual networks and supporting the hypothesis that deep residual networks are exponential ensembles by construction. We examine the effective range of ensembles by introducing multi-residual networks, which significantly improve the classification accuracy of residual networks. Multi-residual networks increase the number of residual functions in the residual blocks, which is shown to improve accuracy when the network is deeper than a threshold. In a series of empirical studies on the CIFAR-10 and CIFAR-100 datasets, the proposed multi-residual networks yield 6% and 10% improvements, respectively, over residual networks with identity mappings. Compared with other state-of-the-art models, the proposed multi-residual network obtains a test error rate of 3.92% on CIFAR-10, outperforming all existing models.
Residual neural network · Citations (17)
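The "exponential ensembles by construction" claim can be made concrete with a short path-counting argument; the derivation below is an illustration under the usual unrolling view, not an equation quoted from the paper.

```latex
% Illustrative path counting behind the "exponential ensemble" view
% (my own unrolling argument, not text from the paper).
% A standard residual block computes y = x + f(x), i.e. two branches per
% block, so a network of n blocks unrolls into
\[
  N_{\text{paths}} = 2^{\,n}
\]
% implicit paths. A multi-residual block computes
% y = x + f_1(x) + \cdots + f_k(x), i.e. k + 1 branches per block, so the
% same depth yields
\[
  N_{\text{paths}} = (k+1)^{\,n}
\]
% ensemble members: widening the block enlarges the implicit ensemble
% without adding depth.
```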
Graph autoencoders are efficient at embedding graph-based data sets. Most graph autoencoder architectures are shallow, which limits their ability to capture meaningful relations between nodes separated by multiple hops. In this paper, we propose the Residual Variational Graph Autoencoder (ResVGAE), a deep variational graph autoencoder with multiple residual modules. We show that our residual modules, each a convolutional layer with a residual connection, improve the average precision of graph autoencoders. Experimental results suggest that the proposed model with residual modules outperforms its counterpart without them and achieves results comparable to other state-of-the-art methods.
Autoencoder · Citations (0)
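A rough sketch of the kind of residual graph-convolution module the abstract describes: stacking such modules lets the encoder reach nodes several hops away without the usual degradation from depth. The propagation rule, the equal input/output dimensions, and the omission of the variational encoder heads are all assumptions.

```python
# Minimal sketch of a residual graph-convolution module of the kind ResVGAE
# stacks to go deeper. The rule H' = ReLU(A_hat @ H @ W) + H and the absence
# of the variational (mean/log-variance) heads are simplifying assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualGCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.weight = nn.Linear(dim, dim, bias=False)

    def forward(self, h, adj_norm):
        # h: (N, dim) node embeddings, adj_norm: (N, N) normalized adjacency.
        out = F.relu(adj_norm @ self.weight(h))
        return out + h  # residual connection lets gradients skip the layer

class DeepResidualGraphEncoder(nn.Module):
    def __init__(self, dim, depth=8):
        super().__init__()
        self.layers = nn.ModuleList(ResidualGCNLayer(dim) for _ in range(depth))

    def forward(self, h, adj_norm):
        for layer in self.layers:   # deeper stacks aggregate multi-hop neighbours
            h = layer(h, adj_norm)
        return h
```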
A family of residual networks with hundreds or even thousands of layers dominates major image recognition tasks, but building a network by simply stacking residual blocks inevitably limits its optimization ability. This paper proposes a novel residual-network architecture, Residual networks of Residual networks (RoR), to further exploit the optimization ability of residual networks. RoR substitutes optimizing residual mappings of residual mappings for optimizing the original residual mappings. In particular, RoR adds level-wise shortcut connections on top of the original residual networks to promote their learning capability. More importantly, RoR can be applied to various kinds of residual networks (ResNets, Pre-ResNets, and WRN) and significantly boosts their performance. Our experiments demonstrate the effectiveness and versatility of RoR, which achieves the best performance among all residual-network-like structures. Our RoR-3-WRN58-4+SD models achieve new state-of-the-art results on CIFAR-10, CIFAR-100, and SVHN, with test errors of 3.77%, 19.73%, and 1.59%, respectively. RoR-3 models also achieve state-of-the-art results compared with ResNets on the ImageNet data set.
Residual neural network · Citations (316)
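The level-wise shortcut idea is easy to picture in code: an extra shortcut wrapped around a whole group of standard residual blocks, so the group itself learns a residual of residuals. The block count and the plain identity level shortcut are assumptions; RoR as published also covers Pre-ResNet and WRN variants not shown here.

```python
# Sketch of the RoR idea: an additional level-wise shortcut around a group of
# ordinary residual blocks. Block counts and the identity-style level shortcut
# are assumptions for illustration.
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)            # inner (block-level) shortcut

class RoRGroup(nn.Module):
    """A group of residual blocks wrapped by an additional level-wise shortcut."""
    def __init__(self, channels, num_blocks=4):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(num_blocks)])

    def forward(self, x):
        return x + self.blocks(x)          # outer (level-wise) shortcut
```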
Residual Neural Networks [1] won first place in all five main tracks of the ImageNet and COCO 2015 competitions. This kind of network involves the creation of pluggable modules whose output contains a residual from the input; in that paper, the shortcut is the identity function. We propose instead to include residuals from all lower layers, suitably normalized, to create the residual, so that all previous layers contribute equally to the output of a layer. We show that our approach improves on [1] for the CIFAR-10 dataset.
Deep Neural Networks · Citations (0)
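One way to read "residuals from all lower layers, suitably normalized" is to form each shortcut as the mean of every previous layer's output, so all earlier layers contribute equally. The sketch below uses that reading with plain linear layers; the actual normalization and layer types in the paper may differ.

```python
# Sketch of a shortcut built from all lower layers: the term added at each
# layer is the mean (one possible "suitable normalization") of every previous
# layer's output, so all earlier layers contribute equally.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AllLowerResidualNet(nn.Module):
    def __init__(self, dim, depth=6):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(depth))

    def forward(self, x):
        history = [x]                                   # outputs of all lower layers
        for layer in self.layers:
            shortcut = torch.stack(history).mean(dim=0) # equal contribution per layer
            out = F.relu(layer(history[-1])) + shortcut
            history.append(out)
        return history[-1]
```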
Single-image super-resolution (SISR), which aims to obtain a high-resolution image from a single low-resolution image, is a classical problem in computer vision. In this paper, we address this problem with a deep learning method based on residual learning in an end-to-end manner. We propose a novel residual-network architecture, Residual networks of Residual networks (RoR), to promote the learning capability of residual networks for SISR. In a residual network with identity mappings as the skip connections, the signal can be propagated directly from one unit to any other unit in both the forward and backward passes. Building on this, we add level-wise connections on top of the original residual networks to further exploit their optimization ability. Our experiments demonstrate the effectiveness and versatility of RoR: it converges faster and attains higher super-resolution accuracy at considerably increased depth.
Citations (1)
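For the SISR setting, end-to-end residual learning is commonly realized as a global residual: the network predicts only the high-frequency detail that is added back to an interpolated input. Whether the paper uses exactly this formulation is an assumption, and the plain convolutional trunk below is only a stand-in for its RoR groups.

```python
# Sketch of global residual learning for SISR: predict the residual between an
# interpolated low-resolution input and the high-resolution target. The trunk
# here is a generic stack of conv layers, a placeholder for RoR groups.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualSISRSketch(nn.Module):
    def __init__(self, channels=64, depth=8):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.trunk = nn.Sequential(
            *[nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                            nn.ReLU(inplace=True))
              for _ in range(depth)]
        )
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, lr, scale=2):
        # Upsample first, then predict only the missing high-frequency residual.
        up = F.interpolate(lr, scale_factor=scale, mode='bicubic', align_corners=False)
        return up + self.tail(self.trunk(self.head(up)))  # global residual learning
```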