Offline and Online Optical Flow Enhancement for Deep Video Compression
Abstract:
Video compression relies heavily on exploiting the temporal redundancy between video frames, which is usually achieved by estimating and using motion information. In most existing deep video compression networks, the motion information is represented as optical flows, and these networks often adopt pre-trained optical flow estimation networks for motion estimation. The optical flows, however, may be less suitable for video compression for two reasons. First, the optical flow estimation networks were trained to perform inter-frame prediction as accurately as possible, but the optical flows themselves may cost too many bits to encode. Second, the optical flow estimation networks were trained on synthetic data and may not generalize well to real-world videos. We address these two limitations by enhancing the optical flows in two stages: offline and online. In the offline stage, we fine-tune a trained optical flow estimation network with the motion information provided by a traditional (non-deep) video compression scheme, e.g. H.266/VVC, as we believe the motion information of H.266/VVC achieves a better rate-distortion trade-off. In the online stage, we further optimize the latent features of the optical flows with a gradient descent-based algorithm for the video to be compressed, so as to enhance the adaptivity of the optical flows. We conduct experiments on two state-of-the-art deep video compression schemes, DCVC and DCVC-DC. Experimental results demonstrate that the proposed offline and online enhancement together achieves an average bitrate saving of 13.4% for DCVC and 4.1% for DCVC-DC on the tested videos, without increasing the model or computational complexity at the decoder side.
Keywords: Optical Flow, Online video
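The online stage described above amounts to treating the encoded flow representation as a free variable and optimizing it per video. The sketch below illustrates the idea in PyTorch; the codec interfaces (codec.flow_decoder, codec.warp, codec.flow_entropy_model) and the hyper-parameters are hypothetical placeholders, not the actual DCVC/DCVC-DC code.

```python
import torch

# Hypothetical sketch of online latent refinement: the flow latent y is
# refined by gradient descent against an encoder-side rate-distortion loss.
# All codec.* modules are placeholders assumed for illustration.
def refine_flow_latent(y, x_cur, x_ref, codec, lam=256.0, steps=10, lr=1e-3):
    y = y.detach().clone().requires_grad_(True)     # latent features of the flow
    opt = torch.optim.Adam([y], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        y_hat = y + (torch.round(y) - y).detach()   # straight-through quantization
        flow_hat = codec.flow_decoder(y_hat)        # decoded motion field
        x_pred = codec.warp(x_ref, flow_hat)        # motion-compensated prediction
        rate = codec.flow_entropy_model(y_hat)      # estimated bits for the flow
        dist = torch.mean((x_cur - x_pred) ** 2)    # prediction distortion
        (rate + lam * dist).backward()              # rate-distortion objective
        opt.step()
    return y.detach()
```

Only the encoder runs this loop, which is consistent with the claim that decoder-side complexity is unchanged.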
This paper addresses the problem of optical expansion (OE). OE describes the object scale change between two frames and is widely used in monocular 3D vision tasks. Previous methods estimate optical expansion mainly from optical flow results, but this two-stage architecture leaves their results limited by the accuracy of the optical flow and less robust. To solve these problems, we propose the concept of 3D optical flow, which integrates optical expansion into 2D optical flow and is implemented by a plug-and-play module named TPCV. TPCV matches features at the correct location and scale, thus allowing the simultaneous optimization of the optical flow and optical expansion tasks. Experimentally, we apply TPCV to the RAFT optical flow baseline. Experimental results show that the baseline optical flow performance is substantially improved. Moreover, we apply the optical flow and optical expansion results to various dynamic 3D vision tasks, including motion-in-depth, time-to-collision, and scene flow, often achieving significant improvement over the prior state of the art. Code is available at https://github.com/HanLingsgjk/TPCV.
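As a conceptual illustration of matching at both location and scale, the sketch below builds a RAFT-style all-pairs correlation volume over several candidate scales of the target feature map; it conveys the idea only and is not the TPCV implementation, whose details are in the linked repository.

```python
import torch
import torch.nn.functional as F

# Illustrative scale-aware correlation: score candidate matches against
# rescaled versions of the target features, so per-pixel scale change
# (optical expansion) can be estimated jointly with 2D flow.
def scale_aware_correlation(f1, f2, scales=(0.75, 1.0, 1.25)):
    b, c, h, w = f1.shape                       # (B, C, H, W) feature maps
    volumes = []
    for s in scales:
        f2s = F.interpolate(f2, scale_factor=s, mode="bilinear",
                            align_corners=False)
        f2s = F.interpolate(f2s, size=(h, w), mode="bilinear",
                            align_corners=False)    # resample to common grid
        corr = torch.einsum("bchw,bcuv->bhwuv", f1, f2s) / c ** 0.5
        volumes.append(corr)
    return torch.stack(volumes, dim=1)          # (B, S, H, W, H, W)
```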
This paper presents a new method for obstacle detection using optical flow. The method employs a highly efficient and accurate adaptive motion detection algorithm to determine the regions of the image that are most likely to contain obstacles; optical flow is then computed only on these regions. We call this method targeted optical flow. Targeted optical flow runs significantly faster than regular optical flow. We employ two types of optical flow to demonstrate the performance and speed increase of the proposed system. Finally, k-means clustering is employed for obstacle reconstruction. The system is designed for color videos for better performance. Several benchmark and recorded sequences have been used to test the system.
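A minimal sketch of the targeted-flow idea using standard OpenCV building blocks follows; the background subtractor, thresholds, and Farneback parameters are illustrative assumptions, not the paper's exact algorithm.

```python
import cv2
import numpy as np

# Detect moving regions first, then run dense optical flow only inside them.
bg = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)

def targeted_flow(prev_gray, cur_gray, cur_frame, min_area=400):
    mask = bg.apply(cur_frame)                      # adaptive motion detection
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    flows = []
    for cnt in contours:
        if cv2.contourArea(cnt) < min_area:         # ignore small noise blobs
            continue
        x, y, w, h = cv2.boundingRect(cnt)          # candidate obstacle region
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray[y:y+h, x:x+w], cur_gray[y:y+h, x:x+w],
            None, 0.5, 3, 15, 3, 5, 1.2, 0)         # flow only inside the ROI
        flows.append(((x, y, w, h), flow))
    return flows
```

The per-region flow vectors could then be clustered (e.g. with k-means) to reconstruct individual obstacles, as the abstract describes.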
Motion representation has become a critical factor as more and more trimmed-video action recognition tasks rely on machine learning. In this paper, we propose a new motion representation inspired by optical flow algorithms, which have proved effective and efficient for video action recognition. Our motion flow is a modality distinct from RGB, RGB difference, and the widely used optical flow; although the methodology is derived from optical flow, it is faster and more accurate than optical flow algorithms. Furthermore, we adopt densely connected convolutional networks (DenseNet) and use the motion flow as the input to this framework. In our experimental evaluations, when the proposed motion representation is plugged into the DenseNet framework, the accuracy on UCF-101 and HMDB-51 reaches 96% and 74.2% respectively, which shows that the proposed methodology is satisfactory while being 15 times faster in speed.
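To make the plug-into-DenseNet step concrete, the sketch below swaps the RGB stem of a torchvision DenseNet for one that accepts a stacked motion input; the 10-pair stack, 2 channels per pair, and 101 classes (UCF-101) are assumptions for the example rather than the paper's configuration.

```python
import torch
import torchvision

# Feed a flow-like motion stack (dx, dy per frame pair) into DenseNet-121.
num_motion_channels = 2 * 10      # 10 consecutive frame pairs
model = torchvision.models.densenet121(num_classes=101)
# Replace the 3-channel RGB stem so the network accepts the motion stack.
model.features.conv0 = torch.nn.Conv2d(
    num_motion_channels, 64, kernel_size=7, stride=2, padding=3, bias=False)

motion_stack = torch.randn(4, num_motion_channels, 224, 224)  # dummy batch
logits = model(motion_stack)      # (4, 101) class scores
```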
Real-time optical flow detection at 1000 fps was realized by implementing an improved optical flow detection algorithm as hardware logic on a high-speed vision platform. The improved gradient-based algorithm, which is based on the Lucas-Kanade algorithm, can adaptively select a pseudo-variable frame rate according to the amplitude of the optical flow, so that accurate optical flow can be estimated for objects moving at both high and low speeds in the same scene. The high-speed vision platform on which the algorithm is implemented can calculate optical flow at 1000 fps for images of 1024 x 1024 pixels. The performance of the developed optical flow detection algorithm and system was verified in real scenarios such as rapid human motion.
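The pseudo-variable frame rate can be pictured as choosing, per pixel, how many captured frames to skip before the next gradient-based estimate: slow motion tolerates a longer baseline, while fast motion needs a short one to keep displacements small. A conceptual sketch with illustrative thresholds follows; the actual system implements this selection in hardware logic.

```python
import numpy as np

# Pick a per-pixel frame interval so that the expected displacement between
# the reference and current frame stays near one pixel. Purely conceptual.
def select_frame_interval(flow_mag, max_interval=8):
    # flow_mag: per-pixel flow amplitude from the previous estimate (px/frame)
    interval = np.clip((1.0 / np.maximum(flow_mag, 1e-3)).astype(int),
                       1, max_interval)
    return interval       # number of 1000 fps frames to skip, per pixel
```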
Inaccurate optical flow estimates in and near occluded regions and in out-of-boundary regions are two of the significant limitations of current optical flow estimation algorithms. Recent state-of-the-art optical flow estimation algorithms are two-frame methods, where optical flow is estimated sequentially for each consecutive image pair in a sequence. While this approach gives good flow estimates, it fails to generalize optical flow in occluded regions, mainly due to limited local evidence regarding moving elements in a scene. In this work, we propose a learning-based multi-frame optical flow estimation method that estimates two or more consecutive optical flows in parallel from multi-frame image sequences. Our underlying hypothesis is that by understanding temporal scene dynamics from longer sequences with more than two frames, we can characterize pixel-wise dependencies in a larger spatiotemporal domain, generalize complex motion patterns, and thereby improve the accuracy of optical flow estimates in occluded regions. We present learning-based spatiotemporal recurrent transformers for multi-frame optical flow estimation (SSTMs). Our method utilizes 3D Convolutional Gated Recurrent Units (3D-ConvGRUs) and spatiotemporal transformers to learn recurrent space-time motion dynamics and global dependencies in the scene and to provide generalized optical flow estimation. When compared with recent state-of-the-art two-frame and multi-frame methods on real-world and synthetic datasets, the performance of SSTMs was significantly higher in occluded and out-of-boundary regions. Among all published state-of-the-art multi-frame methods, SSTM achieved state-of-the-art results on the Sintel Final and KITTI2015 benchmark datasets.
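For reference, a minimal 3D convolutional GRU cell of the kind the abstract names is sketched below; the gating follows the standard ConvGRU formulation, and the layer sizes are assumptions rather than the SSTM configuration.

```python
import torch
import torch.nn as nn

# Minimal 3D-ConvGRU cell for recurrent space-time feature modeling.
class ConvGRU3D(nn.Module):
    def __init__(self, channels, kernel=3):
        super().__init__()
        p = kernel // 2
        self.zr = nn.Conv3d(2 * channels, 2 * channels, kernel, padding=p)
        self.cand = nn.Conv3d(2 * channels, channels, kernel, padding=p)

    def forward(self, x, h):
        # x, h: (B, C, T, H, W) input features and hidden state
        zr = torch.sigmoid(self.zr(torch.cat([x, h], dim=1)))
        z, r = zr.chunk(2, dim=1)                       # update / reset gates
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde                # new hidden state
```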
A novel video smoke recognition method based on optical flow is presented. The optical flow result is taken as an approximation of the motion field. The method proceeds as follows: first, moving pixels and regions in the video are determined by a background estimation method. Then, a pyramidal implementation of the Lucas-Kanade feature tracker is used to calculate the optical flow of the regions determined in the first step. The mean and variance of the corner points' optical velocities, which we call optical flow features, are calculated and used to differentiate smoke from other moving objects. Finally, examples consisting of features extracted from sequences of offline videos are collected for training a discriminating model, for which a back-propagation neural network prototype is introduced. Experiments show that the algorithm significantly improves the accuracy of fire smoke detection and reduces false alarms.
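The feature-extraction step maps naturally onto OpenCV's pyramidal Lucas-Kanade tracker; a sketch follows, with the corner-detector parameters as illustrative assumptions. The four returned statistics would feed the back-propagation classifier.

```python
import cv2
import numpy as np

# Track corners with pyramidal Lucas-Kanade and summarize their velocities
# with mean and variance, as the described optical flow features.
def smoke_flow_features(prev_gray, cur_gray, region_mask=None):
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=7,
                                  mask=region_mask)
    if pts is None:
        return np.zeros(4)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, pts, None)
    good = status.ravel() == 1
    if not good.any():
        return np.zeros(4)
    vel = (nxt[good] - pts[good]).reshape(-1, 2)        # per-corner velocity
    return np.concatenate([vel.mean(axis=0), vel.var(axis=0)])
```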
In this paper, we develop a high-frame-rate (HFR) vision system that can estimate optical flow in real time at 1000 f/s for 1024×1024 pixel images via the hardware implementation of an improved optical flow detection algorithm on a high-speed vision platform. Based on the Lucas-Kanade method, we adopt an improved gradient-based algorithm that can adaptively select a pseudo-variable frame rate according to the amplitude of the estimated optical flow, so as to accurately detect the optical flow of objects moving at high and low speeds in the same image. The performance of our developed HFR optical flow system was verified through experimental results for high-speed movements such as a top's spinning motion and a human's pitching motion.
Group action recognition in soccer videos is a challenging problem due to the difficulties of group action representation and camera motion estimation. This paper presents a novel approach for recognizing group actions with a moving camera. In our approach, ego-motion is estimated from Kanade-Lucas-Tomasi feature sets on successive frames, and optical flow is then computed on the compensated frames. Because the ego-motion estimation is inaccurate, this optical flow does not reflect the accurate motion of objects. We therefore propose a new motion descriptor that treats the optical flow as spatial patterns and extracts accurate global motion from the noisy optical flow. A latent-dynamic conditional random field model is employed to recognize group actions. Experimental results show that our approach is promising.
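A rough sketch of the compensate-then-flow pipeline is given below: ego-motion is fitted to KLT correspondences, the previous frame is warped to cancel camera motion, and residual optical flow is computed on the stabilized pair. The affine motion model and all parameters are illustrative choices, not the paper's exact procedure.

```python
import cv2

# Estimate ego-motion from KLT tracks, compensate it, then compute flow.
def compensated_flow(prev_gray, cur_gray):
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                  qualityLevel=0.01, minDistance=10)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, pts, None)
    good = status.ravel() == 1
    # robustly fit an affine camera-motion model to the correspondences
    A, _ = cv2.estimateAffinePartial2D(pts[good], nxt[good],
                                       method=cv2.RANSAC)
    h, w = prev_gray.shape
    stabilized = cv2.warpAffine(prev_gray, A, (w, h))   # camera motion removed
    return cv2.calcOpticalFlowFarneback(stabilized, cur_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
```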