RT-VENet: A Convolutional Network for Real-time Video Enhancement

2020 
Real-time video enhancement is in great demand due to the extensive usage of live video applications, but existing approaches are far from satisfying the strict requirements of speed and stability. We present a novel convolutional network that performs high-quality enhancement on 1080p videos at 45 FPS on a single CPU, giving it high potential for real-world deployment. The proposed network is built on a lightweight image network and further consolidated for temporal consistency with a temporal feature aggregation (TFA) module. Unlike most image translation networks, which use decoders to generate target images, our network discards decoders and employs only an encoder and a small head. The network predicts color mapping functions instead of pixel values, stored in a grid-like container that fits the CNN structure well and also makes the enhancement scalable to any video resolution. Furthermore, the temporal consistency of the output is enforced by the TFA module, which exploits the learned temporal coherence of semantics across frames. We also demonstrate on benchmark datasets that the mapping representation generalizes to various enhancement tasks, such as relighting, retouching, and dehazing. Our approach achieves state-of-the-art performance and runs about 10 times faster than the current real-time method on high-resolution videos.
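The abstract does not specify the exact layout of the grid-like container of color mapping functions. As a rough illustration of why such a representation is resolution-scalable, the sketch below assumes a coarse spatial grid of per-cell affine color transforms (all names and shapes are hypothetical, not the paper's implementation): the network would only need to predict the small grid, and the same grid can then be applied to a frame of any resolution.

```python
import numpy as np

def apply_color_mapping_grid(frame, grid):
    """Apply a coarse grid of per-cell affine color transforms to a frame.

    frame: (H, W, 3) float array in [0, 1].
    grid:  (gh, gw, 3, 4) array; grid[i, j] holds an affine map [A | b],
           so out_rgb = A @ rgb + b for pixels falling in cell (i, j).
    The small grid is expanded to pixel resolution by nearest-neighbour
    lookup here (a real system would interpolate smoothly), so one grid
    works for any frame size.
    """
    H, W, _ = frame.shape
    gh, gw = grid.shape[:2]
    # Assign every pixel row/column to a grid cell index.
    ys = np.minimum(np.arange(H) * gh // H, gh - 1)
    xs = np.minimum(np.arange(W) * gw // W, gw - 1)
    cell = grid[ys[:, None], xs[None, :]]           # (H, W, 3, 4)
    A, b = cell[..., :3], cell[..., 3]              # (H, W, 3, 3), (H, W, 3)
    # Per-pixel affine transform of the RGB vector.
    out = np.einsum('hwij,hwj->hwi', A, frame) + b
    return np.clip(out, 0.0, 1.0)

# Usage: an identity grid leaves the frame unchanged, regardless of its size.
identity_cell = np.concatenate([np.eye(3), np.zeros((3, 1))], axis=1)  # [I | 0]
grid = np.tile(identity_cell, (4, 4, 1, 1))          # 4x4 grid of identity maps
frame = np.random.rand(8, 10, 3)
restored = apply_color_mapping_grid(frame, grid)
```

Because the predicted object is the mapping rather than the pixels, the heavy network can run on a downsampled input while the cheap per-pixel application step scales to full resolution, which is consistent with the speed claims in the abstract.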