Field-of-view prediction in 360-degree videos with attention-based neural encoder-decoder networks

Jiang Yu, Yong Liu

2019

In this paper, we propose attention-based neural encoder-decoder networks for predicting the user's Field-of-View (FoV) in 360-degree videos. Our prediction methods use an attention mechanism that learns, through end-to-end training, how much predictive weight to give each point in the historical FoV time series. Attention-based neural encoder-decoder networks do not involve recursion and can therefore be highly parallelized during training. Using publicly available 360-degree head movement datasets, we demonstrate that our FoV prediction models outperform state-of-the-art FoV prediction models, achieving lower prediction error, higher training throughput, and faster convergence. Better FoV prediction leads to reduced bandwidth consumption, better video quality, and improved user quality of experience.
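To illustrate the idea of weighting historical FoV samples by attention, here is a minimal sketch in plain Python. It is not the authors' model: it assumes a scaled dot-product score, uses the raw samples as both keys and values with no learned projections, and the names `softmax`, `attention_predict`, and the sample `history` values are hypothetical. Each score is computed independently of the others, which is the property that lets attention be parallelized rather than evaluated step by step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_predict(history, query):
    """Return an attention-weighted average of past FoV samples.

    history: list of (yaw, pitch) samples, assumed normalized to a small range.
    query:   the vector the history is compared against (here, the latest sample).
    Assumption: keys and values are the raw samples; a trained model would
    use learned projections instead.
    """
    d = len(query)
    # One scaled dot-product score per past sample, each independent of the rest.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in history
    ]
    weights = softmax(scores)
    # Weighted sum of the past samples, one output component per dimension.
    prediction = [
        sum(w * sample[j] for w, sample in zip(weights, history))
        for j in range(d)
    ]
    return prediction, weights


# Hypothetical normalized (yaw, pitch) trajectory, oldest sample first.
history = [(0.10, 0.00), (0.15, 0.02), (0.22, 0.03), (0.30, 0.05)]
prediction, weights = attention_predict(history, query=history[-1])
```

The weights sum to one, so the output is a convex combination of the past samples. In a trained encoder-decoder the weights would come from parameters learned end to end rather than from raw similarity.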

Keywords:

Computer vision
Field of view
Encoder
Artificial intelligence
Computer science
Quality of experience
Convergence (routing)
Video quality
Artificial neural network
Machine learning
Throughput
Recursion
Bandwidth (signal processing)
