Field-of-view prediction in 360-degree videos with attention-based neural encoder-decoder networks

2019 
In this paper, we propose attention-based neural encoder-decoder networks for predicting user Field-of-View (FoV) in 360-degree videos. Our proposed prediction methods are based on an attention mechanism that learns the weighted predictive power of historical FoV time series through end-to-end training. Because the attention-based encoder-decoder networks involve no recurrence, they can be highly parallelized during training. Using publicly available 360-degree head movement datasets, we demonstrate that our FoV prediction models outperform state-of-the-art FoV prediction models, achieving lower prediction error, higher training throughput, and faster convergence. Better FoV prediction leads to reduced bandwidth consumption, better video quality, and improved user quality of experience.
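As a rough illustration of the attention idea described above (a minimal sketch, not the authors' actual architecture), a single scaled dot-product attention step can weight historical FoV vectors by their relevance to the current viewing state and combine them into a prediction. The function names, feature dimensions, and the use of raw FoV vectors as both keys and values are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_predict(history, query):
    """Predict the next FoV as an attention-weighted sum of past FoVs.

    history : (T, d) array of past FoV feature vectors
              (e.g. head-orientation angles per time step) -- assumed encoding
    query   : (d,) representation of the current viewing state -- assumed
    """
    d = history.shape[1]
    scores = history @ query / np.sqrt(d)   # (T,) similarity of each past step to the query
    weights = softmax(scores)               # attention weights over the history
    return weights @ history                # (d,) convex combination of past FoV vectors
```

Because each time step's attention weight is computed from a dot product rather than a recurrent hidden state, all steps can be evaluated in parallel, which is the source of the training-throughput advantage the abstract mentions.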