Constant Velocity 3D Convolution
2018
We propose a novel 3-D convolution method, cv3dconv, for extracting spatiotemporal features from videos. It reduces the number of sum-of-products operations in 3-D convolution by several orders of magnitude by assuming that features move with constant velocity. We observed that a specific class of video sequences, such as video captured by an in-vehicle camera, can be well approximated by piecewise-linear motion of 2-D features along the temporal dimension. Our principal finding is that a 3-D kernel represented under constant velocity can be decomposed into the convolution of a 2-D spatial kernel and a 3-D velocity kernel parameterized by only two parameters. We derived an efficient recursive algorithm for this class of 3-D convolution, which is exceptionally well suited to sparse spatiotemporal data, and this parameterized decomposed representation imposes a structured regularization along the temporal direction. We experimentally verified the validity of our approximation using a controlled dataset, and we also demonstrated the effectiveness of cv3dconv by adopting it in deep neural networks (DNNs) for a visual odometry estimation task on a publicly available event-based camera dataset captured in urban road scenes. Our DNN architecture improves estimation accuracy by about 30% compared with an existing state-of-the-art architecture designed for event data.
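The decomposition claimed in the abstract can be illustrated with a small sketch. The construction below is an assumption for illustration only (integer velocities, circular shifts, NumPy FFT-based circular convolution); the paper's actual recursive algorithm and kernel parameterization are not reproduced here. It shows that a 3-D kernel whose 2-D slice translates by a constant velocity (vx, vy) per frame equals the convolution of the 2-D kernel (placed at t = 0) with a sparse velocity kernel that has a single impulse per frame and is fully specified by the two velocity parameters:

```python
import numpy as np

def cv3d_kernel(k2d, vx, vy, T):
    """Build a T-frame 3-D kernel whose 2-D slice translates by
    (vy, vx) pixels per frame (integer velocities, circular shifts)."""
    return np.stack([np.roll(k2d, (vy * t, vx * t), axis=(0, 1))
                     for t in range(T)])

def velocity_kernel(vx, vy, T, H, W):
    """Sparse 3-D kernel with one impulse per frame at (vy*t, vx*t);
    parameterized entirely by the two velocities vx and vy."""
    V = np.zeros((T, H, W))
    for t in range(T):
        V[t, (vy * t) % H, (vx * t) % W] = 1.0
    return V

# Verify the decomposition in the Fourier domain: circularly convolving
# the 2-D kernel (at t = 0) with the velocity kernel reproduces the
# constant-velocity 3-D kernel.
rng = np.random.default_rng(0)
T, H, W, vx, vy = 4, 8, 8, 1, 2
k2d = rng.standard_normal((H, W))
A = np.zeros((T, H, W))
A[0] = k2d
K = np.fft.ifftn(np.fft.fftn(A) *
                 np.fft.fftn(velocity_kernel(vx, vy, T, H, W))).real
assert np.allclose(K, cv3d_kernel(k2d, vx, vy, T))
```

Because the velocity kernel has a single nonzero entry per frame, convolving with it costs one shifted accumulation per time step, which hints at why the full 3-D convolution can be reduced to a cheap recursion over frames.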