Fast Monocular Depth Estimation on an FPGA

2020 
Depth sensing is crucial for understanding 3D scenes on embedded systems such as home robots, self-driving cars, and drones. Monocular depth estimation, which produces pixel-wise depth from a single general-purpose camera, has attracted attention in recent years due to its reliability, low cost, and small footprint. Prior research based on Convolutional Neural Networks (CNNs) has achieved high accuracy and drawn increasing interest. However, a CNN requires a massive number of MAC (multiply-accumulate) operations and weights, so its latency is extremely long. To address this problem, we present hardware-oriented pruning for separable convolutions and an efficiently parallelized MAC unit. We introduce a filter-wise pruned DepthFCN and a novel FPGA architecture that exploits its sparsity. Moreover, dense convolutions and pruned separable convolutions are implemented on a shared convolutional circuit to achieve high hardware efficiency and a high degree of parallelism. We compare the proposed FPGA-based system with the Jetson TX2. The FPGA accelerator achieves 123.6 FPS with 0.3 W power consumption for a 256×256 image, and its accuracy is 76.2%. Compared with the mobile GPU, it is 1.5 times faster and its power consumption is 20 times lower. We demonstrate the fastest monocular depth estimation using a low-cost FPGA board suitable for embedded systems.
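As a rough illustration of the filter-wise pruning idea (a minimal sketch, not the authors' implementation; the layer shapes, the `keep_ratio` parameter, and the L1-norm ranking criterion are all assumptions for exposition), the snippet below zeroes entire output filters of the pointwise stage of a separable convolution. Pruning at whole-filter granularity keeps the surviving weights dense and regular, which is what lets a hardware MAC array skip pruned filters without irregular indexing.

```python
# Hypothetical sketch of filter-wise pruning for a 1x1 (pointwise)
# convolution layer. Not the paper's code: shapes, keep_ratio, and
# the L1-norm importance criterion are illustrative assumptions.
import numpy as np

def filter_wise_prune(pointwise_weights: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Zero out whole output filters of a pointwise convolution.

    pointwise_weights: shape (out_channels, in_channels), one row per filter.
    keep_ratio:        fraction of filters to keep, e.g. 0.5.
    """
    out_channels = pointwise_weights.shape[0]
    n_keep = max(1, int(out_channels * keep_ratio))
    # Rank filters by L1 norm; small-norm filters contribute least.
    norms = np.abs(pointwise_weights).sum(axis=1)
    keep = np.argsort(norms)[-n_keep:]
    mask = np.zeros(out_channels, dtype=bool)
    mask[keep] = True
    # Whole rows (filters) are zeroed, so the remaining weights stay dense.
    return pointwise_weights * mask[:, None]

# Example: 64 filters over 32 input channels, keep half of them.
w = np.random.randn(64, 32).astype(np.float32)
w_pruned = filter_wise_prune(w, keep_ratio=0.5)
print((np.abs(w_pruned).sum(axis=1) > 0).sum(), "filters remain")
```

Because sparsity is expressed per filter rather than per weight, a shared convolutional circuit can run dense convolutions and pruned separable convolutions with the same regular dataflow, simply iterating over fewer filters in the pruned case.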