Fast Semantic Segmentation for Scene Perception

2019 
Semantic segmentation is a challenging problem in computer vision. Many applications, such as autonomous driving and robot navigation in urban road scenes, require segmentation that is both accurate and efficient, yet most state-of-the-art methods focus on accuracy rather than efficiency. In this paper, we propose a more efficient neural network architecture, with far fewer parameters, for semantic segmentation of urban road scenes. Our model uses an asymmetric encoder–decoder structure based on ResNet. In the first stage of the encoder, continuous factorized blocks extract low-level features; continuous dilated blocks in the second stage enlarge the receptive field while keeping the model small and shallow. The decoder upsamples the downsampled encoder features to the same resolution as the input image and refines the details. Our model can be trained end-to-end, pixel-to-pixel, from scratch without pretraining. It has only $0.2M$ parameters, roughly $100\times$ fewer than models such as SegNet. Experiments on five public road scene datasets (CamVid, CityScapes, Gatech, KITTI Road Detection, and KITTI Semantic Segmentation) demonstrate that our model achieves better performance.
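The abstract does not give layer-level details, but the two encoder components it names are well-established patterns: a factorized residual block (a 3×3 convolution split into 3×1 and 1×3) and its dilated variant, which grows the receptive field without extra downsampling or parameters. Below is a minimal PyTorch sketch of how such blocks might look; the channel counts, block depths, and dilation rates are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class FactorizedBlock(nn.Module):
    """Hypothetical residual block with a factorized 3x3 convolution
    (3x1 followed by 1x3), optionally dilated. Layer details are
    assumptions; the abstract does not specify them."""
    def __init__(self, channels, dilation=1):
        super().__init__()
        # Padding of `dilation` along the kernel axis preserves spatial size.
        self.conv3x1 = nn.Conv2d(channels, channels, (3, 1),
                                 padding=(dilation, 0),
                                 dilation=(dilation, 1), bias=False)
        self.conv1x3 = nn.Conv2d(channels, channels, (1, 3),
                                 padding=(0, dilation),
                                 dilation=(1, dilation), bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv3x1(x)))
        out = self.bn2(self.conv1x3(out))
        return self.relu(out + x)  # ResNet-style residual connection

# Stage 1: stacked factorized blocks for low-level features.
# Stage 2: increasing dilation rates enlarge the receptive field
# while the model stays small and shallow.
stage1 = nn.Sequential(*[FactorizedBlock(64) for _ in range(3)])
stage2 = nn.Sequential(*[FactorizedBlock(64, d) for d in (2, 4, 8)])

x = torch.randn(1, 64, 128, 256)
y = stage2(stage1(x))  # same spatial size: (1, 64, 128, 256)
```

Factorizing each 3×3 convolution into a 3×1/1×3 pair reduces its parameters by roughly a third, which is consistent with the paper's emphasis on a small ($0.2M$-parameter) model.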