Caption Generation from Road Images for Traffic Scene Construction

2020 
In this paper, an image captioning network is proposed for traffic scene modeling, which incorporates element attention into the encoder-decoder mechanism to generate more reasonable scene captions. First, the traffic scene elements are detected and segmented according to their clustered locations. Then, the image captioning network is applied to generate the corresponding caption for each subregion. The static and dynamic traffic elements are appropriately organized to construct a 3D corridor scene model, and the semantic relationships between the traffic elements are specified according to the captions. The constructed 3D scene model can be used for the offline testing of unmanned vehicles. Evaluations and comparisons on the TSD-max and COCO datasets demonstrate the effectiveness of the proposed framework.
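The core of the described pipeline is an attention step inside the encoder-decoder captioner: at each decoding step, the detected scene-element features are weighted against the decoder state to form a context vector. The sketch below illustrates that general mechanism with simple dot-product attention in pure Python; the function names, feature dimensions, and scoring rule are illustrative assumptions, not the paper's actual "element attention" formulation.

```python
import math

def softmax(scores):
    # Numerically stable softmax over raw attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(region_features, decoder_state):
    """One attention step over detected scene-element features.

    region_features: list of feature vectors, one per detected subregion
    decoder_state:   current decoder hidden state (same dimensionality)
    Returns (attention weights, context vector). Dot-product scoring is
    an assumption here; the paper's element attention may score differently.
    """
    scores = [sum(f * h for f, h in zip(feat, decoder_state))
              for feat in region_features]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the element features,
    # which would then condition the next caption word.
    dim = len(region_features[0])
    context = [sum(w * feat[i] for w, feat in zip(weights, region_features))
               for i in range(dim)]
    return weights, context

# Toy usage: three hypothetical subregion features and a decoder state
# most aligned with the first region, so it receives the largest weight.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
state = [1.0, 0.0]
weights, context = attend(regions, state)
```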