INFER: INtermediate representations for FuturE pRediction

2019 
In urban driving scenarios, forecasting the future trajectories of surrounding vehicles is of paramount importance. While several approaches to the problem have been proposed, the best-performing ones tend to require extremely detailed input representations (e.g., image sequences) and, as a result, do not generalize to datasets they have not been trained on. In this paper, we propose intermediate representations that are particularly well-suited for future prediction. Rather than using texture (color) information from images, we condition on semantics and train an autoregressive model to accurately predict the future trajectories of traffic participants (vehicles). We demonstrate that semantics provide a significant boost over techniques that operate on raw pixel intensities/disparities. Unlike state-of-the-art approaches, our representations and models generalize across sensing modalities (stereo imagery, LiDAR, or a combination of both), across completely different datasets collected in several cities, and even across countries with opposite driving conventions (left-hand vs. right-hand traffic). Additionally, we demonstrate an application of our approach to multi-object tracking (data association). To foster further research in transferable representations and ensure reproducibility, we release all our code and data. More qualitative and quantitative results, along with code and data, can be found at https://rebrand.ly/INFER-results
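To make the core idea concrete, the sketch below illustrates one possible way to set up autoregressive future prediction over semantic bird's-eye-view (BEV) channels, in the spirit described above. This is not the authors' implementation; all names (e.g., TrajectoryPredictor, rollout, the channel layout) are hypothetical placeholders, and the network is deliberately minimal.

```python
# Minimal sketch (not the authors' model) of autoregressive trajectory
# prediction conditioned on semantic BEV channels instead of raw pixels.
# Assumed channel layout: [road, lane, obstacles, target-vehicle] (hypothetical).
import torch
import torch.nn as nn


class TrajectoryPredictor(nn.Module):
    """Maps a stack of semantic BEV channels to per-cell logits for the
    target vehicle's position at the next time step."""

    def __init__(self, in_channels: int = 4, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 1, 1),  # one logit per grid cell
        )

    def forward(self, bev: torch.Tensor) -> torch.Tensor:
        return self.net(bev)


def rollout(model: nn.Module, bev: torch.Tensor, steps: int) -> list:
    """Autoregressive rollout: the predicted target-vehicle channel at step t
    is fed back as input for step t + 1, while static semantics stay fixed."""
    preds = []
    frame = bev.clone()
    for _ in range(steps):
        logits = model(frame)                                   # (B, 1, H, W)
        b, _, h, w = logits.shape
        prob = torch.softmax(logits.view(b, -1), dim=-1).view(b, 1, h, w)
        preds.append(prob)
        # Replace only the (assumed last) target-vehicle channel with the prediction.
        frame = torch.cat([frame[:, :-1], prob], dim=1)
    return preds


if __name__ == "__main__":
    model = TrajectoryPredictor(in_channels=4)
    bev = torch.rand(1, 4, 64, 64)  # toy semantic grid: road/lane/obstacle/target
    future = rollout(model, bev, steps=5)
    print(len(future), future[0].shape)  # 5 torch.Size([1, 1, 64, 64])
```

The key property this illustrates is that the model never sees raw image texture: it consumes only semantic channels, and at inference time its own predictions are recycled as input, which is what makes the forecast autoregressive over the prediction horizon.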