Learning to Translate Between Real World and Simulated 3D Sensors While Transferring Task Models

2019 
Learning-based vision tasks are usually specialized for the sensor technology for which data has been labeled. The knowledge captured by a learned model becomes useless when the input data differs from the data on which the model was originally trained, or when the model is applied to an entirely different imaging or sensor source. In that case, new labeled data must be acquired on which a new model can be trained. Depending on the sensor, this becomes even more difficult when the sensor data is abstract and hard for humans to interpret and label. Enabling the reuse of models trained for a specific task across different sensors minimizes the data acquisition effort. This work therefore focuses on learning sensor models and translating between them, aiming for sensor interoperability. We show that even for the complex task of human pose estimation from 3D depth data recorded with different sensors, i.e. a simulated depth sensor and a Kinect 2™ depth sensor, pose estimation can be greatly improved by translating between sensor models without modifying the original task model. This approach especially benefits sensors and applications for which labels and models are difficult, if not impossible, to obtain from raw sensor data.
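To illustrate the idea of sensor translation with an unchanged task model, the following is a minimal hypothetical sketch (not the authors' code): a small encoder-decoder translates real Kinect 2 depth frames into the simulated sensor's depth domain, and the pose model trained only on simulated data is then applied without modification. All class and function names here (DepthTranslator, estimate_pose) are illustrative assumptions.

```python
# Hypothetical sketch: translate real depth into the simulated domain,
# then reuse the frozen task model trained on simulated data.
import torch
import torch.nn as nn


class DepthTranslator(nn.Module):
    """Small encoder-decoder mapping real depth maps (1 x H x W) into the
    simulated sensor's depth domain (illustrative architecture only)."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, real_depth):
        return self.decoder(self.encoder(real_depth))


def estimate_pose(task_model, translator, kinect_depth):
    """Translate a Kinect 2 depth frame, then run the unmodified pose model."""
    task_model.eval()  # the original task model stays frozen
    translator.eval()
    with torch.no_grad():
        simulated_like = translator(kinect_depth)
        return task_model(simulated_like)
```

The key design point the abstract describes is that only the translator is learned for the new sensor; the task model itself, trained on the simulated sensor's labeled data, is never retrained or fine-tuned.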