Bayesian multimodal integration in a robot replicating human head and eye movements

2014 
Autonomous robots need to make sense of their surroundings, recognize objects, and detect and, possibly, identify the people around them. Visual perception would be ideally suited for these tasks. Yet, visual perception remains one of the limiting factors of modern robotics: artificial vision systems tend to perform poorly in normal environments, where the scene and the illumination conditions are unpredictable. Evolution has faced similar problems, leading to surprisingly accurate visual capabilities, even in species with very small brains. We argue that the success of biological perception systems relies on three fundamental computational principles: (a) the continual coupling of perception and behavior; (b) the resulting emergence of multimodal cues; and (c) their efficient integration. Building on these computational principles, we describe a humanoid robot that emulates the dynamic strategy by which humans examine a visual scene. The proprioceptive and visual depth cues resulting from this strategy are integrated in a statistically optimal manner into a unified representation. We show that this approach yields accurate and robust 3D representations of the observed scene.
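The abstract does not spell out the fusion equations, but for independent Gaussian cues the statistically optimal (maximum-likelihood) combination is the inverse-variance-weighted average of the individual estimates. The sketch below illustrates this for a hypothetical pair of depth cues, one visual (e.g., stereo) and one proprioceptive (derived from the head/eye configuration during fixation); the variable names and numbers are illustrative, not taken from the paper.

```python
import numpy as np

def fuse_gaussian_cues(estimates, variances):
    """Inverse-variance-weighted (maximum-likelihood) fusion of
    independent Gaussian cues. Returns the fused mean and variance."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = 1.0 / variances              # each cue weighted by its reliability
    fused_var = 1.0 / weights.sum()        # fused estimate is more precise than any single cue
    fused_mean = fused_var * (weights * estimates).sum()
    return fused_mean, fused_var

# Hypothetical depth cues for one scene point (metres):
visual_depth, visual_var = 1.20, 0.04      # visual (stereo) estimate
proprio_depth, proprio_var = 1.35, 0.09    # proprioceptive estimate from fixation geometry

depth, var = fuse_gaussian_cues([visual_depth, proprio_depth],
                                [visual_var, proprio_var])
print(f"fused depth = {depth:.3f} m, variance = {var:.4f}")
```

Because the weights are the inverse variances, the fused variance is always smaller than either input variance, which is what makes the combined 3D estimate more robust than either cue alone.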