Organizing multimodal perception for autonomous learning and interactive systems

2008 
A stable perception of the environment is a crucial prerequisite for research on learning semantics from human-robot interaction, and also for generating behavior that relies on the robot's perception. In this paper, we propose several contributions to this research field. To organize visual perception, the concept of proto-objects is used to represent scene elements. These proto-objects are created by several different sources and can be combined to provide the means for interactive autonomous behavior generation. They are also processed by several classifiers, each extracting different visual properties. The robot learns to associate speech labels with these properties by using the classifier outcomes for online training of a speech recognition system. To ease the combination of visual and speech classifier outputs, which is a necessity for online training and the basis for future learning of semantics, a common representation is used for all classifier results. This uniform handling of multimodal information provides the flexibility needed for further extension. We show the feasibility of the proposed approach through interactive experiments with the humanoid robot ASIMO.
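To make the abstract's "common representation for all classifier results" concrete, here is a minimal sketch of one way such a uniform structure could look: every classifier, visual or speech, emits a confidence distribution over its labels, so a proto-object can aggregate and fuse them with one mechanism. All names and the fusion rule here are illustrative assumptions, not the authors' actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ClassifierResult:
    """Uniform output format shared by all classifiers (assumed)."""
    modality: str                  # e.g. "vision" or "speech"
    property_name: str             # e.g. "color", "shape"
    confidences: Dict[str, float]  # label -> confidence in [0, 1]

@dataclass
class ProtoObject:
    """A scene element aggregating results from several classifiers."""
    object_id: int
    results: List[ClassifierResult] = field(default_factory=list)

    def add_result(self, result: ClassifierResult) -> None:
        self.results.append(result)

    def fused_label(self, property_name: str) -> str:
        """Combine all results for one property by summing confidences
        per label across modalities (a simple stand-in fusion rule)."""
        combined: Dict[str, float] = {}
        for r in self.results:
            if r.property_name == property_name:
                for label, conf in r.confidences.items():
                    combined[label] = combined.get(label, 0.0) + conf
        return max(combined, key=combined.get)

# Usage: a visual color classifier and a speech hypothesis agree on "red".
obj = ProtoObject(object_id=1)
obj.add_result(ClassifierResult("vision", "color", {"red": 0.7, "green": 0.2}))
obj.add_result(ClassifierResult("speech", "color", {"red": 0.6, "blue": 0.3}))
print(obj.fused_label("color"))  # -> "red"
```

Because both modalities are reduced to the same result type, combining a new classifier (or using vision outputs as training labels for speech, as the paper describes) requires no modality-specific glue code; this is the flexibility the common representation is meant to provide.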