Programming by Visual Demonstration for Pick-and-Place Tasks using Robot Skills

2019 
In this paper, we present a vision-based robot programming system for pick-and-place tasks that can generate programs from human demonstrations. The system consists of a detection network and a program generation module. The detection network leverages convolutional pose machines to detect the key-points of the objects. The network is trained in a simulation environment in which the training set is collected and automatically labeled. To bridge the gap between reality and simulation, we propose a method for designing a transform function that maps real images to the synthesized style. Compared with the unmapped results, the Mean Absolute Error (MAE) of the model trained entirely on synthesized images is reduced by 23%, and the False Negative Rate (FNR) of the model fine-tuned with real images is reduced by 42.5% after mapping. The program generation module produces a human-readable program from the detection results to reproduce a real-world demonstration, in which a long-short memory (LSM) is designed to integrate current and historical information. The system is tested in the real world with a UR5 robot on the task of stacking colored cubes in different orders.
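The abstract does not specify the exact form of the real-to-synthesized mapping, so the following is only a minimal sketch of the general idea: preprocessing a real camera frame so it resembles the flat, low-texture images produced by a simulator before it is passed to the key-point detector. The function name `to_synthetic_style`, the bilateral-filter settings, and the color quantization are illustrative assumptions, not the transform proposed in the paper.

```python
# Hypothetical sketch of a real-to-synthetic style transform (not the paper's method).
import cv2
import numpy as np

def to_synthetic_style(bgr_image: np.ndarray, color_levels: int = 8) -> np.ndarray:
    """Map a real BGR frame toward a flat, low-texture 'rendered' appearance."""
    # Suppress sensor noise and fine texture that a simulator typically does not render.
    smoothed = cv2.bilateralFilter(bgr_image, d=7, sigmaColor=75, sigmaSpace=75)
    # Quantize colors so object surfaces become uniform patches, as in synthetic images.
    step = 256 // color_levels
    quantized = (smoothed // step) * step
    return quantized.astype(np.uint8)

if __name__ == "__main__":
    # Illustrative usage: preprocess a real frame before feeding the key-point detector.
    frame = cv2.imread("real_frame.png")  # placeholder path
    if frame is not None:
        cv2.imwrite("mapped_frame.png", to_synthetic_style(frame))
```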