Distributed Learning for the Decentralized Control of Articulated Mobile Robots

2018 
Decentralized control architectures, such as those conventionally defined by central pattern generators, independently coordinate spatially distributed portions of articulated bodies to achieve system-level objectives. State-of-the-art distributed algorithms for reinforcement learning employ a different but conceptually related idea: independent agents simultaneously coordinate their own behaviors in parallel environments while asynchronously updating the policy of a system-level, or rather meta-level, agent. To the best of the authors' knowledge, this work is the first to explicitly explore the relationship between the concepts underlying homogeneous decentralized control for articulated locomotion and those underlying distributed learning. We present an approach that leverages the structure of the asynchronous advantage actor-critic (A3C) algorithm to provide a natural framework for learning decentralized control policies on a single platform. Our primary contribution is to show that an individual agent in the A3C algorithm can be defined as an independently controlled portion of the robot's body, thus enabling distributed learning on a single platform for efficient hardware implementation. To this end, we show how the system is trained offline using hardware experiments that implement an autonomous decentralized compliant control framework. Our experimental results show that the trained agent outperforms the compliant control baseline by more than 40% in terms of steady progression through a series of randomized, highly cluttered evaluation environments.
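
To make the agent-per-segment mapping concrete, the following is a minimal, hypothetical Python sketch of the A3C-style setup described above: each asynchronous worker stands in for one independently controlled body segment, and all workers asynchronously update a single shared (meta-level) policy. The linear softmax policy, the stub SegmentEnv dynamics, and all dimensions, names, and learning rates are illustrative assumptions, not the paper's hardware implementation.

import threading
import numpy as np

# Assumed sizes for illustration only.
N_SEGMENTS = 4        # number of independently controlled body portions
OBS_DIM = 6           # local observation per segment (assumed)
N_ACTIONS = 3         # discrete local actions per segment (assumed)
STEPS = 200
GAMMA, LR = 0.99, 1e-2

lock = threading.Lock()

# Shared meta-level parameters: linear policy logits and value function.
theta = np.zeros((OBS_DIM, N_ACTIONS))
w = np.zeros(OBS_DIM)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

class SegmentEnv:
    """Stub environment for one body segment (stand-in for hardware rollouts)."""
    def __init__(self, seed):
        self.rng = np.random.default_rng(seed)
        self.s = self.rng.normal(size=OBS_DIM)
    def step(self, a):
        # Toy dynamics: reward arbitrarily favors action 0.
        r = 1.0 if a == 0 else -0.1
        self.s = self.rng.normal(size=OBS_DIM)
        return self.s.copy(), r

def worker(seg_id):
    global theta, w
    env = SegmentEnv(seg_id)
    s = env.s.copy()
    for _ in range(STEPS):
        with lock:                        # snapshot shared parameters
            th, wv = theta.copy(), w.copy()
        p = softmax(s @ th)
        a = env.rng.choice(N_ACTIONS, p=p)
        s2, r = env.step(a)
        # One-step advantage estimate: A = r + gamma*V(s') - V(s).
        adv = r + GAMMA * (s2 @ wv) - (s @ wv)
        grad_logp = np.outer(s, np.eye(N_ACTIONS)[a] - p)
        with lock:                        # asynchronous update of shared policy
            theta += LR * adv * grad_logp
            w += LR * adv * s
        s = s2

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_SEGMENTS)]
for t in threads: t.start()
for t in threads: t.join()
print("trained shared policy norm:", np.linalg.norm(theta))

In this sketch the threads play the role that parallel environments play in standard A3C; the decentralized-control reading is that each thread is a body segment acting on its own local observations while contributing gradients to one system-level policy.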