A Hierarchical Constrained Reinforcement Learning for Optimization of Bitumen Recovery Rate in a Primary Separation Vessel

2020 
Abstract: This work proposes a two-level hierarchical constrained control structure for reinforcement learning (RL), applied to a Primary Separation Vessel (PSV). The lower level handles servo tracking and regulation of the interface level against variations in ore quality by manipulating the middlings flow rate. The higher level implements supervisory control of the interface level setpoint, with the objective of optimizing the bitumen recovery rate. To prevent sanding, regulation of the tailings density by manipulating the tailings withdrawal flow rate is proposed. For each control loop, an asynchronous advantage actor-critic (A3C) based agent interacts with a high-fidelity PSV model to learn a near-optimal control strategy through episodic interactions. The three control loops are learnt sequentially. In the interface level control loop, a two-phase learning scheme based on behavioral cloning is proposed to promote stable exploration of the state space. The proposed hierarchical structure demonstrates an improved bitumen recovery rate by manipulating the interface level while preventing sanding.
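To make the layering of the three loops concrete, the following is a minimal sketch of how a supervisory setpoint loop can sit above an interface level loop and run alongside a tailings density loop. Everything in it is an assumption for illustration: the names (`PSVState`, `supervisory_policy`, `level_policy`, `density_policy`, `toy_plant_step`), the setpoint bounds, the density target, and the one-line toy dynamics are hypothetical, and simple proportional rules stand in for the trained A3C agents and the high-fidelity PSV model described in the abstract.

```python
from dataclasses import dataclass


@dataclass
class PSVState:
    interface_level: float   # normalized interface level (hypothetical scale)
    tailings_density: float  # normalized tailings density (hypothetical scale)


# Hypothetical constraint bounds and target; not taken from the paper.
LEVEL_MIN, LEVEL_MAX = 0.2, 0.8
DENSITY_TARGET = 0.5


def supervisory_policy(state: PSVState) -> float:
    """Stand-in for the higher-level agent: propose an interface level
    setpoint, then clip it so the constraint is always respected."""
    proposed = 0.9  # deliberately outside the bounds to show the clipping
    return min(max(proposed, LEVEL_MIN), LEVEL_MAX)


def level_policy(state: PSVState, setpoint: float) -> float:
    """Stand-in for the lower-level agent: middlings flow adjustment
    that drives the interface level toward the setpoint."""
    return 0.5 * (state.interface_level - setpoint)


def density_policy(state: PSVState) -> float:
    """Stand-in for the sanding-prevention agent: tailings withdrawal
    adjustment that drives tailings density toward its target."""
    return 0.5 * (state.tailings_density - DENSITY_TARGET)


def toy_plant_step(state: PSVState, middlings_adj: float,
                   tailings_adj: float) -> PSVState:
    """Toy dynamics only: each adjustment directly lowers its variable."""
    return PSVState(
        interface_level=state.interface_level - middlings_adj,
        tailings_density=state.tailings_density - tailings_adj,
    )


def run_episode(steps: int = 40, supervisory_period: int = 10):
    """Run one episode; the supervisory loop acts every
    `supervisory_period` steps, the two lower loops act every step."""
    state = PSVState(interface_level=0.3, tailings_density=0.7)
    setpoint = state.interface_level
    for t in range(steps):
        if t % supervisory_period == 0:
            setpoint = supervisory_policy(state)
        state = toy_plant_step(
            state,
            level_policy(state, setpoint),
            density_policy(state),
        )
    return state, setpoint


if __name__ == "__main__":
    final_state, final_setpoint = run_episode()
    print(final_state, final_setpoint)
```

In this sketch the supervisory loop runs on a slower timescale than the two regulating loops, which is one common way to arrange a hierarchical controller; the abstract does not state the timescales or constraint handling actually used.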