Augmenting Reinforcement Learning with a Planning Model for Optimizing Energy Demand Response

2020 
While reinforcement learning (RL) with human subjects has shown considerable promise, it often suffers from a scarcity of data and a small number of learning steps. In such settings, a planning model of human behavior can be of great help. We present an experimental setup for developing and testing a Soft Actor Critic (SAC) V2 RL architecture paired with several different neural architectures for the planning model: an AutoML-optimized LSTM, an OLS model, and a baseline model. We report the effects of including a planning model in agent learning within an office simulation, currently observing limited success with the LSTM.
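The sketch below illustrates one common way a learned planning model can augment an off-policy agent such as SAC: the model predicts the next state (e.g., the simulated occupants' response), and the imagined transitions are added to the agent's replay buffer alongside real experience. This is a minimal, hedged illustration of the general idea, not the authors' released code; `PlanningLSTM`, `agent.select_action`, `reward_fn`, and `buffer.add` are assumed placeholder names.

```python
# Minimal sketch (assumed names, not the paper's code) of augmenting an
# off-policy SAC agent with a learned LSTM planning model, Dyna-style.

import torch
import torch.nn as nn

class PlanningLSTM(nn.Module):
    """One-step dynamics model: (state, action) -> predicted next state."""
    def __init__(self, state_dim, action_dim, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(state_dim + action_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, state_dim)

    def forward(self, state, action):
        x = torch.cat([state, action], dim=-1).unsqueeze(1)  # (batch, 1, feat)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])                         # (batch, state_dim)

def imagine_rollout(model, agent, reward_fn, buffer, start_state, horizon=5):
    """Roll the planning model forward from a real state and push the
    imagined transitions into the SAC replay buffer (placeholder APIs)."""
    state = start_state
    for _ in range(horizon):
        action = agent.select_action(state)        # assumed SAC policy call
        with torch.no_grad():
            next_state = model(state, action)
        reward = reward_fn(state, action, next_state)
        buffer.add(state, action, reward, next_state)
        state = next_state
```

Under this kind of scheme, the planning model supplies additional synthetic transitions, which is the motivation for using it when real interaction data and learning steps are scarce.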