XCS classifier system with experience replay

2020 
XCS constitutes the most deeply investigated learning classifier system today. It offers strong potential and comes with inherent capabilities for mastering a variety of learning tasks. Besides outstanding successes in various classification and regression tasks, XCS has also proved very effective in certain multi-step environments from the domain of reinforcement learning. Especially in the latter domain, recent advances have been driven mainly by algorithms that model their policies with deep neural networks, among which the Deep Q-Network (DQN) is a prominent representative. Experience Replay (ER) constitutes one of the crucial factors behind the DQN's successes, since it facilitates stabilized training of the neural-network-based Q-function approximators. Surprisingly, XCS barely takes advantage of similar mechanisms that leverage remembered raw experiences. To bridge this gap, this paper investigates the benefits of extending XCS with ER. We demonstrate that for single-step tasks ER yields strong improvements in sample efficiency. On the downside, however, we reveal that ER might further aggravate well-studied, still unsolved issues of XCS when applied to sequential decision problems demanding long action chains.
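The abstract does not spell out the ER mechanism, so the following is a minimal, hypothetical Python sketch of the kind of replay buffer it refers to: raw transitions are remembered in a bounded memory and later re-sampled for additional learner updates. The names ReplayBuffer, store, and sample, the capacity, and the dummy demo are assumptions for illustration, not the authors' implementation.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity FIFO memory of raw (s, a, r, s', done) transitions."""

    def __init__(self, capacity=10_000):
        self.memory = deque(maxlen=capacity)  # oldest experiences are evicted first

    def store(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling decorrelates consecutive experiences,
        # which is the stabilizing effect ER provides for DQN training.
        return random.sample(self.memory, min(batch_size, len(self.memory)))


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=100)
    # Dummy single-step transitions; a real agent would store what it
    # observes while interacting with the environment.
    for t in range(50):
        buf.store(state=t, action=t % 2, reward=1.0, next_state=t + 1, done=True)
    for s, a, r, s2, d in buf.sample(batch_size=8):
        pass  # e.g., replay each transition through the learner's update rule
```

Uniform sampling from a bounded FIFO is the classic DQN-style variant; an XCS extension along the abstract's lines would presumably replay such sampled transitions through XCS's reinforcement (credit assignment) component in place of the loop body above.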