XCS-CR: determining accuracy of classifier by its collective reward in action set toward environment with action noise

2018 
Accuracy based Learning Classifier System (XCS) prefers to generalize the classifiers that always acquire the same reward, because they make accurate reward predictions. However, real-world problems have noise, which means that classifiers may not receive the same reward even if they always take the correct action. For this case, since all classifiers acquire multiple values as the reward, XCS cannot identify accurate classifiers. In this paper, we study a single step environment with action noise, where XCS's action is sometimes changed at random. To overcome this problem, this paper proposes XCS based on Collective weighted Reward (XCS-CR) to identify the accurate classifiers. In XCS each rule predicts its next reward by averaging its past rewards. Instead, XCS-CR predicts its next reward by selecting a reward from the set of past rewards, by comparing the past rewards to the collective weighted average reward of the rules matching the current input for each action. This comparison helps XCS-CR identify rewards that result from action noise. In experiments, XCS-CR acquired the optimal generalized classifier subset in 6-Multiplexer problems with action noise, similar to the environment without noise, and judged those optimal generalized classifiers correctly accurate.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    16
    References
    2
    Citations
    NaN
    KQI
    []