Policy Transition of Reinforcement Learning for an Agent Based SCM System

2006 
Reinforcement learning (RL) has been applied successfully to some dynamic and unpredictable domains. Supply chain management (SCM) is an NP-hard problem. Some proposed RL methods outperform traditional tools for dynamic problem solving in SCM. RL enables on-line learning and performs efficiently in some applications, but an RL agent reacts worse than some heuristic methods to sudden changes in SCM demand, because the trial-and-error nature of RL is time-consuming in practice. By surveying an efficient policy transition mechanism in RL, which maps existing policies from a previous task to new policies for a changed task, this paper proposes a novel RL agent based SCM system that reduces the time the RL agent needs to learn in a dynamic environment. As a result, the RL agent derives the maximal profit using RL when jobs arrive with a stable distribution. Furthermore, through the policy transition mechanism, the RL agent makes the optimal procurement decisions that satisfy the requirements of sudden changes in the supply chain network.
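The idea of reusing a policy from a previous task when demand changes can be illustrated with a minimal sketch. The abstract does not specify the mapping used, so this example assumes the simplest possible transition: the Q-table learned under the old demand distribution is copied as the starting point for learning under the new one (a warm start). The toy single-item inventory model, the prices and costs, and the names `train_q` and `greedy_policy` are all hypothetical and are not taken from the paper.

```python
import random

def train_q(demand_levels, demand_probs, episodes, q_init=None,
            max_stock=5, max_order=3, steps=50,
            alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy inventory task (hypothetical model).

    State: stock level 0..max_stock. Action: order quantity 0..max_order.
    """
    rng = random.Random(seed)
    # Policy transition (assumed here to be a plain warm start):
    # copy the previous task's Q-table if one is given, else start at zero.
    if q_init is None:
        q = [[0.0] * (max_order + 1) for _ in range(max_stock + 1)]
    else:
        q = [row[:] for row in q_init]
    for _ in range(episodes):
        stock = 0
        for _ in range(steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                order = rng.randint(0, max_order)
            else:
                order = max(range(max_order + 1), key=lambda a: q[stock][a])
            demand = rng.choices(demand_levels, weights=demand_probs)[0]
            available = min(stock + order, max_stock)  # excess orders are lost
            sold = min(available, demand)
            next_stock = available - sold
            # Assumed economics: revenue 4 per unit sold, purchase cost 2
            # per unit ordered, holding cost 0.5 per unit left in stock.
            reward = 4.0 * sold - 2.0 * order - 0.5 * next_stock
            target = reward + gamma * max(q[next_stock])
            q[stock][order] += alpha * (target - q[stock][order])
            stock = next_stock
    return q

def greedy_policy(q):
    """Return the greedy order quantity for each stock level."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]

# Learn under a stable demand distribution, then continue from the same
# Q-table after demand shifts upward, using far fewer episodes.
old_q = train_q([0, 1, 2], [0.2, 0.6, 0.2], episodes=200)
new_q = train_q([1, 2, 3], [0.2, 0.3, 0.5], episodes=20, q_init=old_q)
print(greedy_policy(old_q))
print(greedy_policy(new_q))
```

Whether a warm start actually shortens learning depends on how similar the old and new tasks are. A mapping between tasks with different state or action spaces would need an explicit translation step that this sketch omits.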