Stochastic scheduling with compatible job families by an improved Q-learning algorithm

2017 
We consider a system in which a single batch processing machine serves compatible job families (jobs from different families can be processed together in the same batch). The system is characterized by random job arrivals, random processing times, and unlimited buffer capacities. The objective is to find a scheduling policy that minimizes the long-run average cycle time. First, the stochastic scheduling problem is modeled as a continuous-time Markov decision process (CTMDP) with an average-cost criterion. Then, a Q-learning algorithm combined with a simulated annealing technique (SA-Q learning) is used to derive an optimal or near-optimal policy for scheduling the batch processing machine. Computational simulations on randomly generated instances show the effectiveness of the method.
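The abstract does not spell out how simulated annealing enters the Q-learning loop. One common way to combine them, shown here only as a minimal sketch and not as the authors' algorithm, is to replace epsilon-greedy exploration with a Metropolis acceptance test on a randomly proposed action, with a temperature that is cooled over time. The function names (`sa_select_action`, `cool`), the cost-minimizing Q-table row, and the geometric cooling rate are all assumptions made for illustration.

```python
import math
import random


def sa_select_action(q_row, temperature, rng):
    """Choose an action index from one row of a cost Q-table.

    Hypothetical helper: q_row[a] is the estimated long-run cost of
    action a in the current state, so lower values are better.
    """
    # Greedy action: the one with the lowest estimated cost.
    greedy = min(range(len(q_row)), key=lambda a: q_row[a])
    if temperature <= 0.0:
        return greedy

    # Propose a uniformly random action as the exploratory candidate.
    candidate = rng.randrange(len(q_row))

    # Extra estimated cost of the candidate over the greedy action (>= 0).
    delta = q_row[candidate] - q_row[greedy]

    # Metropolis criterion: accept the worse action with probability
    # exp(-delta / T); high T explores freely, low T is nearly greedy.
    accept_prob = math.exp(-delta / temperature)
    return candidate if rng.random() < accept_prob else greedy


def cool(temperature, rate=0.95):
    """Geometric cooling schedule (the rate is an assumed value)."""
    return temperature * rate
```

In a scheduling loop, the row `q_row` would hold the estimated costs of the batching decisions available in the current state, and `cool` would be applied after each decision epoch or episode so that the policy drifts from exploration toward greedy selection.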