Learning Workflow Scheduling on Multi-Resource Clusters

2019 
Workflow scheduling is one of the key issues in the management of workflow execution. Typically, a workflow application can be modeled as a Directed-Acyclic Graph (DAG). In this paper, we present GoDAG, an approach that can learn to well schedule workflows on multi-resource clusters. GoDAG directly learns the scheduling policy from experience through deep reinforcement learning. In order to adapt deep reinforcement learning methods, we propose a novel state representation, a practical action space and a corresponding reward definition for workflow scheduling problem. We implement a GoDAG prototype and a simulator to simulate task running on multi-resource clusters. In the evaluation, we compare the GoDAG with three state-of-the-art heuristics. The results show that GoDAG outperforms the baseline heuristics, leading to less average makespan to different workflow structures.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    27
    References
    2
    Citations
    NaN
    KQI
    []