Walltime Prediction and Its Impact on Job Scheduling Performance and Predictability

2020 
For more than two decades, researchers have analyzed the impact of inaccurate job walltime (runtime) estimates on the performance of job scheduling algorithms, especially backfilling. In this paper, we extend this existing work by focusing on the overall impact that improved walltime estimates have on both job scheduling performance and predictability. We evaluate this impact in several steps. First, we present a simple walltime predictor and analyze its accuracy with respect to the original user walltime estimates captured in real-life workload traces. Next, we use these traces and a simulator to measure the impact of improved estimates on general performance (backfilling ratio and wait time) as well as on predictability. We show that even a simple predictor can significantly decrease user-based errors in runtime estimates, while also slightly improving job wait times and the backfilling ratio. Concerning predictability, we show that the walltime predictor significantly decreases errors in job wait time forecasting while having little effect on the ability of the scheduler to provide solid advance predictions about which nodes will be used by a given waiting job.