An efficient dynamic load-balancing algorithm in a large-scale cluster

2005 
Random stealing is a well-known dynamic load-balancing algorithm. However, for a large-scale cluster, the simple random stealing policy is no longer efficient because an idle node must randomly steal many times to obtain a task from another node. This will not only increase the idle time for all nodes but also produce a heavy network communication overhead. In this paper, we propose a novel dynamic load-balancing algorithm, Transitive Random Stealing (TRS), which can make any idle node obtain a task from another node with much fewer stealing times in a large-scale cluster. A probabilistic model is constructed to analyze the performance of TRS, random stealing and Shis, one of load balance policies in the EARTH system. Finally, by the random baseline technique, an experiment designed to compare TRS with Shis and random stealing for five different load distributions in the Tsinghua EastSun cluster convinces us that TRS is a highly efficient dynamic load-balancing algorithm in a large-scale cluster.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    13
    References
    5
    Citations
    NaN
    KQI
    []