PrTaurus: An Availability-Enhanced EMR Service on Preemptible Cloud Instances

2020 
EMR (Elastic MapReduce) is a service offered by mainstream cloud vendors that lets data processing users obtain well-managed Hadoop YARN clusters directly on the cloud. A preemptible instance is a kind of cloud server that is cheap but may be reclaimed by the cloud vendor without warning. Running EMR clusters on preemptible instances relies on YARN's own fault tolerance, which is limited. In this paper, we present PrTaurus, an availability-enhanced EMR service on preemptible instances. PrTaurus integrates a Docker-based system-level checkpoint capability into YARN to further improve its fault tolerance. In addition, PrTaurus's scheduling strategy takes advantage of Alibaba Cloud's one-hour protection policy. Furthermore, we propose a new method for selecting cluster instances that jointly considers cost-efficiency, preemption risk, and overhead. We evaluated PrTaurus through simulations on real-world workload traces and instance price traces. Experimental results show that, compared with existing EMR clusters running on preemptible instances, PrTaurus significantly reduces cost (13.0%-74.6%), instance preemptions (60.3%-88.9%), and task preemptions (86.0%-98.6%).