Less Provisioning: A Fine-grained Resource Scaling Engine for Long-running Services with Tail Latency Guarantees

2018 
Modern resource management frameworks guarantee low tail latency for long-running services by over-provisioning resources, which wastes resources and greatly increases service costs. To reduce the cost of over-provisioning, we present EFRA, an elastic and fine-grained resource allocator that provisions resources much more efficiently while still guaranteeing the tail latency Service Level Objective (SLO). EFRA achieves this through the cooperation of three key components running on a containerized platform. The period detector identifies the periodic features of the workload through a convolution-based time series analysis. The resource reservation component uses the period analysis to estimate the just-right amount of resources through a top-K based collaborative filtering approach. The online reprovisioning component dynamically adjusts the resources to further enforce the tail latency SLO. Testbed experiments show that EFRA increases the average resource utilization to 43% and saves up to 66% of resources while guaranteeing the same tail latency objective.
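To make the period detection step concrete, the following is a minimal sketch of how a workload period might be recovered from a load trace. It uses plain autocorrelation (the series correlated with lagged copies of itself) as a stand-in for the convolution-based analysis; the abstract does not specify EFRA's actual algorithm, so the function names `autocorrelation` and `detect_period`, their parameters, and the synthetic trace are all illustrative assumptions.

```python
import math


def autocorrelation(series, lag):
    """Normalized autocorrelation of `series` at `lag` (hypothetical helper)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0  # a constant series has no periodic structure
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


def detect_period(series, max_lag=None):
    """Return (lag, score) of the strongest local autocorrelation peak.

    Assumption: the dominant period shows up as the highest local maximum
    of the autocorrelation curve. Returns (None, 0.0) if no peak is found.
    """
    n = len(series)
    if max_lag is None:
        max_lag = n // 2
    scores = [autocorrelation(series, lag) for lag in range(max_lag + 1)]
    best_lag, best_score = None, 0.0
    # Only lags with both neighbours computed can qualify as local peaks.
    for lag in range(2, max_lag):
        is_peak = scores[lag] > scores[lag - 1] and scores[lag] >= scores[lag + 1]
        if is_peak and scores[lag] > best_score:
            best_lag, best_score = lag, scores[lag]
    return best_lag, best_score


# Synthetic trace (an assumption): 10 "days" of hourly request rates
# following a 24-hour cycle.
load = [100 + 50 * math.sin(2 * math.pi * t / 24) for t in range(240)]
period, score = detect_period(load)
```

Restricting the search to local peaks avoids mistaking the high correlation at very small lags, which any smooth signal shows, for a period. A detected period could then define the window over which the reservation step estimates resource demand.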