Hermes: Improving Server Utilization by Colocation-Aware Runtime Systems

2019 
Improving server utilization is increasingly important to service providers. Latency-critical services have strict tail-latency service-level objectives, so safely colocating a latency-critical service with other workloads on the same machine is difficult. As a result, server resources are underutilized. We present Hermes, a user-level resource management layer that addresses this dilemma. Hermes implements two kinds of runtime systems: one for latency-critical workloads (LC runtime) and one for best-effort workloads (BE runtime). The LC runtime implements user-level thread management and uses a feedback-based controller to control the dedicated computing resources occupied by the latency-critical workload. The BE runtime schedules threads of best-effort workloads to take advantage of simultaneous multithreading technology. The runtime systems are aware of their colocation and cooperate to improve server utilization without violating the tail-latency service-level objectives of the latency-critical workload. Hermes is implemented entirely at user level on Linux. We evaluate Hermes using memcached and several synthetic micro-benchmarks. The results show that Hermes achieves both safe colocation and improved core utilization.
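
The abstract does not say how the feedback-based controller works, so the following is only a minimal sketch of the general idea, not Hermes's actual algorithm. It assumes a simple threshold rule: add a dedicated core when the measured tail latency exceeds the objective, and release one when latency is comfortably below it. All names here (CoreController, slo_us, slack, be_cores) and the one-core-per-step policy are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CoreController:
    """Hypothetical feedback controller; not the controller used by Hermes."""
    slo_us: float          # tail-latency objective (assumed unit: microseconds)
    min_cores: int = 1     # never shrink the LC workload below this
    max_cores: int = 8     # assumed number of cores on the machine
    slack: float = 0.8     # release a core only when latency < slack * SLO
    cores: int = 1         # cores currently dedicated to the LC workload

    def step(self, tail_latency_us: float) -> int:
        """Update the dedicated core count from one tail-latency sample."""
        if tail_latency_us > self.slo_us:
            # Objective at risk: claim one more core from best-effort work.
            self.cores = min(self.cores + 1, self.max_cores)
        elif tail_latency_us < self.slack * self.slo_us:
            # Comfortable headroom: hand one core back to best-effort work.
            self.cores = max(self.cores - 1, self.min_cores)
        # Otherwise latency is inside the hold band, so keep the allocation.
        return self.cores


def be_cores(total_cores: int, lc_cores: int) -> int:
    """Cores left over for best-effort threads under this simple split."""
    return total_cores - lc_cores
```

In this sketch the hold band between slack * slo_us and slo_us keeps the allocation from oscillating on every sample. A real controller would also need to decide how often to sample and how to place best-effort threads on sibling hardware threads, which the sketch leaves out.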