Effect of Hyper-Threading in Latency-Critical Multithreaded Cloud Applications and Utilization Analysis of the Major System Resources

2022 
Multithreaded latency-critical applications represent an important subset of workloads running on public cloud systems. Most of these systems deploy powerful computing servers, including processors with Intel Hyper-Threading. Understanding how performance is affected by the consumption of the main system resources is a major concern for cloud providers, who need this knowledge to devise virtualization strategies that improve system efficiency. With this aim, this paper first characterizes the impact of the query rate (queries per second, QPS) on tail latency, analyzing different scenarios that vary the number of threads and the thread-to-core allocation policy (single-task and multi-task execution). The characterization study reveals that the performance of some applications does not scale with the number of threads, and that the performance of others is insensitive to Hyper-Threading, so they can be allocated to fewer physical cores, improving system utilization. Although identifying these applications at run-time is challenging, this paper shows that they can be successfully detected by analyzing the utilization trend of the major system resources. In addition to CPU, we have also studied how the share of other major shared system resources assigned to each application affects performance. We outline considerations that cloud providers should take into account to improve performance and resource utilization.