Thread-Placement Learning

2020 
In a non-uniform memory access (NUMA) machine, the placement of software threads on hardware cores can have a significant effect on the performance of concurrent applications. Detecting the best possible placement for each application is a necessity for thread scheduling. Yet, due to the difficulty of this problem, operating-system schedulers do not really try to understand the needs of applications, but rather focus on (non-portable) scheduling heuristics.

In this paper, we introduce thread-placement learning (TPLE), a technique for understanding the placement requirements of applications. TPLE utilizes machine learning and performance counters for choosing between different placement policies. To feed the machine learning model, TPLE requires a set of portable microbenchmarks that produce training data (i.e., performance counter measurements) for all the target placement policies. We use this data to train a classifier that is able to choose between these policies online in order to change the thread placement of a running application.

We demonstrate the practicality of TPLE by implementing a thread-placement algorithm, named Slate. Slate is able to select automatically and online (i.e., at runtime) between the two most commonly used placement policies, namely locality and round-robin placement on the nodes of a multicore. To the best of our knowledge, Slate is the first online thread-placement algorithm that utilizes machine learning in combination with performance counters. We evaluate Slate and show that it achieves up to 93% accuracy in its decisions and outperforms the Linux scheduler by up to 16%.
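
As a rough illustration of the idea of training a classifier on performance-counter measurements and then querying it at runtime, the following minimal sketch may help. It is not Slate's implementation: the abstract does not say which classifier or which counters are used, so a nearest-centroid classifier stands in here, and the two features, their values, and the function names `train` and `choose_policy` are all hypothetical.

```python
from statistics import mean

# Hypothetical training data. Each sample pairs a vector of normalized
# performance-counter readings with the placement policy that performed
# best on the microbenchmark that produced it. The meaning of the two
# features is an assumption made for this sketch only.
TRAINING = [
    ((0.9, 0.1), "locality"),
    ((0.8, 0.2), "locality"),
    ((0.2, 0.9), "round-robin"),
    ((0.1, 0.7), "round-robin"),
]


def train(samples):
    """Compute one centroid (per-feature mean) for each policy label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def choose_policy(centroids, features):
    """Return the policy whose centroid is nearest (squared Euclidean)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


centroids = train(TRAINING)
```

In an online setting, a scheduler of this kind would periodically sample the counters of the running application, call something like `choose_policy(centroids, sample)`, and migrate threads only when the predicted policy differs from the current one.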