Explore Be-Nice Instruction Scheduling in Open64 for an Embedded SMT Processor

2008 
An SMT processor can fetch and issue instructions from multiple independent hardware threads in every CPU cycle. Hardware resources are therefore shared among the concurrently running threads at a very fine granularity, which can increase the utilization of the processor pipeline. However, the concurrently running threads in an SMT processor may interfere with each other and stall the CPU pipeline. We call this kind of pipeline stall an inter-thread stall (ITS for short) or thread interlock. In this paper, we present our study of the ITS problem on an embedded heterogeneous SMT processor. Our experiments demonstrate that, for some test cases, 50% of the total pipeline stalls are caused by ITS. We have therefore developed a new instruction scheduling algorithm called be-nice instruction scheduling, based on Open64 Global Code Motion, to coordinate the conflicts between concurrent threads. The instruction scheduler uses thread interference information (obtained by profiling) as a heuristic to decrease the number of ITS without sacrificing overall CPU performance. The experimental results show that, for our current test cases, the be-nice instruction scheduler can reduce inter-thread stall cycles by 15% and increase the IPC of the critical thread by 2%-3%. The experiments are performed using the Open64 compiler infrastructure.