Exploring system performance using elastic traces: Fast, accurate and portable

2016 
Simulation tools are indispensable to computer architects. Detailed execution-driven CPU models offer high accuracy, but at the cost of simulation speed. Trace-driven simulation is widely adopted to alleviate this problem, especially for studies focusing on memory-system exploration. Ideally, trace-driven core models will mimic out-of-order processors executing full-system workloads to enable computer architects to evaluate modern systems. Additionally, to be useful to the broader community, the tracing and replay models should be publicly available. However, existing trace-driven approaches are limited in their applicability and availability. We propose elastic traces, in which we accurately capture data and load/store order dependencies by instrumenting a detailed out-of-order processor model. In contrast to existing work, we do not rely on offline analysis of timestamps, and instead use accurate dependency information tracked inside the processor pipeline. We thereby account for the effects of speculation and branch misprediction, resulting in a more accurate trace playback. We provide a trace player that honours the dependencies and thus adapts its execution time to memory-system changes, as would the actual CPU. Compared to the detailed CPU, our trace player achieves a speed-up of 6–8 times. When modifying the memory-system parameters, the average error in absolute execution time is 7% for SPEC 2006 benchmarks on a bare-metal system and 17% for HPC benchmarks on Linux. Relative performance is predicted with less than 3% error, achieving fast and accurate system performance exploration. We make this functionality available to the broader community via a widely used open-source full-system simulator.
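The idea of a player that honours dependencies, rather than replaying fixed timestamps, can be illustrated with a toy sketch. This is not the paper's trace format or trace player. The names `TraceRecord`, `replay`, `comp_delay` and `mem_latency` are hypothetical, memory is assumed to have a single fixed latency, and hardware resource limits are ignored entirely.

```python
from dataclasses import dataclass, field


@dataclass
class TraceRecord:
    """One hypothetical trace entry (illustrative, not the real format)."""
    seq: int          # position of the record in program order
    kind: str         # "load", "store" or "comp"
    comp_delay: int   # core-side cycles between dependencies resolving and issue
    deps: list = field(default_factory=list)  # seq numbers this record waits on


def replay(trace, mem_latency):
    """Replay a dependency-annotated trace and return the finishing cycle.

    A record issues once all records it depends on have completed, plus its
    own comp_delay. Loads and stores then take mem_latency cycles (assumed
    constant here); compute records complete at issue time.
    """
    done = {}  # seq -> completion cycle
    for rec in trace:  # assumes program order, with deps pointing backwards
        ready = max((done[d] for d in rec.deps), default=0)
        issue = ready + rec.comp_delay
        latency = mem_latency if rec.kind in ("load", "store") else 0
        done[rec.seq] = issue + latency
    return max(done.values(), default=0)


# Two independent loads overlap; the later records form a dependent chain.
example_trace = [
    TraceRecord(0, "load", 0),
    TraceRecord(1, "load", 0),
    TraceRecord(2, "comp", 2, deps=[0, 1]),
    TraceRecord(3, "load", 1, deps=[2]),
    TraceRecord(4, "store", 0, deps=[3]),
]

if __name__ == "__main__":
    for latency in (10, 20):
        print(latency, replay(example_trace, latency))
```

Because issue times are derived from the dependencies, doubling `mem_latency` stretches the dependent chain while the two independent loads still overlap. A trace replayed from recorded timestamps would not react to the change in this way.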