SOLE: Speculative one-cycle load execution with scalability, high-performance and energy-efficiency

2012 
Conventional superscalar processors usually contain a large CAM-based LSQ (load/store queue) with poor scalability and high energy consumption. Recent proposals focus only on improving LSQ scalability to increase the in-flight instruction capacity, and deliver little improvement in performance or energy efficiency. This paper presents a novel speculative store-load forwarding mechanism named SOLE (speculative one-cycle load execution). First, SOLE uses address identifiers for memory disambiguation, rather than the exact memory addresses used by the traditional LSQ. Since the address identifier is just a simple hash of the address base and offset, speculative store-load forwarding can be performed earlier, which reduces load execution latency and saves energy by filtering out unnecessary accesses to the data cache. Second, SOLE enlarges the forwarding communication range by using the SSN (store sequential number) to determine the age order between stores, which further improves performance. Finally, SOLE is implemented entirely with set-associative structures, which avoids the scalability problem of the CAM-based LSQ. Experiments show that SOLE outperforms the traditional LSQ by 13.57% in performance, while consuming only 75.2% of the energy for executing loads and stores.
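The three ideas above (hashed address identifiers, SSN-based age ordering, and a set-associative table in place of a CAM) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the design described in the paper: the XOR hash, the table geometry (`NUM_SETS`, `WAYS`), the eviction policy, and the names `address_id`, `ForwardingTable`, and `ssn_limit` are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

NUM_SETS = 64  # assumed table geometry, not taken from the paper
WAYS = 4       # assumed associativity, not taken from the paper


def address_id(base: int, offset: int) -> int:
    """Hypothetical identifier: a cheap hash of base and offset.

    The abstract only says the identifier is a simple hash; XOR is an
    assumption. Two different addresses may share an identifier, which is
    presumably why forwarding on it is speculative.
    """
    return (base ^ offset) & 0xFFFF


@dataclass
class Entry:
    aid: int    # address identifier of the store
    ssn: int    # sequence number assigned to the store in program order
    value: int  # data written by the store


class ForwardingTable:
    """Set-associative store table indexed by address identifier."""

    def __init__(self) -> None:
        self.sets: List[List[Entry]] = [[] for _ in range(NUM_SETS)]
        self.next_ssn = 0

    def store(self, base: int, offset: int, value: int) -> int:
        """Record a store and return the SSN assigned to it."""
        aid = address_id(base, offset)
        ssn = self.next_ssn
        self.next_ssn += 1
        ways = self.sets[aid % NUM_SETS]
        if len(ways) == WAYS:
            # Assumed policy: evict the oldest store (smallest SSN) in the set.
            ways.remove(min(ways, key=lambda e: e.ssn))
        ways.append(Entry(aid, ssn, value))
        return ssn

    def load(self, base: int, offset: int, ssn_limit: int) -> Optional[int]:
        """Speculatively forward from the youngest matching older store.

        Only stores with ssn < ssn_limit count as older than the load.
        Returns None when no store matches, meaning the load would have
        to read the data cache instead.
        """
        aid = address_id(base, offset)
        older = [e for e in self.sets[aid % NUM_SETS]
                 if e.aid == aid and e.ssn < ssn_limit]
        if not older:
            return None
        return max(older, key=lambda e: e.ssn).value
```

In this model a load that finds a match never touches the data cache, which mirrors the filtering effect the abstract describes, and the lookup only ever compares against the few entries of one set rather than searching every in-flight store.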