Re-establishing Fetch-Directed Instruction Prefetching: An Industry Perspective

2021 
Instruction prefetching can play a pivotal role in improving the performance of workloads with large instruction footprints and frequent, costly frontend stalls. In particular, Fetch-Directed Prefetching (FDP) is an effective technique for mitigating frontend stalls, since it leverages a processor's existing branch prediction resources and incurs very little hardware overhead. Modern processors have been trending towards provisioning more frontend resources, which bodes well for FDP, as it requires these resources to be effective. However, recent academic research has used outdated, suboptimal frontend baselines that employ smaller structures, resulting in equivocal outcomes. This paper presents a detailed FDP microarchitecture and evaluates two improvements: better branch history management and post-fetch correction. Our mechanism provides a 41.0% speedup over the baseline (no prefetching, no FDP) with only 195 bytes of hardware overhead, and it outperforms the winners of the 1st Instruction Prefetching Championship (IPC-1), which had a 128KB storage budget. We believe that our FDP-based frontend design can serve as a new reference baseline for instruction prefetching research and help bridge the gap between academia and industry.