Compositional model of coherence and NUMA effects for optimizing thread and data placement

2016 
On today's multi-socket systems, parallel performance is hampered by remote cache and memory accesses. There is much prior work on thread and data placement to curb remote accesses. However, the number of possible placements is large, and heuristic-based techniques examine only a fraction of the solution space. This paper presents a compositional model for analyzing the effect of thread and data placement choices. The model covers both cache coherence and (remote) memory access. It is compositional in the sense that the performance of every placement can be composed from the results of a single profiling pass. Based on this model, the paper further introduces a prototype tool called Tapas that optimizes parallel programs for non-uniform memory access (NUMA) platforms.
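To illustrate what "compositional" could mean in practice, the following minimal sketch scores every thread placement from one fixed profile, without re-running the program. It is not the paper's model or the Tapas implementation: the `sharing` table, the `remote_penalty` weight, and both function names are hypothetical, and only a coherence-like term (threads that share data but sit on different sockets) is modeled.

```python
from itertools import product

def placement_cost(placement, sharing, remote_penalty=1.0):
    """Cost of one placement, composed from a single (hypothetical) profile.

    placement[t] is the socket of thread t; sharing maps a thread pair to
    the amount of data the pair shares, as measured once by profiling.
    """
    cost = 0.0
    for (a, b), weight in sharing.items():
        # Sharing across sockets is assumed to incur remote coherence traffic.
        if placement[a] != placement[b]:
            cost += remote_penalty * weight
    return cost

def best_placement(num_threads, num_sockets, cores_per_socket, sharing):
    """Exhaustively score all placements that respect socket capacity."""
    candidates = (
        p for p in product(range(num_sockets), repeat=num_threads)
        if all(p.count(s) <= cores_per_socket for s in range(num_sockets))
    )
    return min(candidates, key=lambda p: placement_cost(p, sharing))

# Hypothetical profile: cache lines shared between pairs of threads.
sharing = {(0, 1): 100, (2, 3): 80, (1, 2): 5}
print(best_placement(4, 2, 2, sharing))
```

Because each placement is evaluated arithmetically from the same profile, the whole space can be searched here; a heuristic would instead measure or estimate only a few candidates.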