The Effect of Topology-Aware Process and Thread Placement on Performance and Energy
2013
Design of modern multiprocessor computer systems has become increasingly complex and renders the performance of scientific parallel applications highly sensitive to process and thread scheduling. In particular, the Non-Uniform Memory Access (NUMA), a frequent architecture solution, demands knowledge of the hardware details as well as skills that are normally beyond the average user in order to minimise memory access penalties and achieve good application performance. This situation is further complicated by the increasing use of modern heterogeneous systems involving both CPUs and accelerators, where process proximity to the accelerator strongly determines performance.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
28
References
6
Citations
NaN
KQI