A Runtime and Non-Intrusive Approach to Optimize EDP by tuning Threads and CPU Frequency for OpenMP Applications

2020 
Efficiently exploiting thread-level parallelism has been challenging. Many parallel applications are not sufficiently balanced or CPU-bound to take advantage of the increasing number of cores and the highest possible operating frequency. Moreover, many variables may change according to the system (input set, microarchitecture, and number of cores) or during execution, influencing each parallel region in different ways. Therefore, choosing the ideal configuration (number of threads and dynamic voltage and frequency scaling, DVFS, level) for each parallel region to deliver the best Energy-Delay Product (EDP) is not straightforward. While the large number of variables prevents the use of exhaustive search methods, the changing nature of the problem precludes offline strategies. Few solutions are online and consider thread throttling and DVFS together, and those that do lack transparency (they demand changes to the original code) and/or adaptability (they do not automatically adjust to applications at run-time). Our proposed Hoder covers all the characteristics above, optimizing at run-time any dynamically linked OpenMP application without requiring any code transformation or recompilation. We show Hoder’s efficiency by comparing it to two exhaustive offline and two online search approaches, three state-of-the-art techniques, and regular OpenMP execution, considering different setups (Intel 44-, 16- and 12-core; AMD 8- and 12-core).
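To make the optimization target concrete, the sketch below computes EDP in its usual form (energy multiplied by execution time) for a few configurations of one parallel region and picks the lowest. It is only an illustration of the objective, not Hoder's search strategy, which the abstract does not describe. The function names and all measurement values are hypothetical.

```python
from typing import Dict, Tuple

# A configuration is (number of threads, CPU frequency in GHz).
Config = Tuple[int, float]


def edp(energy_joules: float, time_seconds: float) -> float:
    """Energy-Delay Product: energy consumed times execution time."""
    return energy_joules * time_seconds


def best_config(measurements: Dict[Config, Tuple[float, float]]) -> Config:
    """Return the configuration with the lowest EDP.

    `measurements` maps each configuration to (energy in J, time in s).
    """
    return min(measurements, key=lambda cfg: edp(*measurements[cfg]))


# Hypothetical measurements for a single parallel region (made-up values).
measurements = {
    (4, 2.0): (120.0, 10.0),
    (8, 2.0): (150.0, 6.0),
    (16, 2.0): (260.0, 5.0),
    (8, 1.5): (110.0, 7.5),
}

best = best_config(measurements)
print(best, edp(*measurements[best]))
```

With these made-up numbers, the fastest configuration (16 threads at 2.0 GHz) is not the one with the lowest EDP, which is the kind of trade-off the abstract refers to. Enumerating every configuration like this is also what becomes impractical as the number of cores, frequency levels, and parallel regions grows.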