Autotuning of Exascale Applications With Anomalies Detection

2021 
The execution of complex distributed applications on exascale systems faces many challenges, as it involves empirical evaluation of countless code variations and application run-time parameters over a heterogeneous set of resources. To overcome these challenges, the research field of autotuning has gained momentum. Autotuning automates the identification of the most desirable application implementation in terms of code variations and run-time parameters. However, the sheer complexity and size of exascale systems make the autotuning process very difficult, especially considering the number of parameter variations that have to be identified. Therefore, we introduce a novel approach for autotuning of exascale applications based on a genetic multi-objective optimisation algorithm, integrated within the ASPIDE exascale computing framework. The approach considers a multi-dimensional search space with support for pluggable objective functions, including execution time and energy requirements. Furthermore, the autotuner employs a machine-learning-based event detection approach, capable of detecting events and anomalies during application execution, such as hardware failures or communication bottlenecks.
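As an illustration of what pluggable objective functions in a multi-objective search might look like, the following minimal sketch implements only the non-dominated (Pareto) selection step that genetic multi-objective algorithms commonly rely on. It is not the ASPIDE autotuner: the abstract does not specify the algorithm's details, so the parameter names (threads, tile), both cost models, and all function names here are hypothetical assumptions chosen for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Config = Dict[str, int]                 # one point in the search space
Objective = Callable[[Config], float]   # pluggable objective, lower is better


def exec_time(cfg: Config) -> float:
    # Hypothetical cost model: more threads shorten the run, larger tiles add overhead.
    return 100.0 / cfg["threads"] + 0.5 * cfg["tile"]


def energy(cfg: Config) -> float:
    # Hypothetical cost model: power draw grows quadratically with the thread count.
    return exec_time(cfg) * (5.0 + cfg["threads"] ** 2)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(
    population: List[Config], objectives: List[Objective]
) -> List[Tuple[Config, Tuple[float, ...]]]:
    """Keep the configurations whose objective vectors no other configuration dominates."""
    scored = [(cfg, tuple(f(cfg) for f in objectives)) for cfg in population]
    return [(cfg, s) for cfg, s in scored if not any(dominates(t, s) for _, t in scored)]


# Exhaustive toy population; a genetic algorithm would instead evolve it
# through selection, crossover, and mutation over many generations.
population = [{"threads": t, "tile": k} for t in (1, 2, 4, 8) for k in (8, 16)]
front = pareto_front(population, [exec_time, energy])

for cfg, scores in front:
    print(cfg, scores)
```

Because the objectives are passed in as a plain list of callables, a further criterion could be added without changing the selection logic. Under these toy cost models, the surviving configurations trade execution time against energy rather than collapsing to a single best point.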