Towards Resilient Chapel: Design and implementation of a transparent resilience mechanism for Chapel

2015 
The exponential increase in the number of components in modern High Performance Computing (HPC) systems poses a challenge to their resilience: predictions of the time between failures on exascale systems range from hours to minutes, yet the prevalent HPC programming model today does not tolerate faults. In this paper, we describe the design and prototype implementation of transparent resilience support for Chapel [1], a parallel HPC language that follows the Partitioned Global Address Space (PGAS) [2] programming model and focuses on scalability, portability and productivity.