Case study of error recovery and error propagation on Ranger

2017 
We present two new dependability-oriented use cases on recovery attempts and error propagation on the Ranger supercomputer: (i) error propagation between Lustre file system I/O and InfiniBand, and (ii) recovery attempts and their impact on the chipset and memory system.