Message Leak Detection in Debugging Large-Scale Parallel Applications

2015 
Debugging large-scale parallel applications with long runtimes and a high frequency of errors has become very problematic. Traditional debugging techniques that locate errors exactly no longer seem appropriate for these applications, because storing trace files incurs high overhead and, above all, the techniques are difficult to scale efficiently. A loop-based technique for detecting unusual behavior is proposed as an effective solution to these problems: it identifies leaked messages in loops and thereby warns programmers about potential errors so that unexpected problems can be prevented. The proposed technique consists of three order rules suggested for implementation on high-performance computing systems.