Don't Panic! Better, Fewer, Syntax Errors for LR Parsers

2018 
Syntax errors are generally easy to fix for humans, but not for parsers, in general, and LR parsers, in particular. Traditional 'panic mode' error recovery, though easy to implement and applicable to any grammar, often leads to a cascading chain of errors that drown out the original. More advanced error recovery techniques suffer less from this problem but have seen little practical use because their typical performance was seen as poor, their worst case unbounded, and the repairs they reported arbitrary. In this paper we show two generic error recovery algorithms that fix all three problems. First, our algorithms are the first to report the complete set of possible repair sequences for a given location, allowing programmers to select the one that best fits their intention. Second, on a corpus of 200,000 real-world syntactically invalid Java programs, we show that our best performing algorithm is able to repair 98.71% of files within a cut-off of 0.5s. Furthermore, we are also able to use the complete set of repair sequences to reduce the cascading error problem even further than previous approaches. Our best performing algorithm reports 442,252.0 error locations in the corpus to the user, while the panic mode algorithm reports 980,848.0 error locations: in other words, our algorithms reduce the cascading error problem by well over half.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    30
    References
    0
    Citations
    NaN
    KQI
    []