From a5c333c3ed98d539fcadcc723e992f5295d22d5c Mon Sep 17 00:00:00 2001 From: Aleksey Kladov Date: Sat, 8 Sep 2018 19:10:20 +0300 Subject: Fix yet another parser infinite loop This commit is an example of fixing a common parser error: infinite loop due to error recovery. This error typically happens when we parse a list of items and fail to parse a specific item at the current position. One choices is to skip a token and try to parse a list item at the next position. This is a good, but not universal, default. When parsing a list of arguments in a function call, you, for example, don't want to skip over `fn`, because it's most likely that it is a function declaration, and not a mistyped arg: ``` fn foo() { quux(1, 2 fn bar() { } ``` Another choice is to bail out of the loop immediately, but it isn't perfect either: sometimes skipping over garbage helps: ``` quux(1, foo:, 92) // should skip over `:`, b/c that's part of `foo::bar` ``` In general, parser tries to balance these two cases, though we don't have a definitive strategy yet. However, if the parser accidentally neither skips over a token, nor breaks out of the loop, then it becomes stuck in the loop infinitely (there's an internal counter to self-check this situation and panic though), and that's exactly what is demonstrated by the test. To fix such situation, first of all, add the test case to tests/data/parser/{err,fuzz-failures}. Then, run ``` RUST_BACKTRACE=short cargo test --package libsyntax2 ```` to verify that parser indeed panics, and to get an idea what grammar production is the culprit (look for `_list` functions!). In this case, I see ``` 10: libsyntax2::grammar::expressions::atom::match_arm_list at crates/libsyntax2/src/grammar/expressions/atom.rs:309 ``` and that's look like it might be a culprit. I verify it by adding `eprintln!("loopy {:?}", p.current());` and indeed I see that this is printed repeatedly. Diagnosing this a bit shows that the problem is that `pattern::pattern` function does not consume anything if the next token is `let`. That is a good default to make cases like ``` let let foo = 92; ``` where the user hasn't typed the pattern yet, to parse in a reasonable they correctly. For match arms, pretty much the single thing we expect is a pattern, so, for a fix, I introduce a special variant of pattern that does not do recovery. --- crates/libsyntax2/src/grammar/expressions/atom.rs | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) (limited to 'crates/libsyntax2/src/grammar/expressions/atom.rs') diff --git a/crates/libsyntax2/src/grammar/expressions/atom.rs b/crates/libsyntax2/src/grammar/expressions/atom.rs index fdb4718ba..8335c700f 100644 --- a/crates/libsyntax2/src/grammar/expressions/atom.rs +++ b/crates/libsyntax2/src/grammar/expressions/atom.rs @@ -323,11 +323,9 @@ fn match_arm_list(p: &mut Parser) { // } fn match_arm(p: &mut Parser) -> BlockLike { let m = p.start(); - loop { + patterns::pattern_r(p, TokenSet::EMPTY); + while p.eat(PIPE) { patterns::pattern(p); - if !p.eat(PIPE) { - break; - } } if p.eat(IF_KW) { expr_no_struct(p); -- cgit v1.2.3