Add ExpressionContext for expression parsing (#11055)

## Summary

This PR adds a new `ExpressionContext` struct which is used in
expression parsing.

This solves the following problem:
1. Allowing starred expression with different precedence
2. Allowing yield expression in certain context
3. Remove ambiguity with `in` keyword when parsing a `for ... in`
statement

For context, (1) was solved by adding `parse_star_expression_list` and
`parse_star_expression_or_higher` in #10623, (2) was solved by by adding
`parse_yield_expression_or_else` in #10809, and (3) was fixed in #11009.
All of the mentioned functions have been removed in favor of the context
flags.

As mentioned in #11009, an ideal solution would be to implement an
expression context which is what this PR implements. This is passed
around as function parameter and the call stack is used to automatically
reset the context.

### Recovery

How should the parser recover if the target expression is invalid when
an expression can consume the `in` keyword?

1. Should the `in` keyword be part of the target expression?
2. Or, should the expression parsing stop as soon as `in` keyword is
encountered, no matter the expression?

For example:
```python
for yield x in y: ...

# Here, should this be parsed as
for (yield x) in (y): ...
# Or
for (yield x in y): ...
# where the `in iter` part is missing
```

Or, for binary expression parsing:
```python
for x or y in z: ...

# Should this be parsed as
for (x or y) in z: ...
# Or
for (x or y in z): ...
# where the `in iter` part is missing
```

This need not be solved now, but is very easy to change. For context
this PR does the following:
* For binary, comparison, and unary expressions, stop at `in`
* For lambda, yield expressions, consume the `in`

## Test Plan

1. Add test cases for the `for ... in` statement and verify the
snapshots
2. Make sure the existing test suite pass
3. Run the fuzzer for around 3000 generated source code
4. Run the updated logic on a dozen or so open source repositories
(codename "parser-checkouts")
This commit is contained in:
Dhruv Manilawala 2024-04-23 09:49:05 +05:30 committed by GitHub
parent 62478c3070
commit c30735d4a7
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
22 changed files with 1151 additions and 869 deletions

View file

@ -6,6 +6,8 @@ use crate::parser::{recovery, Parser, RecoveryContextKind, SequenceMatchPatternP
use crate::token_set::TokenSet;
use crate::{ParseErrorType, Tok, TokenKind};
use super::expression::ExpressionContext;
/// The set of tokens that can start a literal pattern.
const LITERAL_PATTERN_START_SET: TokenSet = TokenSet::new([
TokenKind::None,
@ -483,7 +485,7 @@ impl<'src> Parser<'src> {
TokenKind::Int | TokenKind::Float | TokenKind::Complex
) =>
{
let unary_expr = self.parse_unary_expression();
let unary_expr = self.parse_unary_expression(ExpressionContext::default());
if unary_expr.op.is_u_add() {
self.add_error(