ruff/crates/ruff_python_parser
Dhruv Manilawala a4688aebe9
Use TokenSource to find new location for re-lexing (#12060)
## Summary

This PR splits the re-lexing logic into two parts:
1. `TokenSource`: The token source will be responsible to find the
position the lexer needs to be moved to
2. `Lexer`: The lexer will be responsible to reduce the nesting level
and move itself to the new position if recovered from a parenthesized
context

This split makes it easy to find the new lexer position without needing
to implement the backwards lexing logic again which would need to handle
cases involving:
* Different kinds of newlines
* Line continuation character(s)
* Comments
* Whitespaces

### F-strings

This change did reveal one thing about re-lexing f-strings. Consider the
following example:
```py
f'{'
#  ^
f'foo'
```

Here, the quote as highlighted by the caret (`^`) is the start of a
string inside an f-string expression. This is unterminated string which
means the token emitted is actually `Unknown`. The parser tries to
recover from it but there's no newline token in the vector so the new
logic doesn't recover from it. The previous logic does recover because
it's looking at the raw characters instead.

The parser would be at `FStringStart` (the one for the second line) when
it calls into the re-lexing logic to recover from an unterminated
f-string on the first line. So, moving backwards the first character
encountered is a newline character but the first token encountered is an
`Unknown` token.

This is improved with #12067 

fixes: #12046 
fixes: #12036

## Test Plan

Update the snapshot and validate the changes.
2024-06-27 17:12:39 +05:30
..
resources Consider 2-character EOL before line continuation (#12035) 2024-06-26 14:00:48 +05:30
src Use TokenSource to find new location for re-lexing (#12060) 2024-06-27 17:12:39 +05:30
tests Use TokenSource to find new location for re-lexing (#12060) 2024-06-27 17:12:39 +05:30
Cargo.toml Remove less used parser dependencies (#11718) 2024-06-03 13:08:24 +00:00
CONTRIBUTING.md Add basic docs for the parser crate (#11199) 2024-04-29 17:08:07 +00:00
README.md Add basic docs for the parser crate (#11199) 2024-04-29 17:08:07 +00:00

Ruff Python Parser

Ruff's Python parser is a hand-written recursive descent parser which can parse Python source code into an Abstract Syntax Tree (AST). It also utilizes the Pratt parsing technique to parse expressions with different precedence.

Try out the parser in the playground.

Python version support

The parser supports the latest Python syntax, which is currently Python 3.12. It does not throw syntax errors if it encounters a syntax feature that is not supported by the target-version. This will be fixed in a future release (see https://github.com/astral-sh/ruff/issues/6591).

Contributing

Refer to the contributing guidelines to get started and GitHub issues with the parser label for issues that need help.