ruff/crates/ruff_python_parser
Dhruv Manilawala cdc7c71449
Avoid consuming trailing whitespace during re-lexing (#11933)
## Summary

This PR updates the re-lexing logic to avoid consuming the trailing
whitespace and move the lexer explicitly to the last newline character
encountered while moving backwards.

Consider the following code snippet as taken from the test case
highlighted with whitespace (`.`) and newline (`\n`) characters:
```py
# There are trailing whitespace before the newline character but those whitespaces are
# part of the comment token
f"""hello {x # comment....\n
#                     ^
y = 1\n
```

The parser is at `y` when it's trying to recover from an unclosed `{`,
so it calls into the re-lexing logic which tries to move the lexer back
to the end of the previous line. But, as it consumed all whitespaces it
moved the lexer to the location marked by `^` in the above code snippet.
But, those whitespaces are part of the comment token. This means that
the range for the two tokens were overlapping which introduced the
panic.

Note that this is only a bug when there's a comment with a trailing
whitespace otherwise it's fine to move the lexer to the whitespace
character. This is because the lexer would just skip the whitespace
otherwise. Nevertheless, this PR updates the logic to move it explicitly
to the newline character in all cases.

fixes: #11929 

## Test Plan

Add test cases and update the snapshot. Make sure that it doesn't panic
on the code snippet in the linked issue.
2024-06-19 12:14:18 +05:30
..
resources Avoid consuming trailing whitespace during re-lexing (#11933) 2024-06-19 12:14:18 +05:30
src Avoid consuming trailing whitespace during re-lexing (#11933) 2024-06-19 12:14:18 +05:30
tests Avoid consuming trailing whitespace during re-lexing (#11933) 2024-06-19 12:14:18 +05:30
Cargo.toml Remove less used parser dependencies (#11718) 2024-06-03 13:08:24 +00:00
CONTRIBUTING.md Add basic docs for the parser crate (#11199) 2024-04-29 17:08:07 +00:00
README.md Add basic docs for the parser crate (#11199) 2024-04-29 17:08:07 +00:00

Ruff Python Parser

Ruff's Python parser is a hand-written recursive descent parser which can parse Python source code into an Abstract Syntax Tree (AST). It also utilizes the Pratt parsing technique to parse expressions with different precedence.

Try out the parser in the playground.

Python version support

The parser supports the latest Python syntax, which is currently Python 3.12. It does not throw syntax errors if it encounters a syntax feature that is not supported by the target-version. This will be fixed in a future release (see https://github.com/astral-sh/ruff/issues/6591).

Contributing

Refer to the contributing guidelines to get started and GitHub issues with the parser label for issues that need help.