Summary
--
Detects starred assignment targets outside of tuples and lists like `*a
= (1,)`.
This PR only considers assignment statements. I also checked annotated
assigment statements, but these give a separate error that we already
catch, so I think they're okay not to consider:
```pycon
>>> *a: list[int] = []
File "<python-input-72>", line 1
*a: list[int] = []
^
SyntaxError: invalid syntax
```
Fixes#13759
Test Plan
--
New inline tests, plus a new `SemanticSyntaxError` for an existing
parser test. I also removed a now-invalid case from an otherwise-valid
test fixture.
The new semantic error leads to two errors for the case below:
```python
*foo() = 42
```
but this matches [pyright] too.
[pyright]: https://pyright-play.net/?code=FQMw9mAUCUAEC8sAsAmAUEA
Summary
--
Detect setting or deleting `__debug__`. Assigning to `__debug__` was a
`SyntaxError` on the earliest version I tested (3.8). Deleting
`__debug__` was made a `SyntaxError` in [BPO 45000], which said it was
resolved in Python 3.10. However, `del __debug__` was also a runtime
error (`NameError`) when I tested in Python 3.9.6, so I thought it was
worth including 3.9 in this check.
I don't think it was ever a *good* idea to try `del __debug__`, so I
think there's also an argument for not making this version-dependent at
all. That would only simplify the implementation very slightly, though.
[BPO 45000]: https://github.com/python/cpython/issues/89163
Test Plan
--
New inline tests. This also required adding a `PythonVersion` field to
the `TestContext` that could be taken from the inline `ParseOptions` and
making the version field on the options accessible.
Summary
--
This PR detects multiple assignments to the same name in `case` patterns
by recursively visiting each pattern.
Test Plan
--
New inline tests.
Summary
--
Detects irrefutable `match` cases before the final case using a modified
version
of the existing `Pattern::is_irrefutable` method from the AST crate. The
modified method helps to retrieve a more precise diagnostic range to
match what
Python 3.13 shows in the REPL.
Test Plan
--
New inline tests, as well as some updates to existing tests that had
irrefutable
patterns before the last block.
Summary
--
Fixes#16943 by checking if the tuple is not parenthesized before
emitting an error.
Test Plan
--
New inline test based on the initial report
Summary
--
Detects duplicate type parameter names in function definitions, class
definitions, and type alias statements.
I also boxed the `type_params` field on `StmtTypeAlias` to make it
easier to
`match` with functions and classes. (That's the reason for the red-knot
code
owner review requests, sorry!)
Test Plan
--
New `ruff_python_syntax_errors` unit tests.
Fixes#11119.
## Summary
This PR implements the "greeter" approach for checking the AST for
syntax errors emitted by the CPython compiler. It introduces two main
infrastructural changes to support all of the compile-time errors:
1. Adds a new `semantic_errors` module to the parser crate with public
`SemanticSyntaxChecker` and `SemanticSyntaxError` types
2. Embeds a `SemanticSyntaxChecker` in the `ruff_linter::Checker` for
checking these errors in ruff
As a proof of concept, it also implements detection of two syntax
errors:
1. A reimplementation of
[`late-future-import`](https://docs.astral.sh/ruff/rules/late-future-import/)
(`F404`)
2. Detection of rebound comprehension iteration variables
(https://github.com/astral-sh/ruff/issues/14395)
## Test plan
Existing F404 tests, new inline tests in the `ruff_python_parser` crate,
and a linter CLI test showing an example of the `Message` output.
I also tested in VS Code, where `preview = false` and turning off syntax
errors both disable the new errors:

And on the playground, where `preview = false` also disables the errors:

Fixes#14395
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This change continues to resolve#16071 (and continues the work started
in #16162). Specifically, this PR changes the code in the parser so that
it uses the `OperatorPrecedence` struct from `ruff_python_ast` instead
of its own version. This is part of an effort to get rid of the
redundant definitions of `OperatorPrecedence` throughout the codebase.
Note that this PR only makes this change for `ruff_python_parser` -- we
still want to make a similar change for the formatter (namely the
`OperatorPrecedence` defined in the expression part of the formatter,
the pattern one is different). I separated the work to keep the PRs
small and easily reviewable.
## Test Plan
Because this is an internal change, I didn't add any additional tests.
Existing tests do pass.
Summary
--
Fixes#16874. I previously emitted a syntax error when starred
annotations were _allowed_ rather than when they were actually used.
This caused false positives for any starred parameter name because these
are allowed to have starred annotations but not required to. The fix is
to check if the annotation is actually starred after parsing it.
Test Plan
--
New inline parser tests derived from the initial report and more
examples from the comments, although I think the first case should cover
them all.
## Summary
This PR detects the use of PEP 701 f-strings before 3.12. This one
sounded difficult and ended up being pretty easy, so I think there's a
good chance I've over-simplified things. However, from experimenting in
the Python REPL and checking with [pyright], I think this is correct.
pyright actually doesn't even flag the comment case, but Python does.
I also checked pyright's implementation for
[quotes](98dc4469cc/packages/pyright-internal/src/analyzer/checker.ts (L1379-L1398))
and
[escapes](98dc4469cc/packages/pyright-internal/src/analyzer/checker.ts (L1365-L1377))
and think I've approximated how they do it.
Python's error messages also point to the simple approach of these
characters simply not being allowed:
```pycon
Python 3.11.11 (main, Feb 12 2025, 14:51:05) [Clang 19.1.6 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> f'''multiline {
... expression # comment
... }'''
File "<stdin>", line 3
}'''
^
SyntaxError: f-string expression part cannot include '#'
>>> f'''{not a line \
... continuation}'''
File "<stdin>", line 2
continuation}'''
^
SyntaxError: f-string expression part cannot include a backslash
>>> f'hello {'world'}'
File "<stdin>", line 1
f'hello {'world'}'
^^^^^
SyntaxError: f-string: expecting '}'
```
And since escapes aren't allowed, I don't think there are any tricky
cases where nested quotes or comments can sneak in.
It's also slightly annoying that the error is repeated for every nested
quote character, but that also mirrors pyright, although they highlight
the whole nested string, which is a little nicer. However, their check
is in the analysis phase, so I don't think we have such easy access to
the quoted range, at least without adding another mini visitor.
## Test Plan
New inline tests
[pyright]:
https://pyright-play.net/?pythonVersion=3.11&strict=true&code=EYQw5gBAvBAmCWBjALgCgO4gHaygRgEoAoEaCAIgBpyiiBiCLAUwGdknYIBHAVwHt2LIgDMA5AFlwSCJhwAuCAG8IoMAG1Rs2KIC6EAL6iIxosbPmLlq5foRWiEAAcmERAAsQAJxAomnltY2wuSKogA6WKIAdABWfPBYqCAE%2BuSBVqbpWVm2iHwAtvlMWMgB2ekiolUAgq4FjgA2TAAeEMieSADWCsoV5qoaqrrGDJ5MiDz%2B8ABuLqosAIREhlXlaybrmyYMXsDw7V4AnoysyAmQ5SIhwYo3d9cheADUeKlv5O%2BpQA
## Summary
A small followup to https://github.com/astral-sh/ruff/pull/16386. We now
tell the user exactly what it was about their decorator that constituted
invalid syntax on Python <3.9, and the range now highlights the specific
sub-expression that is invalid rather than highlighting the whole
decorator
## Test Plan
Inline snapshots are updated, and new ones are added.
## Summary
This PR detects unparenthesized assignment expressions used in set
literals and comprehensions and in sequence indexes. The link to the
release notes in https://github.com/astral-sh/ruff/issues/6591 just has
this entry:
> * Assignment expressions can now be used unparenthesized within set
literals and set comprehensions, as well as in sequence indexes (but not
slices).
with no other information, so hopefully the test cases I came up with
cover all of the changes. I also tested these out in the Python REPL and
they actually worked in Python 3.9 too. I'm guessing this may be another
case that was "formally made part of the language spec in Python 3.10,
but usable -- and commonly used -- in Python >=3.9" as @AlexWaygood
added to the body of #6591 for context managers. So we may want to
change the version cutoff, but I've gone along with the release notes
for now.
## Test Plan
New inline parser tests and linter CLI tests.
Summary
--
This is closely related to (and stacked on)
https://github.com/astral-sh/ruff/pull/16544 and detects star
annotations in function definitions.
I initially called the variant `StarExpressionInAnnotation` to mirror
`StarExpressionInIndex`, but I realized it's not really a "star
expression" in this position and renamed it. `StarAnnotation` seems in
line with the PEP.
Test Plan
--
Two new inline tests. It looked like there was pretty good existing
coverage of this syntax, so I just added simple examples to test the
version cutoff.
Summary
--
This PR reuses a slightly modified version of the
`check_tuple_unpacking` method added for detecting unpacking in `return`
and `yield` statements to detect the same issue in the iterator clause
of `for` loops.
I ran into the same issue with a bare `for x in *rest: ...` example
(invalid even on Python 3.13) and added it as a comment on
https://github.com/astral-sh/ruff/issues/16520.
I considered just making this an additional `StarTupleKind` variant as
well, but this change was in a different version of Python, so I kept it
separate.
Test Plan
--
New inline tests.
Summary
--
Unlike the other syntax errors detected so far, parenthesized keyword
arguments are only allowed *before* 3.8. It sounds like they were only
accidentally allowed before that [^1].
As an aside, you get a pretty confusing error from Python for this, so
it's nice that we can catch it:
```pycon
>>> def f(**kwargs): ...
... f((a)=1)
...
File "<python-input-0>", line 2
f((a)=1)
^^^
SyntaxError: expression cannot contain assignment, perhaps you meant "=="?
>>>
```
Test Plan
--
Inline tests.
[^1]: https://github.com/python/cpython/issues/78822
Summary
--
Checks for tuple unpacking in `return` and `yield` statements before
Python 3.8, as described [here].
Test Plan
--
Inline tests.
[here]: https://github.com/python/cpython/issues/76298
Summary
--
Another simple one, just detect type parameter lists in functions
and classes. Like pyright, we don't emit a second diagnostic for
`type` alias statements, which were also introduced in 3.12.
Test Plan
--
Inline tests.
Summary
--
Detects the presence of a [PEP 696] type parameter default before Python
3.13.
Test Plan
--
New inline parser tests for type aliases, generic functions and generic
classes.
[PEP 696]: https://peps.python.org/pep-0696/#grammar-changes
Summary
--
This is a follow-up to #16446 to fix the diagnostic range to point to
the `*` like `pyright` does
(https://github.com/astral-sh/ruff/pull/16446#discussion_r1976900643).
Storing the range in the `ExceptClauseKind::Star` variant feels slightly
awkward, but we don't store the star itself anywhere on the
`ExceptHandler`. And we can't just take `ExceptHandler.start() +
"except".text_len()` because this code appears to be valid:
```python
try: ...
except * Error: ...
```
Test Plan
--
Existing tests.
Summary
--
This is a follow up addressing the comments on #16425. As @dhruvmanila
pointed out, the naming is a bit tricky. I went with `has_no_errors` to
try to differentiate it from `is_valid`. It actually ends up negated in
most uses, so it would be more convenient to have `has_any_errors` or
`has_errors`, but I thought it would sound too much like the opposite of
`is_valid` in that case. I'm definitely open to suggestions here.
Test Plan
--
Existing tests.
## Summary
This PR is the first in a series derived from
https://github.com/astral-sh/ruff/pull/16308, each of which add support
for detecting one version-related syntax error from
https://github.com/astral-sh/ruff/issues/6591. This one should be
the largest because it also includes the addition of the
`Parser::add_unsupported_syntax_error` method
Otherwise I think the general structure will be the same for each syntax
error:
* Detecting the error in the parser
* Inline parser tests for the new error
* New ruff CLI tests for the new error
## Test Plan
As noted above, there are new inline parser tests, as well as new ruff
CLI
tests. Once https://github.com/astral-sh/ruff/pull/16379 is resolved,
there should also be new mdtests for red-knot,
but this PR does not currently include those.
## Summary
This PR adds support for a pragma-style header for inline parser tests
containing JSON-serialized `ParseOptions`. For example,
```python
# parse_options: { "target-version": "3.9" }
match 2:
case 1:
pass
```
The line must start with `# parse_options: ` and then the rest of the
(trimmed) line is deserialized into `ParseOptions` used for parsing the
the test.
## Test Plan
Existing inline tests, plus two new inline tests for
`match-before-py310`.
---------
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
## Summary
This is part of the preparation for detecting syntax errors in the
parser from https://github.com/astral-sh/ruff/pull/16090/. As suggested
in [this
comment](https://github.com/astral-sh/ruff/pull/16090/#discussion_r1953084509),
I started working on a `ParseOptions` struct that could be stored in the
parser. For this initial refactor, I only made it hold the existing
`Mode` option, but for syntax errors, we will also need it to have a
`PythonVersion`. For that use case, I'm picturing something like a
`ParseOptions::with_python_version` method, so you can extend the
current calls to something like
```rust
ParseOptions::from(mode).with_python_version(settings.target_version)
```
But I thought it was worth adding `ParseOptions` alone without changing
any other behavior first.
Most of the diff is just updating call sites taking `Mode` to take
`ParseOptions::from(Mode)` or those taking `PySourceType`s to take
`ParseOptions::from(PySourceType)`. The interesting changes are in the
new `parser/options.rs` file and smaller parts of `parser/mod.rs` and
`ruff_python_parser/src/lib.rs`.
## Test Plan
Existing tests, this should not change any behavior.
This update includes some missing `^` in the diagnostic annotations.
This update also includes some shifting of "syntax error" annotations to
the end of the preceding line. I believe this is technically a
regression, but fixing them has proven quite difficult. I *think* the
best way to do that might be to tweak the spans generated by the Python
parser errors, but I didn't want to dig into that. (Another approach
would be to change the `annotate-snippets` rendering, but when I tried
that and managed to fix these regressions, I ended up causing a bunch of
other regressions.)
Ref 77d454525e (r1915458616)
We do this because `...` is valid Python, which makes it pretty likely
that some line trimming will lead to ambiguous output. So we add support
for overriding the cut indicator. This also requires changing some of
the alignment math, which was previously tightly coupled to `...`.
For Ruff, we go with `…` (`U+2026 HORIZONTAL ELLIPSIS`) for our cut
indicator.
For more details, see the patch sent to upstream:
https://github.com/rust-lang/annotate-snippets-rs/pull/172
This looks like a bug fix that occurs when the annotation is a
zero-width span immediately following a line terminator. Previously, the
caret seems to be rendered on the next line, but it should be rendered
at the end of the line the span corresponds to.
I admit that this one is kinda weird. I would somewhat expect that our
spans here are actually incorrect, and that to obtain this sort of
rendering, we should identify a span just immediately _before_ the line
terminator and not after it. But I don't want to dive into that rabbit
hole for now (and given how `annotate-snippets` now renders these
spans, perhaps there is more to it than I see), and this does seem like
a clear improvement given the spans we feed to `annotate-snippets`.
The previous rendering just seems wrong in that a `^` is omitted. The
new version of `annotate-snippets` seems to get this right. I checked a
pseudo random sample of these, and it seems to only happen when the
position pointed at a line terminator.
These updates center around the addition of annotations in the
diagnostic rendering. Previously, the annotation was just not rendered
at all. With the `annotate-snippets` upgrade, it is now rendered. I
examined a pseudo random sample of these, and they all look correct.
As will be true in future batches, some of these snapshots also have
changes to whitespace in them as well.
These snapshot changes should *all* only be a result of changes to
trailing whitespace in the output. I checked a psuedo random sample of
these, and the whitespace found in the previous snapshots seems to be an
artifact of the rendering and _not_ of the source data. So this seems
like a strict bug fix to me.
There are other snapshots with whitespace changes, but they also have
other changes that we split out into separate commits. Basically, we're
going to do approximately one commit per category of change.
This represents, by far, the biggest chunk of changes to snapshots as a
result of the `annotate-snippets` upgrade.
This is pretty much just moving to the new API and taking care to use
byte offsets. This is *almost* enough. The next commit will fix a bug
involving the handling of unprintable characters as a result of
switching to byte offsets.
When confronted with `raise from exc` the parser will now create a
`StmtRaise` that has `None` for the exception and `exc` for the cause.
Before, the parser created a `StmtRaise` with `from` for the exception,
no cause, and a spurious expression `exc` afterwards.
This PR adds a syntax error if the parser encounters a `TryStmt` that
has except clauses both with and without a star.
The displayed error points to each except clause that contradicts the
original except clause kind. So, for example,
```python
try:
....
except: #<-- we assume this is the desired except kind
....
except*: #<--- error will point here
....
except*: #<--- and here
....
```
Closes#14860
## Summary
This PR fixes a bug to raise a syntax error when an unparenthesized
generator expression is used as an argument to a call when there are
more than one argument.
For reference, the grammar is:
```
primary:
| ...
| primary genexp
| primary '(' [arguments] ')'
| ...
genexp:
| '(' ( assignment_expression | expression !':=') for_if_clauses ')'
```
The `genexp` requires the parenthesis as mentioned in the grammar. So,
the grammar for a call expression is either a name followed by a
generator expression or a name followed by a list of argument. In the
former case, the parenthesis are excluded because the generator
expression provides them while in the later case, the parenthesis are
explicitly provided for a list of arguments which means that the
generator expression requires it's own parenthesis.
This was discovered in https://github.com/astral-sh/ruff/issues/12420.
## Test Plan
Add test cases for valid and invalid syntax.
Make sure that the parser from CPython also raises this at the parsing
step:
```console
$ python3.13 -m ast parser/_.py
File "parser/_.py", line 1
total(1, 2, x for x in range(5), 6)
^^^^^^^^^^^^^^^^^^^
SyntaxError: Generator expression must be parenthesized
$ python3.13 -m ast parser/_.py
File "parser/_.py", line 1
sum(x for x in range(10), 10)
^^^^^^^^^^^^^^^^^^^^
SyntaxError: Generator expression must be parenthesized
```
## Summary
This PR splits the re-lexing logic into two parts:
1. `TokenSource`: The token source will be responsible to find the
position the lexer needs to be moved to
2. `Lexer`: The lexer will be responsible to reduce the nesting level
and move itself to the new position if recovered from a parenthesized
context
This split makes it easy to find the new lexer position without needing
to implement the backwards lexing logic again which would need to handle
cases involving:
* Different kinds of newlines
* Line continuation character(s)
* Comments
* Whitespaces
### F-strings
This change did reveal one thing about re-lexing f-strings. Consider the
following example:
```py
f'{'
# ^
f'foo'
```
Here, the quote as highlighted by the caret (`^`) is the start of a
string inside an f-string expression. This is unterminated string which
means the token emitted is actually `Unknown`. The parser tries to
recover from it but there's no newline token in the vector so the new
logic doesn't recover from it. The previous logic does recover because
it's looking at the raw characters instead.
The parser would be at `FStringStart` (the one for the second line) when
it calls into the re-lexing logic to recover from an unterminated
f-string on the first line. So, moving backwards the first character
encountered is a newline character but the first token encountered is an
`Unknown` token.
This is improved with #12067fixes: #12046fixes: #12036
## Test Plan
Update the snapshot and validate the changes.
## Summary
This PR fixes the lexer logic to **not** consume the newline character
for an unterminated string literal.
Currently, the lexer would consume it to be part of the string itself
but that would be bad for recovery because then the lexer wouldn't emit
the newline token ever. This PR fixes that to avoid consuming the
newline character in that case.
This was discovered during https://github.com/astral-sh/ruff/pull/12060.
## Test Plan
Update the snapshots and validate them.
## Summary
This PR fixes a bug introduced in
https://github.com/astral-sh/ruff/pull/12008 which didn't consider the
two character newline after the line continuation character.
For example, consider the following code highlighted with whitespaces:
```py
call(foo # comment \\r\n
\r\n
def bar():\r\n
....pass\r\n
```
The lexer is at `def` when it's running the re-lexing logic and trying
to move back to a newline character. It encounters `\n` and it's being
escaped (incorrect) but `\r` is being escaped, so it moves the lexer to
`\n` character. This creates an overlap in token ranges which causes the
panic.
```
Name 0..4
Lpar 4..5
Name 5..8
Comment 9..20
NonLogicalNewline 20..22 <-- overlap between
Newline 21..22 <-- these two tokens
NonLogicalNewline 22..23
Def 23..26
...
```
fixes: #12028
## Test Plan
Add a test case with line continuation and windows style newline
character.
## Summary
(I'm pretty sure I added this in the parser re-write but must've got
lost in the rebase?)
This PR raises a syntax error if the type parameter list is empty.
As per the grammar, there should be at least one type parameter:
```
type_params:
| invalid_type_params
| '[' type_param_seq ']'
type_param_seq: ','.type_param+ [',']
```
Verified via the builtin `ast` module as well:
```console
$ python3.13 -m ast parser/_.py
Traceback (most recent call last):
[..]
File "parser/_.py", line 1
def foo[]():
^
SyntaxError: Type parameter list cannot be empty
```
## Test Plan
Add inline test cases and update the snapshots.
## Summary
This PR updates the parser test infrastructure to validate the token
ranges.
From the code documentation:
```
/// Verifies that:
/// * the ranges are strictly increasing when loop the tokens in insertion order
/// * all ranges are within the length of the source code
```
Follow-up from #12016 and #12017resolves: #11938
## Test Plan
Make sure that there are no failures.
## Summary
This PR updates the unterminated string error range to not include the
final newline character.
This is a follow-up to #12016 and required for #12019
This is not done for when the unterminated string goes till the end of
file (not a newline character). The unterminated f-string range is
correct.
### Why is this required for #12019 ?
Because otherwise the token ranges will overlap. For example:
```py
f"{"
f"{foo!r"
```
Here, the re-lexing logic recovers from an unterminated f-string and
thus emitting a `Newline` token for the one at the end of the first
line. But, currently the `Unknown` and the `Newline` token would overlap
because the `Unknown` token (unterminated string literal) range would
include the newline character.
## Test Plan
Update and validate the snapshot.
## Summary
This PR fixes the range highlighted for the line continuation error.
Previously, it would highlight an incorrect range:
```
1 | call(a, b, \\\
| ^^ Syntax Error: unexpected character after line continuation character
2 |
3 | def bar():
|
```
And now:
```
|
1 | call(a, b, \\\
| ^ Syntax Error: unexpected character after line continuation character
2 |
3 | def bar():
|
```
This is implemented by avoiding to update the token range for the
`Unknown` token which is emitted when there's a lexical error. Instead,
the `push_error` helper method will be responsible to update the range
to the error location.
This actually becomes a requirement which can be seen in follow-up PRs.
## Test Plan
Update and validate the snapshot.
## Summary
This PR fixes a bug where the re-lexing logic didn't consider the line
continuation character being present before the newline character. This
meant that the lexer was being moved back to the newline character which
is actually ignored via `\`.
Considering the following code:
```py
f'middle {'string':\
'format spec'}
```
The old token stream is:
```
...
Colon 18..19
FStringMiddle 19..29 (flags = F_STRING)
Newline 20..21
Indent 21..29
String 29..42
Rbrace 42..43
...
```
Notice how the ranges are overlapping between the `FStringMiddle` token
and the tokens emitted after moving the lexer backwards.
After this fix, the new token stream which is without moving the lexer
backwards in this scenario:
```
FStringStart 0..2 (flags = F_STRING)
FStringMiddle 2..9 (flags = F_STRING)
Lbrace 9..10
String 10..18
Colon 18..19
FStringMiddle 19..29 (flags = F_STRING)
FStringEnd 29..30 (flags = F_STRING)
Name 30..36
Name 37..41
Unknown 41..44
Newline 44..45
```
fixes: #12004
## Test Plan
Add test cases and update the snapshots.