mirrors/ruff - Forgejo: Beyond coding. We Forge.

mirror of https://github.com/astral-sh/ruff.git synced 2025-07-12 23:55:08 +00:00

Author	SHA1	Message	Date
Dylan	9bbf4987e8	Implement template strings (#17851 ) This PR implements template strings (t-strings) in the parser and formatter for Ruff. Minimal changes necessary to compile were made in other parts of the code (e.g. ty, the linter, etc.). These will be covered properly in follow-up PRs.	2025-05-30 15:00:56 -05:00
Max Mynter	02fd48132c	[ty] Don't warn `yield` not in function when `yield` is in function (#18008 ) Some checks are pending CI / Determine changes (push) Waiting to run Details CI / cargo fmt (push) Waiting to run Details CI / cargo clippy (push) Blocked by required conditions Details CI / cargo test (linux) (push) Blocked by required conditions Details CI / cargo test (linux, release) (push) Blocked by required conditions Details CI / cargo test (windows) (push) Blocked by required conditions Details CI / cargo test (wasm) (push) Blocked by required conditions Details CI / cargo build (release) (push) Waiting to run Details CI / cargo build (msrv) (push) Blocked by required conditions Details CI / cargo fuzz build (push) Blocked by required conditions Details CI / fuzz parser (push) Blocked by required conditions Details CI / test scripts (push) Blocked by required conditions Details CI / ecosystem (push) Blocked by required conditions Details CI / Fuzz for new ty panics (push) Blocked by required conditions Details CI / cargo shear (push) Blocked by required conditions Details CI / python package (push) Waiting to run Details CI / pre-commit (push) Waiting to run Details CI / mkdocs (push) Waiting to run Details CI / formatter instabilities and black similarity (push) Blocked by required conditions Details CI / test ruff-lsp (push) Blocked by required conditions Details CI / check playground (push) Blocked by required conditions Details CI / benchmarks (push) Blocked by required conditions Details [ty Playground] Release / publish (push) Waiting to run Details	2025-05-21 18:16:25 +02:00
Micha Reiser	9ae698fe30	Switch to Rust 2024 edition (#18129 )	2025-05-16 13:25:28 +02:00
Micha Reiser	fa628018b2	Use `#[expect(lint)]` over `#[allow(lint)]` where possible (#17822 )	2025-05-03 21:20:31 +02:00
Abhijeet Prasad Bodas	cf59cee928	[syntax-errors] `nonlocal` declaration at module level (#17559 ) ## Summary Part of #17412 Add a new compile-time syntax error for detecting `nonlocal` declarations at a module level. ## Test Plan - Added new inline tests for the syntax error - Updated existing tests for `nonlocal` statement parsing to be inside a function scope Co-authored-by: Brent Westbrook <36778786+ntBre@users.noreply.github.com>	2025-04-24 16:11:46 -04:00
Brent Westbrook	e7f38fe74b	[red-knot] Detect semantic syntax errors (#17463 ) Summary -- This PR extends semantic syntax error detection to red-knot. The main changes here are: 1. Adding `SemanticSyntaxChecker` and `Vec<SemanticSyntaxError>` fields to the `SemanticIndexBuilder` 2. Calling `SemanticSyntaxChecker::visit_stmt` and `visit_expr` in the `SemanticIndexBuilder`'s `visit_stmt` and `visit_expr` methods 3. Implementing `SemanticSyntaxContext` for `SemanticIndexBuilder` 4. Adding new mdtests to test the context implementation and show diagnostics (3) is definitely the trickiest and required (I think) a minor addition to the `SemanticIndexBuilder`. I tried to look around for existing code performing the necessary checks, but I definitely could have missed something or misused the existing code even when I found it. There's still one TODO around `global` statement handling. I don't think there's an existing way to look this up, but I'm happy to work on that here or in a separate PR. This currently only affects detection of one error (`LoadBeforeGlobalDeclaration` or [PLE0118](https://docs.astral.sh/ruff/rules/load-before-global-declaration/) in ruff), so it's not too big of a problem even if we leave the TODO. Test Plan -- New mdtests, as well as new errors for existing mdtests --------- Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2025-04-23 09:52:58 -04:00
Brent Westbrook	014bb526f4	[syntax-errors] `await` outside async functions (#17363 ) Summary -- This PR implements detecting the use of `await` expressions outside of async functions. This is a reimplementation of [await-outside-async (PLE1142)](https://docs.astral.sh/ruff/rules/await-outside-async/) as a semantic syntax error. Despite the rule name, PLE1142 also applies to `async for` and `async with`, so these are covered here too. Test Plan -- Existing PLE1142 tests. I also deleted more code from the `SemanticSyntaxCheckerVisitor` to avoid changes in other parser tests.	2025-04-14 13:01:48 -04:00
Brent Westbrook	ffef71d106	[syntax-errors] `yield`, `yield from`, and `await` outside functions (#17298 ) Summary -- This PR reimplements [yield-outside-function (F704)](https://docs.astral.sh/ruff/rules/yield-outside-function/) as a semantic syntax error. Despite the name, this rule covers `yield from` and `await` in addition to `yield`. Test Plan -- New linter tests, along with the existing F704 test. --------- Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>	2025-04-11 10:16:23 -04:00
Brent Westbrook	144484d46c	Refactor semantic syntax error scope handling (#17314 ) ## Summary Based on the discussion in https://github.com/astral-sh/ruff/pull/17298#discussion_r2033975460, we decided to move the scope handling out of the `SemanticSyntaxChecker` and into the `SemanticSyntaxContext` trait. This PR implements that refactor by: - Reverting all of the `Checkpoint` and `in_async_context` code in the `SemanticSyntaxChecker` - Adding four new methods to the `SemanticSyntaxContext` trait - `in_async_context`: matches `SemanticModel::in_async_context` and only detects the nearest enclosing function - `in_sync_comprehension`: uses the new `is_async` tracking on `Generator` scopes to detect any enclosing sync comprehension - `in_module_scope`: reports whether we're at the top-level scope - `in_notebook`: reports whether we're in a Jupyter notebook - In-lining the `TestContext` directly into the `SemanticSyntaxCheckerVisitor` - This allows modifying the context as the visitor traverses the AST, which wasn't possible before One potential question here is "why not add a single method returning a `Scope` or `Scopes` to the context?" The main reason is that the `Scope` type is defined in the `ruff_python_semantic` crate, which is not currently a dependency of the parser. It also doesn't appear to be used in red-knot. So it seemed best to use these more granular methods instead of trying to access `Scope` in `ruff_python_parser` (and red-knot). ## Test Plan Existing parser and linter tests.	2025-04-09 14:23:29 -04:00
Brent Westbrook	acc5662e8b	[syntax-errors] Allow `yield` in base classes and annotations (#17206 ) Summary -- This PR fixes the issue pointed out by @JelleZijlstra in https://github.com/astral-sh/ruff/pull/17101#issuecomment-2777480204. Namely, I conflated two very different errors from CPython: ```pycon >>> def m[T](x: (yield from 1)): ... File "<python-input-310>", line 1 def m[T](x: (yield from 1)): ... ^^^^^^^^^^^^ SyntaxError: yield expression cannot be used within the definition of a generic >>> def m(x: (yield from 1)): ... File "<python-input-311>", line 1 def m(x: (yield from 1)): ... ^^^^^^^^^^^^ SyntaxError: 'yield from' outside function >>> def outer(): ... def m(x: (yield from 1)): ... ... >>> ``` I thought the second error was the same as the first, but `yield` (and `yield from`) is actually valid in this position when inside a function scope. The same is true for base classes, as pointed out in the original comment. We don't currently raise an error for `yield` outside of a function, but that should be handled separately. On the upside, this had the benefit of removing the `InvalidExpressionPosition::BaseClass` variant and the `allow_named_expr` field from the visitor because they were both no longer used. Test Plan -- Updated inline tests.	2025-04-04 13:48:28 -04:00
Brent Westbrook	6e2b8f9696	[syntax-errors] Detect duplicate keys in `match` mapping patterns (#17129 ) Summary -- Detects duplicate literals in `match` mapping keys. This PR also adds a `source` method to `SemanticSyntaxContext` to display the duplicated key in the error message by slicing out its range. Test Plan -- New inline tests.	2025-04-03 10:22:37 -04:00
Brent Westbrook	d382065f8a	[syntax-errors] Reimplement PLE0118 (#17135 ) Summary -- This PR reimplements [load-before-global-declaration (PLE0118)](https://docs.astral.sh/ruff/rules/load-before-global-declaration/) as a semantic syntax error. I added a `global` method to the `SemanticSyntaxContext` trait to make this very easy, at least in ruff. Does red-knot have something similar? If this approach will also work in red-knot, I think some of the other PLE rules are also compile-time errors in CPython, PLE0117 in particular. 0115 and 0116 also mention `SyntaxError`s in their docs, but I haven't confirmed them in the REPL yet. Test Plan -- Existing linter tests for PLE0118. I think this actually can't be tested very easily in an inline test because the `TestContext` doesn't have a real way to track globals. --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2025-04-02 13:03:44 +00:00
Brent Westbrook	a0819f0c51	[syntax-errors] Store to or delete `__debug__` (#16984 ) Summary -- Detect setting or deleting `__debug__`. Assigning to `__debug__` was a `SyntaxError` on the earliest version I tested (3.8). Deleting `__debug__` was made a `SyntaxError` in [BPO 45000], which said it was resolved in Python 3.10. However, `del __debug__` was also a runtime error (`NameError`) when I tested in Python 3.9.6, so I thought it was worth including 3.9 in this check. I don't think it was ever a good idea to try `del __debug__`, so I think there's also an argument for not making this version-dependent at all. That would only simplify the implementation very slightly, though. [BPO 45000]: https://github.com/python/cpython/issues/89163 Test Plan -- New inline tests. This also required adding a `PythonVersion` field to the `TestContext` that could be taken from the inline `ParseOptions` and making the version field on the options accessible.	2025-03-29 12:07:20 -04:00
Brent Westbrook	2baaedda6c	[syntax-errors] Start detecting compile-time syntax errors (#16106 ) ## Summary This PR implements the "greeter" approach for checking the AST for syntax errors emitted by the CPython compiler. It introduces two main infrastructural changes to support all of the compile-time errors: 1. Adds a new `semantic_errors` module to the parser crate with public `SemanticSyntaxChecker` and `SemanticSyntaxError` types 2. Embeds a `SemanticSyntaxChecker` in the `ruff_linter::Checker` for checking these errors in ruff As a proof of concept, it also implements detection of two syntax errors: 1. A reimplementation of [`late-future-import`](https://docs.astral.sh/ruff/rules/late-future-import/) (`F404`) 2. Detection of rebound comprehension iteration variables (https://github.com/astral-sh/ruff/issues/14395) ## Test plan Existing F404 tests, new inline tests in the `ruff_python_parser` crate, and a linter CLI test showing an example of the `Message` output. I also tested in VS Code, where `preview = false` and turning off syntax errors both disable the new errors: ![image](https://github.com/user-attachments/assets/cf453d95-04f7-484b-8440-cb812f29d45e) And on the playground, where `preview = false` also disables the errors: ![image](https://github.com/user-attachments/assets/a97570c4-1efa-439f-9d99-a54487dd6064) Fixes #14395 --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2025-03-21 14:45:25 -04:00
Brent Westbrook	37fbe58b13	Document `LinterResult::has_syntax_error` and add `Parsed::has_no_syntax_errors` (#16443 ) Summary -- This is a follow up addressing the comments on #16425. As @dhruvmanila pointed out, the naming is a bit tricky. I went with `has_no_errors` to try to differentiate it from `is_valid`. It actually ends up negated in most uses, so it would be more convenient to have `has_any_errors` or `has_errors`, but I thought it would sound too much like the opposite of `is_valid` in that case. I'm definitely open to suggestions here. Test Plan -- Existing tests.	2025-03-04 08:35:38 -05:00
Brent Westbrook	764aa0e6a1	Allow passing `ParseOptions` to inline tests (#16357 ) ## Summary This PR adds support for a pragma-style header for inline parser tests containing JSON-serialized `ParseOptions`. For example, ```python # parse_options: { "target-version": "3.9" } match 2: case 1: pass ``` The line must start with `# parse_options: ` and then the rest of the (trimmed) line is deserialized into `ParseOptions` used for parsing the the test. ## Test Plan Existing inline tests, plus two new inline tests for `match-before-py310`. --------- Co-authored-by: Alex Waygood <alex.waygood@gmail.com>	2025-02-27 10:23:15 -05:00
Brent Westbrook	97d0659ce3	Pass `ParserOptions` to the parser (#16220 ) ## Summary This is part of the preparation for detecting syntax errors in the parser from https://github.com/astral-sh/ruff/pull/16090/. As suggested in [this comment](https://github.com/astral-sh/ruff/pull/16090/#discussion_r1953084509), I started working on a `ParseOptions` struct that could be stored in the parser. For this initial refactor, I only made it hold the existing `Mode` option, but for syntax errors, we will also need it to have a `PythonVersion`. For that use case, I'm picturing something like a `ParseOptions::with_python_version` method, so you can extend the current calls to something like ```rust ParseOptions::from(mode).with_python_version(settings.target_version) ``` But I thought it was worth adding `ParseOptions` alone without changing any other behavior first. Most of the diff is just updating call sites taking `Mode` to take `ParseOptions::from(Mode)` or those taking `PySourceType`s to take `ParseOptions::from(PySourceType)`. The interesting changes are in the new `parser/options.rs` file and smaller parts of `parser/mod.rs` and `ruff_python_parser/src/lib.rs`. ## Test Plan Existing tests, this should not change any behavior.	2025-02-19 10:50:50 -05:00
Andrew Gallant	84ba4ecaf5	ruff_annotate_snippets: support overriding the "cut indicator" We do this because `...` is valid Python, which makes it pretty likely that some line trimming will lead to ambiguous output. So we add support for overriding the cut indicator. This also requires changing some of the alignment math, which was previously tightly coupled to `...`. For Ruff, we go with `…` (`U+2026 HORIZONTAL ELLIPSIS`) for our cut indicator. For more details, see the patch sent to upstream: https://github.com/rust-lang/annotate-snippets-rs/pull/172	2025-01-15 13:37:52 -05:00
Andrew Gallant	84179aaa96	ruff_linter,ruff_python_parser: migrate to updated `annotate-snippets` This is pretty much just moving to the new API and taking care to use byte offsets. This is almost enough. The next commit will fix a bug involving the handling of unprintable characters as a result of switching to byte offsets.	2025-01-15 13:37:52 -05:00
Dhruv Manilawala	7109214b57	Update parser tests to validate token ranges (#12019 ) ## Summary This PR updates the parser test infrastructure to validate the token ranges. From the code documentation: ``` /// Verifies that: /// * the ranges are strictly increasing when loop the tokens in insertion order /// * all ranges are within the length of the source code ``` Follow-up from #12016 and #12017 resolves: #11938 ## Test Plan Make sure that there are no failures.	2024-06-25 08:14:28 +00:00
Dhruv Manilawala	b617d90651	Update `E999` to show all syntax errors (#11900 ) ## Summary This PR updates the linter to show all the parse errors as diagnostics instead of just the first one. Note that this doesn't affect the parse error displayed as error log message. This will be removed in a follow-up PR. ### Breaking? I don't think this is a breaking change even though this might give more diagnostics. The main reason is that this shouldn't affect any users because it'll only give additional diagnostics in the case of multiple syntax errors. ## Test Plan Add an integration test case which would raise more than one parse error.	2024-06-19 13:09:54 +05:30
Micha Reiser	32ca704956	Rename `PreorderVisitor` to `SourceOrderVisitor` (#11798 ) Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-06-07 17:01:58 +00:00
Dhruv Manilawala	bf5b62edac	Maintain synchronicity between the lexer and the parser (#11457 ) ## Summary This PR updates the entire parser stack in multiple ways: ### Make the lexer lazy * https://github.com/astral-sh/ruff/pull/11244 * https://github.com/astral-sh/ruff/pull/11473 Previously, Ruff's lexer would act as an iterator. The parser would collect all the tokens in a vector first and then process the tokens to create the syntax tree. The first task in this project is to update the entire parsing flow to make the lexer lazy. This includes the `Lexer`, `TokenSource`, and `Parser`. For context, the `TokenSource` is a wrapper around the `Lexer` to filter out the trivia tokens[^1]. Now, the parser will ask the token source to get the next token and only then the lexer will continue and emit the token. This means that the lexer needs to be aware of the "current" token. When the `next_token` is called, the current token will be updated with the newly lexed token. The main motivation to make the lexer lazy is to allow re-lexing a token in a different context. This is going to be really useful to make the parser error resilience. For example, currently the emitted tokens remains the same even if the parser can recover from an unclosed parenthesis. This is important because the lexer emits a `NonLogicalNewline` in parenthesized context while a normal `Newline` in non-parenthesized context. This different kinds of newline is also used to emit the indentation tokens which is important for the parser as it's used to determine the start and end of a block. Additionally, this allows us to implement the following functionalities: 1. Checkpoint - rewind infrastructure: The idea here is to create a checkpoint and continue lexing. At a later point, this checkpoint can be used to rewind the lexer back to the provided checkpoint. 2. Remove the `SoftKeywordTransformer` and instead use lookahead or speculative parsing to determine whether a soft keyword is a keyword or an identifier 3. Remove the `Tok` enum. The `Tok` enum represents the tokens emitted by the lexer but it contains owned data which makes it expensive to clone. The new `TokenKind` enum just represents the type of token which is very cheap. This brings up a question as to how will the parser get the owned value which was stored on `Tok`. This will be solved by introducing a new `TokenValue` enum which only contains a subset of token kinds which has the owned value. This is stored on the lexer and is requested by the parser when it wants to process the data. For example: `8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L1260-L1262)` [^1]: Trivia tokens are `NonLogicalNewline` and `Comment` ### Remove `SoftKeywordTransformer` * https://github.com/astral-sh/ruff/pull/11441 * https://github.com/astral-sh/ruff/pull/11459 * https://github.com/astral-sh/ruff/pull/11442 * https://github.com/astral-sh/ruff/pull/11443 * https://github.com/astral-sh/ruff/pull/11474 For context, https://github.com/RustPython/RustPython/pull/4519/files#diff-5de40045e78e794aa5ab0b8aacf531aa477daf826d31ca129467703855408220 added support for soft keywords in the parser which uses infinite lookahead to classify a soft keyword as a keyword or an identifier. This is a brilliant idea as it basically wraps the existing Lexer and works on top of it which means that the logic for lexing and re-lexing a soft keyword remains separate. The change here is to remove `SoftKeywordTransformer` and let the parser determine this based on context, lookahead and speculative parsing. * Context: The transformer needs to know the position of the lexer between it being at a statement position or a simple statement position. This is because a `match` token starts a compound statement while a `type` token starts a simple statement. The parser already knows this. * Lookahead: Now that the parser knows the context it can perform lookahead of up to two tokens to classify the soft keyword. The logic for this is mentioned in the PR implementing it for `type` and `match soft keyword. * Speculative parsing: This is where the checkpoint - rewind infrastructure helps. For `match` soft keyword, there are certain cases for which we can't classify based on lookahead. The idea here is to create a checkpoint and keep parsing. Based on whether the parsing was successful and what tokens are ahead we can classify the remaining cases. Refer to #11443 for more details. If the soft keyword is being parsed in an identifier context, it'll be converted to an identifier and the emitted token will be updated as well. Refer `8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L487-L491)`. The `case` soft keyword doesn't require any special handling because it'll be a keyword only in the context of a match statement. ### Update the parser API * https://github.com/astral-sh/ruff/pull/11494 * https://github.com/astral-sh/ruff/pull/11505 Now that the lexer is in sync with the parser, and the parser helps to determine whether a soft keyword is a keyword or an identifier, the lexer cannot be used on its own. The reason being that it's not sensitive to the context (which is correct). This means that the parser API needs to be updated to not allow any access to the lexer. Previously, there were multiple ways to parse the source code: 1. Passing the source code itself 2. Or, passing the tokens Now that the lexer and parser are working together, the API corresponding to (2) cannot exists. The final API is mentioned in this PR description: https://github.com/astral-sh/ruff/pull/11494. ### Refactor the downstream tools (linter and formatter) * https://github.com/astral-sh/ruff/pull/11511 * https://github.com/astral-sh/ruff/pull/11515 * https://github.com/astral-sh/ruff/pull/11529 * https://github.com/astral-sh/ruff/pull/11562 * https://github.com/astral-sh/ruff/pull/11592 And, the final set of changes involves updating all references of the lexer and `Tok` enum. This was done in two-parts: 1. Update all the references in a way that doesn't require any changes from this PR i.e., it can be done independently * https://github.com/astral-sh/ruff/pull/11402 * https://github.com/astral-sh/ruff/pull/11406 * https://github.com/astral-sh/ruff/pull/11418 * https://github.com/astral-sh/ruff/pull/11419 * https://github.com/astral-sh/ruff/pull/11420 * https://github.com/astral-sh/ruff/pull/11424 2. Update all the remaining references to use the changes made in this PR For (2), there were various strategies used: 1. Introduce a new `Tokens` struct which wraps the token vector and add methods to query a certain subset of tokens. These includes: 1. `up_to_first_unknown` which replaces the `tokenize` function 2. `in_range` and `after` which replaces the `lex_starts_at` function where the former returns the tokens within the given range while the latter returns all the tokens after the given offset 2. Introduce a new `TokenFlags` which is a set of flags to query certain information from a token. Currently, this information is only limited to any string type token but can be expanded to include other information in the future as needed. https://github.com/astral-sh/ruff/pull/11578 3. Move the `CommentRanges` to the parsed output because this information is common to both the linter and the formatter. This removes the need for `tokens_and_ranges` function. ## Test Plan - [x] Update and verify the test snapshots - [x] Make sure the entire test suite is passing - [x] Make sure there are no changes in the ecosystem checks - [x] Run the fuzzer on the parser - [x] Run this change on dozens of open-source projects ### Running this change on dozens of open-source projects Refer to the PR description to get the list of open source projects used for testing. Now, the following tests were done between `main` and this branch: 1. Compare the output of `--select=E999` (syntax errors) 2. Compare the output of default rule selection 3. Compare the output of `--select=ALL` Conclusion: all output were same ## What's next? The next step is to introduce re-lexing logic and update the parser to feed the recovery information to the lexer so that it can emit the correct token. This moves us one step closer to having error resilience in the parser and provides Ruff the possibility to lint even if the source code contains syntax errors.	2024-06-03 18:23:50 +05:30
Dhruv Manilawala	13ffb5bc19	Replace LALRPOP parser with hand-written parser (#10036 ) (Supersedes #9152, authored by @LaBatata101) ## Summary This PR replaces the current parser generated from LALRPOP to a hand-written recursive descent parser. It also updates the grammar for [PEP 646](https://peps.python.org/pep-0646/) so that the parser outputs the correct AST. For example, in `data[*x]`, the index expression is now a tuple with a single starred expression instead of just a starred expression. Beyond the performance improvements, the parser is also error resilient and can provide better error messages. The behavior as seen by any downstream tools isn't changed. That is, the linter and formatter can still assume that the parser will _stop_ at the first syntax error. This will be updated in the following months. For more details about the change here, refer to the PR corresponding to the individual commits and the release blog post. ## Test Plan Write _lots_ and _lots_ of tests for both valid and invalid syntax and verify the output. ## Acknowledgements - @MichaReiser for reviewing 100+ parser PRs and continuously providing guidance throughout the project - @LaBatata101 for initiating the transition to a hand-written parser in #9152 - @addisoncrump for implementing the fuzzer which helped [catch](https://github.com/astral-sh/ruff/pull/10903) [a](https://github.com/astral-sh/ruff/pull/10910) [lot](https://github.com/astral-sh/ruff/pull/10966) [of](https://github.com/astral-sh/ruff/pull/10896) [bugs](https://github.com/astral-sh/ruff/pull/10877) --------- Co-authored-by: Victor Hugo Gomes <labatata101@linuxmail.org> Co-authored-by: Micha Reiser <micha@reiser.io>	2024-04-18 17:57:39 +05:30

24 commits