mirrors/ruff

mirror of https://github.com/astral-sh/ruff.git synced 2025-07-15 00:55:08 +00:00

Author	SHA1	Message	Date
Charlie Marsh	646ff6497c	Ignore end-of-line file exemption comments (#6160 ) ## Summary This PR protects against code like: ```python from typing import Optional import bar # ruff: noqa import baz class Foo: x: Optional[str] = None ``` In which the user wrote `# ruff: noqa` to ignore a specific error, not realizing that it was a file-level exemption that thus turned off all lint rules. Specifically, if a `# ruff: noqa` directive is not at the start of a line, we now ignore it and warn, since this is almost certainly a mistake.	2023-07-29 00:40:32 +00:00
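The distinction this commit draws can be sketched in a few lines of Python. This is an illustrative approximation, not Ruff's Rust implementation: the name `classify_exemptions` and the simplified matching rule (any comment beginning with `# ruff: noqa`) are assumptions made for the example.

```python
import io
import tokenize


def classify_exemptions(source: str) -> list[tuple[int, str]]:
    """Return (line, kind) for every `# ruff: noqa` comment in `source`.

    kind is "file-level" when only whitespace precedes the comment on its
    line, and "ignored" when the comment trails code on the same line.
    """
    results = []
    lines = source.splitlines()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue
        # Hypothetical, simplified matching: drop spaces and compare prefixes.
        if not tok.string.replace(" ", "").lower().startswith("#ruff:noqa"):
            continue
        row, col = tok.start
        prefix = lines[row - 1][:col]
        kind = "file-level" if prefix.strip() == "" else "ignored"
        results.append((row, kind))
    return results
```

For the snippet in the commit message, the comment after `import bar` would be classified as "ignored", which is the case where a warning is emitted.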
Micha Reiser	40f54375cb	Pull in RustPython parser (#6099 )	2023-07-27 09:29:11 +00:00
konsti	13f9a16e33	Rewrite placement logic (#6040 ) ## Summary This is a rewrite of the main comment placement logic. `place_comment` now has three parts: - place own line comments - between branches - after a branch - place end-of-line comments - after colon - after a branch - place comments for specific nodes (that include module level comments) The rewrite fixed three bugs: `class A: # trailing comment` comments now stay end-of-line, `try: # comment` remains end-of-line and deeply indented try-else-finally comments remain with the right nested statement. It will be much easier to give more alternative branches nodes since this is abstracted away by `is_node_with_body` and the first/last child helpers. Adding new node types can now be done by adding an entry to the `place_comment` match. The code went from 1526 lines before #6033 to 1213 lines now. I think it's easier to just read the new `placement.rs` rather than reviewing the diff. ## Test Plan The existing fixtures staying the same or improving plus new ones for the bug fixes.	2023-07-26 16:21:23 +00:00
Micha Reiser	2cf00fee96	Remove parser dependency from ruff-python-ast (#6096 )	2023-07-26 17:47:22 +02:00
Dhruv Manilawala	025fa4eba8	Integrate the new Jupyter AST nodes in Ruff (#6086 ) ## Summary This PR adds the implementation for the new Jupyter AST nodes i.e., `ExprLineMagic` and `StmtLineMagic`. ## Test Plan Add test cases for `unparse` containing magic commands resolves: #6087	2023-07-26 08:20:30 +00:00
Harutaka Kawamura	62f821daaa	Avoid raising PT012 for simple `with` statements (#6081 )	2023-07-26 01:43:31 +00:00
Zanie Blue	389fe13c93	Implement visitation of type aliases and parameters (#5927 ) ## Summary Part of #5062 Requires https://github.com/astral-sh/RustPython-Parser/pull/32 Adds visitation of type alias statements and type parameters in class and function definitions. Duplicates tests for `PreorderVisitor` into `Visitor` with new snapshots. Testing required node implementations for the `TypeParam` enum, which is a chunk of the diff and the reason we need `Ranged` implementations in https://github.com/astral-sh/RustPython-Parser/pull/32. ## Test Plan Adds unit tests with snapshots.	2023-07-25 17:11:26 +00:00
konsti	e7f228f781	Placement refactor (#6034 ) ## Summary This PR is a refactoring of placement.rs. The code got more consistent, some comments were updated and some dead code was removed or replaced with debug assertions. It also contains a bugfix for the placement of end-of-branch comments with nested bodies inside try statements that occurred when refactoring the nested body loop. ## Test Plan The existing test cases don't change. I added a couple of cases that I think should be tested but weren't, and a regression test for the bugfix.	2023-07-25 11:49:05 +02:00
Charlie Marsh	0d94337b96	Avoid allocations in `SimpleCallArgs` (#6021 ) ## Summary My intuition is that it's faster to do these checks as-needed rather than allocating new hash maps and vectors for the arguments. (We typically only query once anyway.)	2023-07-24 04:55:37 +00:00
Charlie Marsh	9834c69c98	Remove `__all__` enforcement rules out of binding phase (#5897 ) ## Summary This PR moves two rules (`invalid-all-format` and `invalid-all-object`) out of the name-binding phase, and into the dedicated pass over all bindings that occurs at the end of the `Checker`. This is part of my continued quest to separate the semantic model-building logic from the actual rule enforcement.	2023-07-19 21:18:47 +00:00
Zanie Blue	b27f0fa433	Implement `any_over_expr` for type alias and type params (#5866 ) Part of https://github.com/astral-sh/ruff/issues/5062	2023-07-19 16:17:06 -05:00
Charlie Marsh	5f3da9955a	Rename `ruff_python_whitespace` to `ruff_python_trivia` (#5886 ) ## Summary This crate now contains utilities for dealing with trivia more broadly: whitespace, newlines, "simple" trivia lexing, etc. So renaming it to reflect its increased responsibilities. To avoid conflicts, I've also renamed `Token` and `TokenKind` to `SimpleToken` and `SimpleTokenKind`.	2023-07-19 11:48:27 -04:00
Charlie Marsh	626d8dc2cc	Use `.as_ref()` in lieu of `&**` (#5874 ) I find this less opaque (and often more succinct).	2023-07-19 00:49:13 +00:00
Charlie Marsh	2d505e2b04	Remove suite body tracking from `SemanticModel` (#5848 ) ## Summary The `SemanticModel` currently stores the "body" of a given `Suite`, along with the current statement index. This is used to support "next sibling" queries, but we only use this in exactly one place -- the rule that simplifies constructs like this to `any` or `all`: ```python for x in y: if x == 0: return True return False ``` Instead of tracking the state, we can just do a (slightly more expensive) traversal, by finding the node within its parent and returning the next node in the body. Note that we'll only have to do this extremely rarely -- namely, for functions that contain something like: ```python for x in y: if x == 0: return True ```	2023-07-18 18:58:31 -04:00
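The "find the node within its parent and return the next node in the body" lookup can be sketched with Python's own `ast` module. This is only an analogy to the Rust change: the helper name `next_sibling` and the set of fields searched are assumptions for illustration.

```python
import ast
from typing import Optional


def next_sibling(parent: ast.AST, stmt: ast.stmt) -> Optional[ast.stmt]:
    """Locate `stmt` in one of `parent`'s statement lists and return its successor."""
    for field in ("body", "orelse", "finalbody"):
        suite = getattr(parent, field, None)
        # Some nodes (e.g. lambdas) have a `body` that is a single expression.
        if not isinstance(suite, list):
            continue
        for index, candidate in enumerate(suite):
            if candidate is stmt:
                return suite[index + 1] if index + 1 < len(suite) else None
    return None
```

The scan is linear in the length of the parent's body, which matches the commit's trade-off: slightly more work per query, but no state to maintain during traversal.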
Zanie Blue	a93254f026	Implement `unparse` for type aliases and parameters (#5869 ) Part of https://github.com/astral-sh/ruff/issues/5062	2023-07-18 16:25:49 -05:00
Zanie Blue	41da52a61b	Implement `TokenKind` for type aliases (#5870 ) Part of https://github.com/astral-sh/ruff/issues/5062	2023-07-18 18:21:51 +00:00
Zanie Blue	d5c43a45b3	Implement `Comparable` for type aliases and parameters (#5865 ) Part of https://github.com/astral-sh/ruff/issues/5062	2023-07-18 17:18:14 +00:00
Zanie Blue	0eab4b3c22	Implement `AnyNode` and `AnyNodeRef` for `StmtTypeAlias` (#5863 ) Part of https://github.com/astral-sh/ruff/issues/5062	2023-07-18 10:44:55 -05:00
Charlie Marsh	c868def374	Unroll `collect_call_path` to speed up common cases (#5792 ) ## Summary This PR just naively unrolls `collect_call_path` to handle attribute resolutions of up to eight segments. In profiling via Instruments, it seems to be about 4x faster for a very hot code path (4% of total execution time on `main`, 1% here). Profiling by running `RAYON_NUM_THREADS=1 cargo instruments -t time --profile release-debug --time-limit 10000 -p ruff_cli -o FromSlice.trace -- check crates/ruff/resources/test/cpython --silent -e --no-cache --select ALL`, and modifying the linter to loop infinitely up to the specified time (10 seconds) to increase sample size. Before: <img width="1792" alt="Screen Shot 2023-07-15 at 5 13 34 PM" src="`4a8b0b45`-8b67-43e9-af5e-65b326928a8e"> After: <img width="1792" alt="Screen Shot 2023-07-15 at 8 38 51 PM" src="`d8829159`-2c79-4a49-ab3c-9e4e86f5b2b1">	2023-07-18 11:29:59 -04:00
konsti	730e6b2b4c	Refactor `StmtIf`: Formatter and Linter (#5459 ) ## Summary Previously, `StmtIf` was defined recursively as ```rust pub struct StmtIf { pub range: TextRange, pub test: Box<Expr>, pub body: Vec<Stmt>, pub orelse: Vec<Stmt>, } ``` Every `elif` was represented as an `orelse` with a single `StmtIf`. This means that this representation couldn't differentiate between ```python if cond1: x = 1 else: if cond2: x = 2 ``` and ```python if cond1: x = 1 elif cond2: x = 2 ``` It also makes many checks harder than they need to be because we have to recurse just to iterate over an entire if-elif-else and because we're lacking nodes and ranges on the `elif` and `else` branches. We change the representation to a flat ```rust pub struct StmtIf { pub range: TextRange, pub test: Box<Expr>, pub body: Vec<Stmt>, pub elif_else_clauses: Vec<ElifElseClause>, } pub struct ElifElseClause { pub range: TextRange, pub test: Option<Expr>, pub body: Vec<Stmt>, } ``` where `test: Some(_)` represents an `elif` and `test: None` an else. This representation is a different tradeoff, e.g. we need to allocate the `Vec<ElifElseClause>`, the `elif`s are now different than the `if`s (which matters in rules where we want to check both `if`s and `elif`s) and the type system doesn't guarantee that the `test: None` else is actually last. We're also now a bit more inconsistent since all other `else`, those from `for`, `while` and `try`, still don't have nodes. With the new representation some things became easier, e.g. finding the `elif` token (we can use the start of the `ElifElseClause`) and formatting comments for if-elif-else (no more dangling comments splitting, we only have to insert the dangling comment after the colon manually and set `leading_alternate_branch_comments`, everything else is taken care of by having nodes for each branch and the usual placement.rs fixups). ## Merge Plan This PR requires coordination between the parser repo and the main ruff repo. 
I've split the ruff part, into two stacked PRs which have to be merged together (only the second one fixes all tests), the first for the formatter to be reviewed by @michareiser and the second for the linter to be reviewed by @charliermarsh. * MH: Review and merge https://github.com/astral-sh/RustPython-Parser/pull/20 * MH: Review and merge or move later in stack https://github.com/astral-sh/RustPython-Parser/pull/21 * MH: Review and approve https://github.com/astral-sh/RustPython-Parser/pull/22 * MH: Review and approve formatter PR https://github.com/astral-sh/ruff/pull/5459 * CM: Review and approve linter PR https://github.com/astral-sh/ruff/pull/5460 * Merge linter PR in formatter PR, fix ecosystem checks (ecosystem checks can't run on the formatter PR and won't run on the linter PR, so we need to merge them first) * Merge https://github.com/astral-sh/RustPython-Parser/pull/22 * Create tag in the parser, update linter+formatter PR * Merge linter+formatter PR https://github.com/astral-sh/ruff/pull/5459 --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2023-07-18 13:40:15 +02:00
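To make the trade-offs of the flat representation concrete, here is a small Python mirror of the two Rust structs. It is a sketch only: plain strings stand in for `Expr` and `Stmt`, ranges are omitted, and the helpers `branches` and `else_is_last` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class ElifElseClause:
    test: Optional[str]  # a test means `elif`, None means `else`
    body: list[str]


@dataclass
class StmtIf:
    test: str
    body: list[str]
    elif_else_clauses: list[ElifElseClause] = field(default_factory=list)


def branches(stmt: StmtIf) -> Iterator[tuple[str, Optional[str], list[str]]]:
    """Yield (keyword, test, body) for every branch, with no recursion."""
    yield ("if", stmt.test, stmt.body)
    for clause in stmt.elif_else_clauses:
        keyword = "elif" if clause.test is not None else "else"
        yield (keyword, clause.test, clause.body)


def else_is_last(stmt: StmtIf) -> bool:
    """Check the invariant the types do not enforce: only the final clause may be an `else`."""
    return all(clause.test is not None for clause in stmt.elif_else_clauses[:-1])
```

Iterating over an entire if-elif-else becomes a single loop, while the "else must come last" rule has to be checked (or trusted) at runtime, as the commit message notes.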
David Szotten	52aa2fc875	upgrade rustpython to remove tuple-constants (#5840 ) c.f. https://github.com/astral-sh/RustPython-Parser/pull/28 Tests: No snapshots changed --------- Co-authored-by: Zanie <contact@zanie.dev>	2023-07-17 22:50:31 +00:00
Charlie Marsh	2cd117ba81	Remove `TryIdentifier` trait (#5816 ) ## Summary Last remaining usage here is for patterns, but we now have ranges on identifiers so it's unnecessary.	2023-07-16 21:24:16 -04:00
Charlie Marsh	01b05fe247	Remove `Identifier` usages for isolating exception names (#5797 ) ## Summary The motivating change here is to remove `let range = except_handler.try_identifier().unwrap();` and instead just do `name.range()`, since exception names now have ranges attached to them by the parse. This also required some refactors (which are improvements) to the built-in attribute shadowing rules, since at least one invocation relied on passing in the exception handler and calling `.try_identifier()`. Now that we have easy access to identifiers, we can remove the whole `AnyShadowing` abstraction.	2023-07-16 04:49:48 +00:00
Charlie Marsh	4782675bf9	Remove lexer-based comment range detection (#5785 ) ## Summary I'm doing some unrelated profiling, and I noticed that this method is actually measurable on the CPython benchmark -- it's > 1% of execution time. We don't need to lex here, we already know the ranges of all comments, so we can just do a simple binary search for overlap, which brings the method down to 0%. ## Test Plan `cargo test`	2023-07-16 01:03:27 +00:00
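The "simple binary search for overlap" can be sketched as follows. This is a Python illustration under stated assumptions, not the Rust code from the PR: comment ranges are taken to be sorted, non-empty, non-overlapping, half-open `(start, end)` offsets, and `intersects_comment` is a hypothetical name.

```python
from bisect import bisect_left


def intersects_comment(comment_ranges: list[tuple[int, int]], start: int, end: int) -> bool:
    """Return True if the half-open range [start, end) overlaps any comment range."""
    # Index of the first comment whose start is >= end; it cannot overlap.
    # The 1-tuple (end,) sorts before every (end, x) pair.
    index = bisect_left(comment_ranges, (end,))
    if index == 0:
        return False
    # Because the ranges are sorted and disjoint, only the comment just
    # before that index can reach into [start, end).
    _, candidate_end = comment_ranges[index - 1]
    return candidate_end > start
```

The lookup costs O(log n) comparisons per query instead of re-lexing the source text.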
guillaumeLepape	6824b67f44	Include alias when formatting import-from structs (#5786 ) ## Summary When required-imports is set with the syntax from ... import ... as ..., autofix I002 is failing ## Test Plan Reuse the same python files as `crates/ruff/src/rules/isort/mod.rs:required_import` test.	2023-07-15 15:53:21 -04:00
Charlie Marsh	5a4516b812	Misc. stylistic changes from flipping through rules late at night (#5757 ) ## Summary This is really bad PR hygiene, but a mix of: using `Locator`-based fixes in a few places (in lieu of `Generator`-based fixes), using match syntax to avoid `.len() == 1` checks, using common helpers in more places, etc. ## Test Plan `cargo test`	2023-07-14 05:23:47 +00:00
Charlie Marsh	6dbc6d2e59	Use shared `Cursor` across crates (#5715 ) ## Summary We have two `Cursor` implementations. This PR moves the implementation from the formatter into `ruff_python_whitespace` (kind of a poorly-named crate now) and uses it for both use-cases.	2023-07-12 21:09:27 +00:00
Charlie Marsh	4dee49d6fa	Run nightly Clippy over the Ruff repo (#5670 ) ## Summary This is the result of running `cargo +nightly clippy --workspace --all-targets --all-features -- -D warnings` and fixing all violations. Just wanted to see if there were any interesting new checks on nightly 👀	2023-07-10 23:44:38 -04:00
konsti	0b9af031fb	Format ExprIfExp (ternary operator) (#5597 ) ## Summary Format `ExprIfExp`, also known as the ternary operator or inline `if`. It can look like ```python a1 = 1 if True else 2 ``` but also ```python b1 = ( # We return "a" ... "a" # that's our True value # ... if this condition matches ... if True # that's our test # ... otherwise we return "b" else "b" # that's our False value ) ``` This also fixes a visitor order bug. The jaccard index on django goes from 0.911 to 0.915. ## Test Plan I added fixtures without and with comments in strange places.	2023-07-07 19:11:52 +00:00
konsti	8184235f93	Try statements have a body: Fix formatter instability (#5558 ) ## Summary The following code was previously leading to unstable formatting: ```python try: try: pass finally: print(1) # issue7208 except A: pass ``` The comment would be formatted as a trailing comment of `try` which is unstable as an end-of-line comment gets two extra whitespaces. This was originally found in `99b00efd5e/Lib/getpass.py (L68-L91)` ## Test Plan I added a regression test	2023-07-06 16:07:47 +02:00
Charlie Marsh	dadad0e9ed	Remove some allocations in argument detection (#5481 ) ## Summary Drive-by PR to remove some allocations around argument name matching.	2023-07-03 12:21:26 -04:00
Anders Kaseorg	df13e69c3c	Format let-else with rustfmt nightly (#5461 ) Support for `let…else` formatting was just merged to nightly (rust-lang/rust#113225). Rerun `cargo fmt` with Rust nightly 2023-07-02 to pick this up. Followup to #939. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2023-07-03 02:13:35 +00:00
Charlie Marsh	fa1b85b3da	Remove prelude from `ruff_python_ast` (#5369 ) ## Summary Per @MichaReiser, this is causing more confusion than it is helpful.	2023-06-26 11:43:49 -04:00
Micha Reiser	6ba9d5d5a4	Upgrade RustPython (#5334 )	2023-06-23 20:39:47 +00:00
James Berry	f85eb709e2	Visit AugAssign target after value (#5325 ) ## Summary When visiting AugAssign in evaluation order, the AugAssign `target` should be visited after its `value`. Based on my testing, the pseudo code for `a += b` is effectively: ```python tmp = a a = tmp.__iadd__(b) ``` That is, an ideal traversal order would look something like this: 1. load a 2. b 3. op 4. store a But, there is only a single AST node which captures `a` in the statement `a += b`, so it cannot be traversed both before and after the traversal of `b` and the `op`. Nonetheless, I think traversing `a` after `b` and the `op` makes the most sense for a number of reasons: 1. All the other assignment expressions traverse their `value`s before their `target`s. Having `AugAssign` traverse in the same order would be more consistent. 2. Within the AST, the `ctx` of the `target` for an `AugAssign` is `Store` (though technically this is a `Load` and `Store` operation, the AST only indicates it as a `Store`). Since the store portion of the `AugAssign` occurs last, I think it makes sense to traverse the `target` last as well. The effect of this is marginal, but it may have an impact on the behavior of #5271.	2023-06-23 09:54:54 -04:00
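The chosen traversal order (value, then operator, then target) can be reproduced with Python's `ast.NodeVisitor`. This is an analogy to the Rust visitor, not the code from the PR; the class name `EvaluationOrderVisitor` is made up for the example.

```python
import ast


class EvaluationOrderVisitor(ast.NodeVisitor):
    """Record names in the order the commit proposes for `AugAssign`."""

    def __init__(self) -> None:
        self.names: list[tuple[str, str]] = []

    def visit_AugAssign(self, node: ast.AugAssign) -> None:
        # Value first, then the operator, and the (Store-context) target last.
        self.visit(node.value)
        self.visit(node.op)
        self.visit(node.target)

    def visit_Name(self, node: ast.Name) -> None:
        self.names.append((node.id, type(node.ctx).__name__))
```

Running the visitor over `a += b` records `b` as a `Load` before `a` as a `Store`, which also shows that the AST marks the target only as a store.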
Micha Reiser	c52aa8f065	Basic string formatting ## Summary This PR implements formatting for non-f-string Strings that do not use implicit concatenation. Docstring formatting is out of the scope of this PR. ## Test Plan I added a few tests for simple string literals. ## Performance Ouch. This is hitting performance somewhat hard. This is probably because we now iterate each string a couple of times: 1. To detect if it is an implicit string continuation 2. To detect if the string contains any new lines 3. To detect the preferred quote 4. To normalize the string Edit: I integrated the detection of newlines into the preferred quote detection so that we only iterate the string three times. We can probably do better by merging the implicit string continuation with the quote detection and new line detection by iterating till the end of the string part and returning the offset. We then use our simple tokenizer to skip over any comments or whitespace until we find the first non trivia token. From there we continue doing this in a loop until we reach the end of the string. I'll leave this improvement for later.	2023-06-23 09:46:05 +02:00
James Berry	2142bf6141	Fix annotation and format spec visitors (#5324 ) ## Summary The `Visitor` and `preorder::Visitor` traits provide some convenience functions, `visit_annotation` and `visit_format_spec`, for handling annotation and format spec expressions respectively. Both of these functions accept an `&Expr` and have a default implementation which delegates to `walk_expr`. The problem with this approach is that any custom handling done in `visit_expr` will be skipped for annotations and format specs. Instead, to capture any custom logic implemented in `visit_expr`, both of these functions' default implementations should delegate to `visit_expr` instead of `walk_expr`. ## Example Consider the below `Visitor` implementation: ```rust impl<'a> Visitor<'a> for Example<'a> { fn visit_expr(&mut self, expr: &'a Expr) { match expr { Expr::Name(ExprName { id, .. }) => println!("Visiting {:?}", id), _ => walk_expr(self, expr), } } } ``` Run on the following Python snippet: ```python a: b ``` I would expect such a visitor to print the following: ``` Visiting b Visiting a ``` But it instead prints the following: ``` Visiting a ``` Our custom `visit_expr` handler is not invoked for the annotation. ## Test Plan Tests added in #5271 caught this behavior.	2023-06-23 03:55:42 +00:00
James Berry	f194572be8	Remove visit_arg_with_default (#5265 ) ## Summary This is a follow up to #5221. Turns out it was easy to restructure the visitor to get the right order, I'm just dumb 🤷‍♂️ I've removed `visit_arg_with_default` entirely from the `Visitor`, although it still exists as part of `preorder::Visitor`.	2023-06-21 16:00:24 -04:00
James Berry	9b5fb8f38f	Fix AST visitor traversal order (#5221 ) ## Summary According to the AST visitor documentation, the AST visitor "visits all nodes in the AST recursively in evaluation-order". However, the current traversal fails to meet this specification in a few places. ### Function traversal ```python order = [] @(order.append("decorator") or (lambda x: x)) def f( posonly: order.append("posonly annotation") = order.append("posonly default"), /, arg: order.append("arg annotation") = order.append("arg default"), *args: order.append("vararg annotation"), kwarg: order.append("kwarg annotation") = order.append("kwarg default"), **kwargs: order.append("kwarg annotation") ) -> order.append("return annotation"): pass print(order) ``` Executing the above snippet using CPython 3.10.6 prints the following result (formatted for readability): ```python [ 'decorator', 'posonly default', 'arg default', 'kwarg default', 'arg annotation', 'posonly annotation', 'vararg annotation', 'kwarg annotation', 'kwarg annotation', 'return annotation', ] ``` Here we can see that decorators are evaluated first, followed by argument defaults, and annotations are last. The current traversal of a function's AST does not align with this order. ### Annotated assignment traversal ```python order = [] x: order.append("annotation") = order.append("expression") print(order) ``` Executing the above snippet using CPython 3.10.6 prints the following result: ```python ['expression', 'annotation'] ``` Here we can see that an annotated assignment's annotation gets evaluated after the assignment's expression. The current traversal of an annotated assignment's AST does not align with this order. ## Why? I'm slowly working on #3946 and porting over some of the logic and tests from ssort. ssort is very sensitive to AST traversal order, so ensuring the utmost correctness here is important. 
## Test Plan There doesn't seem to be existing tests for the AST visitor, so I didn't bother adding tests for these very subtle changes. However, this behavior will be captured in the tests for the PR which addresses #3946.	2023-06-21 14:40:58 -04:00
Micha Reiser	e520a3a721	Fix ArgWithDefault comments handling (#5204 )	2023-06-20 20:48:07 +00:00
Charlie Marsh	7bc33a8d5f	Remove identifier lexing in favor of parser ranges (#5195 ) ## Summary Now that all identifiers include ranges (#5194), we can remove a ton of this "custom lexing" code that we have to sketchily extract identifier ranges from source. ## Test Plan `cargo test`	2023-06-20 12:07:29 -04:00
Charlie Marsh	6331598511	Upgrade `RustPython` to access ranged names (#5194 ) ## Summary In https://github.com/astral-sh/RustPython-Parser/pull/8, we modified RustPython to include ranges for any identifiers that aren't `Expr::Name` (which already has an identifier). For example, the `e` in `except ValueError as e` was previously un-ranged. To extract its range, we had to do some lexing of our own. This change should improve performance and let us remove a bunch of code. ## Test Plan `cargo test`	2023-06-20 15:43:38 +00:00
Charlie Marsh	8e06140d1d	Remove continuations when deleting statements (#5198 ) ## Summary This PR modifies our statement deletion logic to delete any preceding continuation lines. For example, given: ```py x = 1; \ import os ``` We'll now rewrite to: ```py x = 1; ``` In addition, the logic can now handle multiple preceding continuations (which is unlikely, but valid).	2023-06-19 22:04:28 -04:00
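A line-based sketch of this deletion logic is shown below. It is a deliberate simplification and not Ruff's implementation: it assumes one statement per line, treats any line ending in a backslash as a continuation (ignoring strings and comments), and the function name `delete_statement` is hypothetical.

```python
def delete_statement(lines: list[str], index: int) -> list[str]:
    """Delete lines[index] together with the continuations that lead into it."""
    kept = lines[:index]
    # Walk backwards over every preceding line that ends in a backslash.
    while kept and kept[-1].rstrip().endswith("\\"):
        without_continuation = kept[-1].rstrip()[:-1].rstrip()
        if without_continuation:
            # The line holds real code: keep it, minus the trailing backslash.
            kept[-1] = without_continuation
            break
        # The line was nothing but a continuation: drop it and keep looking.
        kept.pop()
    return kept + lines[index + 1:]
```

Under these assumptions the same loop also handles several stacked continuations, and a file whose first line is a bare continuation.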
Charlie Marsh	36e01ad6eb	Upgrade RustPython (#5192 ) ## Summary This PR upgrades RustPython to pull in the changes to `Arguments` (zip defaults with their identifiers) and all the renames to `CmpOp` and friends.	2023-06-19 21:09:53 +00:00
Thomas de Zeeuw	e3c12764f8	Only use a single cache file per Python package (#5117 ) ## Summary This changes the caching design from one cache file per source file, to one cache file per package. This greatly reduces the number of cache files that are opened and written, while maintaining roughly the same (combined) size as bincode is very compact. Below are some very much not scientific performance tests. It uses projects/sources to check: * small.py: single, 31 bytes Python file with 2 errors. * test.py: single, 43k Python file with 8 errors. * fastapi: FastAPI repo, 1134 files checked, 0 errors. Source \| Before # files \| After # files \| Before size \| After size -------\|-------\|-------\|-------\|------- small.py \| 1 \| 1 \| 20 K \| 20 K test.py \| 1 \| 1 \| 60 K \| 60 K fastapi \| 1134 \| 518 \| 4.5 M \| 2.3 M One question that might come up is why fastapi still has 518 cache files and not 1? That is because this is using the existing package resolution, which sees examples, docs, etc. as separate from the "main" source code (in the fastapi directory in the repo). In the future it might be worth considering switching to a one cache file per repo strategy. This new design is not perfect and does have a number of known issues. First, like the old design it doesn't remove the cache for a source file that has been (re)moved until `ruff clean` is called. Second, this currently uses a large mutex around the mutation of the package cache (e.g. inserting result). This could be (or become) a bottleneck. It's future work to test and improve this (if needed). Third, currently the packages are opened and stored in a sequential loop, this could be done in parallel. This is also future work. ## Test Plan Run `ruff check` (with caching enabled) twice on any Python source code and it should produce the same results.	2023-06-19 17:46:13 +02:00
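Why a repository can still end up with many cache files is easier to see with a sketch of package grouping. The rule used here (a file belongs to the topmost ancestor directory in an unbroken chain of directories containing `__init__.py`, otherwise to its own directory) is an assumption chosen for illustration and may differ from Ruff's actual package resolution; `package_root` and `group_by_package` are hypothetical names.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def package_root(path: PurePosixPath, init_dirs: set[PurePosixPath]) -> PurePosixPath:
    """Walk upward for as long as the directory contains an `__init__.py`."""
    directory = path.parent
    root = directory
    while directory in init_dirs:
        root = directory
        if directory.parent == directory:  # reached the top of the path
            break
        directory = directory.parent
    return root


def group_by_package(files: list[str]) -> dict[str, list[str]]:
    """Map each package root to its files; one cache file would be kept per key."""
    paths = [PurePosixPath(name) for name in files]
    init_dirs = {path.parent for path in paths if path.name == "__init__.py"}
    groups: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        groups[str(package_root(path, init_dirs))].append(str(path))
    return dict(groups)
```

With this rule, a real package such as `fastapi/` collapses into a single group, while loose scripts under directories like `docs/` each form a group per directory, which is consistent with the file counts reported above.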
Charlie Marsh	2b82caa163	Detect continuations at start-of-file (#5173 ) ## Summary Given: ```python \ import os ``` Deleting `import os` leaves a syntax error: a file can't end in a continuation. We have code to handle this case, but it failed to pick up continuations at the _very start_ of a file. Closes #5156.	2023-06-19 00:09:02 -04:00
Charlie Marsh	fab2a4adf7	Use `matches!` for insecure hash rule (#5141 )	2023-06-16 04:18:32 +00:00
Charlie Marsh	5ea3e42513	Always use identifier ranges to store bindings (#5110 ) ## Summary At present, when we store a binding, we include a `TextRange` alongside it. The `TextRange` _sometimes_ matches the exact range of the identifier to which the `Binding` is linked, but... not always. For example, given: ```python x = 1 ``` The binding we create _will_ use the range of `x`, because the left-hand side is an `Expr::Name`, which has a valid range on it. However, given: ```python try: pass except ValueError as e: pass ``` When we create a binding for `e`, we don't have a `TextRange`... The AST doesn't give us one. So we end up extracting it via lexing. This PR extends that pattern to the rest of the binding kinds, to ensure that whenever we create a binding, we always use the range of the bound name. This leads to better diagnostics in cases like pattern matching, whereby the diagnostic for "unused variable `x`" here used to include `*x`, instead of just `x`: ```python def f(provided: int) -> int: match provided: case [_, *x]: pass ``` This is _also_ required for symbol renames, since we track writes as bindings -- so we need to know the ranges of the bound symbols. By storing these bindings precisely, we can also remove the `binding.trimmed_range` abstraction -- since bindings already use the "trimmed range". To implement this behavior, I took some of our existing utilities (like the code we had for `except ValueError as e` above), migrated them from a full lexer to a zero-allocation lexer that _only_ identifies "identifiers", and moved the behavior into a trait, so we can now do `stmt.identifier(locator)` to get the range for the identifier. Honestly, we might end up discarding much of this if we decide to put ranges on all identifiers (https://github.com/astral-sh/RustPython-Parser/pull/8). 
But even if we do, this will _still_ be a good change, because the lexer introduced here is useful beyond names (e.g., we use it find the `except` keyword in an exception handler, to find the `else` after a `for` loop, and so on). So, I'm fine committing this even if we end up changing our minds about the right approach. Closes #5090. ## Benchmarks No significant change, with one statistically significant improvement (-2.1654% on `linter/all-rules/large/dataset.py`): ``` linter/default-rules/numpy/globals.py time: [73.922 µs 73.955 µs 73.986 µs] thrpt: [39.882 MiB/s 39.898 MiB/s 39.916 MiB/s] change: time: [-0.5579% -0.4732% -0.3980%] (p = 0.00 < 0.05) thrpt: [+0.3996% +0.4755% +0.5611%] Change within noise threshold. Found 6 outliers among 100 measurements (6.00%) 4 (4.00%) low severe 1 (1.00%) low mild 1 (1.00%) high mild linter/default-rules/pydantic/types.py time: [1.4909 ms 1.4917 ms 1.4926 ms] thrpt: [17.087 MiB/s 17.096 MiB/s 17.106 MiB/s] change: time: [+0.2140% +0.2741% +0.3392%] (p = 0.00 < 0.05) thrpt: [-0.3380% -0.2734% -0.2136%] Change within noise threshold. Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe linter/default-rules/numpy/ctypeslib.py time: [688.97 µs 691.34 µs 694.15 µs] thrpt: [23.988 MiB/s 24.085 MiB/s 24.168 MiB/s] change: time: [-1.3282% -0.7298% -0.1466%] (p = 0.02 < 0.05) thrpt: [+0.1468% +0.7351% +1.3461%] Change within noise threshold. Found 15 outliers among 100 measurements (15.00%) 1 (1.00%) low mild 2 (2.00%) high mild 12 (12.00%) high severe linter/default-rules/large/dataset.py time: [3.3872 ms 3.4032 ms 3.4191 ms] thrpt: [11.899 MiB/s 11.954 MiB/s 12.011 MiB/s] change: time: [-0.6427% -0.2635% +0.0906%] (p = 0.17 > 0.05) thrpt: [-0.0905% +0.2642% +0.6469%] No change in performance detected. 
Found 20 outliers among 100 measurements (20.00%) 1 (1.00%) low severe 2 (2.00%) low mild 4 (4.00%) high mild 13 (13.00%) high severe linter/all-rules/numpy/globals.py time: [148.99 µs 149.21 µs 149.42 µs] thrpt: [19.748 MiB/s 19.776 MiB/s 19.805 MiB/s] change: time: [-0.7340% -0.5068% -0.2778%] (p = 0.00 < 0.05) thrpt: [+0.2785% +0.5094% +0.7395%] Change within noise threshold. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low mild 1 (1.00%) high severe linter/all-rules/pydantic/types.py time: [3.0362 ms 3.0396 ms 3.0441 ms] thrpt: [8.3779 MiB/s 8.3903 MiB/s 8.3997 MiB/s] change: time: [-0.0957% +0.0618% +0.2125%] (p = 0.45 > 0.05) thrpt: [-0.2121% -0.0618% +0.0958%] No change in performance detected. Found 11 outliers among 100 measurements (11.00%) 1 (1.00%) low severe 3 (3.00%) low mild 5 (5.00%) high mild 2 (2.00%) high severe linter/all-rules/numpy/ctypeslib.py time: [1.6879 ms 1.6894 ms 1.6909 ms] thrpt: [9.8478 MiB/s 9.8562 MiB/s 9.8652 MiB/s] change: time: [-0.2279% -0.0888% +0.0436%] (p = 0.18 > 0.05) thrpt: [-0.0435% +0.0889% +0.2284%] No change in performance detected. Found 5 outliers among 100 measurements (5.00%) 4 (4.00%) low mild 1 (1.00%) high severe linter/all-rules/large/dataset.py time: [7.1520 ms 7.1586 ms 7.1654 ms] thrpt: [5.6777 MiB/s 5.6831 MiB/s 5.6883 MiB/s] change: time: [-2.5626% -2.1654% -1.7780%] (p = 0.00 < 0.05) thrpt: [+1.8102% +2.2133% +2.6300%] Performance has improved. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low mild 1 (1.00%) high mild ```	2023-06-15 18:43:19 +00:00
konstin	66089e1a2e	Fix a number of formatter errors from the cpython repository (#5089 ) ## Summary This fixes a number of problems in the formatter that showed up with various files in the [cpython](https://github.com/python/cpython) repository. These problems surfaced as unstable formatting and invalid code. This is not the entirety of problems discovered through cpython, but a big enough chunk to separate it. Individual fixes are generally individual commits. They were discovered with #5055, which I update as I work through the output. ## Test Plan I added regression tests with links to cpython for each entry, except for the two stubs that also got comment stubs since they'll be implemented properly later.	2023-06-15 11:24:14 +00:00
Charlie Marsh	716cab2f19	Run `rustfmt` on nightly to clean up erroneous comments (#5106 ) ## Summary This PR runs `rustfmt` with a few nightly options as a one-time fix to catch some malformatted comments. I ended up just running with: ```toml condense_wildcard_suffixes = true edition = "2021" max_width = 100 normalize_comments = true normalize_doc_attributes = true reorder_impl_items = true unstable_features = true use_field_init_shorthand = true ``` These all seem like reasonable things to fix, so I may as well do it while I'm here.	2023-06-15 00:19:05 +00:00

414 commits