mirrors/ruff

mirror of https://github.com/astral-sh/ruff.git synced 2025-08-18 17:40:37 +00:00

Author	SHA1	Message	Date
Victor Hugo Gomes	c0dbcb3434	[`flake8-pyi`] Implement PYI018 (#6018 ) ## Summary Check for unused private `TypeVar`. See [original implementation](`2a86db8271/pyi.py (L1958)`). ``` $ flake8 --select Y018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: Y018 TypeVar "_T" is not used crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: Y018 TypeVar "_P" is not used ``` ``` $ ./target/debug/ruff --select PYI018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi --no-cache crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: PYI018 TypeVar `_T` is never used crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: PYI018 TypeVar `_P` is never used Found 2 errors. ``` In the file `unused_private_type_declaration.rs`, I'm planning to add other rules that are similar to `PYI018` like the `PYI046`, `PYI047` and `PYI049`. ref #848 ## Test Plan Snapshots and manual runs of flake8.	2023-07-26 22:56:15 +00:00
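As a rough illustration of what the rule checks, here is a minimal Python sketch built on the standard `ast` module. It is not Ruff's Rust implementation: the function name `unused_private_typevars` is hypothetical, and it assumes only simple `_T = TypeVar(...)` assignments where `TypeVar` is called by its bare name.

```python
import ast


def unused_private_typevars(source: str) -> list[tuple[int, str]]:
    """Return (line, name) for private TypeVar assignments that are never read."""
    declared: dict[str, int] = {}
    used: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
            and node.targets[0].id.startswith("_")
            and isinstance(node.value, ast.Call)
            and isinstance(node.value.func, ast.Name)
            and node.value.func.id == "TypeVar"
        ):
            # Remember where the private TypeVar was declared.
            declared[node.targets[0].id] = node.lineno
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            # Any read of a name counts as a use.
            used.add(node.id)
    return sorted((line, name) for name, line in declared.items() if name not in used)


stub = (
    "from typing import TypeVar\n"
    '_T = TypeVar("_T")\n'
    '_U = TypeVar("_U")\n'
    "def f(x: _U) -> _U: ...\n"
)
print(unused_private_typevars(stub))
```

The sketch ignores scoping and qualified calls such as `typing.TypeVar(...)`; it only shows the "declared but never read" shape of the check.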
Micha Reiser	2cf00fee96	Remove parser dependency from ruff-python-ast (#6096 )	2023-07-26 17:47:22 +02:00
Charlie Marsh	ed72c027a3	Replace `NoHashHasher` usages with `FxHashMap` (#6049 ) ## Summary I had always assumed that `NoHashHasher` would be faster when using integer keys, but benchmarking shows otherwise: ``` linter/default-rules/numpy/globals.py time: [66.544 µs 66.606 µs 66.678 µs] thrpt: [44.253 MiB/s 44.300 MiB/s 44.342 MiB/s] change: time: [-0.1843% +0.1087% +0.3718%] (p = 0.46 > 0.05) thrpt: [-0.3704% -0.1086% +0.1847%] No change in performance detected. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild linter/default-rules/pydantic/types.py time: [1.3787 ms 1.3811 ms 1.3837 ms] thrpt: [18.431 MiB/s 18.466 MiB/s 18.498 MiB/s] change: time: [-0.4827% -0.1074% +0.1927%] (p = 0.56 > 0.05) thrpt: [-0.1924% +0.1075% +0.4850%] No change in performance detected. linter/default-rules/numpy/ctypeslib.py time: [624.82 µs 625.96 µs 627.17 µs] thrpt: [26.550 MiB/s 26.601 MiB/s 26.650 MiB/s] change: time: [-0.7071% -0.4908% -0.2736%] (p = 0.00 < 0.05) thrpt: [+0.2744% +0.4932% +0.7122%] Change within noise threshold. linter/default-rules/large/dataset.py time: [3.1585 ms 3.1634 ms 3.1685 ms] thrpt: [12.840 MiB/s 12.861 MiB/s 12.880 MiB/s] change: time: [-1.5338% -1.3463% -1.1476%] (p = 0.00 < 0.05) thrpt: [+1.1610% +1.3647% +1.5577%] Performance has improved. linter/all-rules/numpy/globals.py time: [140.17 µs 140.37 µs 140.58 µs] thrpt: [20.989 MiB/s 21.020 MiB/s 21.051 MiB/s] change: time: [-0.1066% +0.3140% +0.7479%] (p = 0.14 > 0.05) thrpt: [-0.7423% -0.3130% +0.1067%] No change in performance detected. Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe linter/all-rules/pydantic/types.py time: [2.7030 ms 2.7069 ms 2.7112 ms] thrpt: [9.4064 MiB/s 9.4216 MiB/s 9.4351 MiB/s] change: time: [-0.6721% -0.4874% -0.2974%] (p = 0.00 < 0.05) thrpt: [+0.2982% +0.4898% +0.6766%] Change within noise threshold. 
Found 14 outliers among 100 measurements (14.00%) 12 (12.00%) high mild 2 (2.00%) high severe linter/all-rules/numpy/ctypeslib.py time: [1.4709 ms 1.4727 ms 1.4749 ms] thrpt: [11.290 MiB/s 11.306 MiB/s 11.320 MiB/s] change: time: [-1.1617% -0.9766% -0.8094%] (p = 0.00 < 0.05) thrpt: [+0.8160% +0.9862% +1.1754%] Change within noise threshold. Found 12 outliers among 100 measurements (12.00%) 9 (9.00%) high mild 3 (3.00%) high severe linter/all-rules/large/dataset.py time: [5.8086 ms 5.8163 ms 5.8240 ms] thrpt: [6.9854 MiB/s 6.9946 MiB/s 7.0038 MiB/s] change: time: [-1.5651% -1.3536% -1.1584%] (p = 0.00 < 0.05) thrpt: [+1.1720% +1.3721% +1.5900%] Performance has improved. ``` My guess is that `NoHashHasher` underperforms because the keys are not randomly distributed... Anyway, it's a ~1% (significant) performance gain on some of the above, plus we get to remove a dependency.	2023-07-24 23:41:57 +00:00
Charlie Marsh	057faabcdd	Use `Flags::intersects` rather than `Flags::contains` (#6007 ) ## Summary This is equivalent for a single flag, but I think it's more likely to be correct when the bitflags are modified -- the primary reason being that we sometimes define flags as the union of other flags, e.g.: ```rust const ANNOTATION = Self::TYPING_ONLY_ANNOTATION.bits() \| Self::RUNTIME_ANNOTATION.bits(); ``` In this case, `flags.contains(Flag::ANNOTATION)` requires that _both_ flags in the union are set, whereas `flags.intersects(Flag::ANNOTATION)` requires that _at least one_ flag is set.	2023-07-23 02:59:31 +00:00
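The difference between the two checks can be sketched with plain integer bitmasks in Python. The helper functions `contains` and `intersects` below are illustrative stand-ins that follow the semantics described in the commit message, not the Rust `bitflags` API itself, and the bit values are assumptions.

```python
TYPING_ONLY_ANNOTATION = 0b01
RUNTIME_ANNOTATION = 0b10
# A flag defined as the union of two other flags, as in the commit message.
ANNOTATION = TYPING_ONLY_ANNOTATION | RUNTIME_ANNOTATION


def contains(flags: int, mask: int) -> bool:
    """True only if every bit in `mask` is set in `flags`."""
    return (flags & mask) == mask


def intersects(flags: int, mask: int) -> bool:
    """True if at least one bit in `mask` is set in `flags`."""
    return (flags & mask) != 0


flags = RUNTIME_ANNOTATION
print(contains(flags, ANNOTATION))    # False: the typing-only bit is missing
print(intersects(flags, ANNOTATION))  # True: the runtime bit is set
```

For a single-bit mask the two functions agree, which is why the change is behavior-preserving today.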
Charlie Marsh	963f240e46	Track unresolved references in the semantic model (#5902 ) ## Summary As part of my continued quest to separate semantic model-building from diagnostic emission, this PR moves our unresolved-reference rules to a deferred pass. So, rather than emitting diagnostics as we encounter unresolved references, we now track those unresolved references on the semantic model (just like resolved references), and after traversal, emit the relevant rules for any unresolved references.	2023-07-19 18:19:55 -04:00
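The "track first, emit later" split might look roughly like the following Python sketch. `ToyModel`, `build_model`, and `emit_diagnostics` are hypothetical names, and the model is deliberately naive: it assumes module-level statements only and ignores scopes, imports, and function definitions.

```python
import ast
import builtins
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    bound: set = field(default_factory=lambda: set(dir(builtins)))
    unresolved: list = field(default_factory=list)  # (line, name) pairs


def build_model(source: str) -> ToyModel:
    """Traversal pass: record bindings and unresolved reads, emit nothing."""
    model = ToyModel()
    for stmt in ast.parse(source).body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        # Reads in a statement are resolved before its own stores take effect.
        for n in names:
            if isinstance(n.ctx, ast.Load) and n.id not in model.bound:
                model.unresolved.append((n.lineno, n.id))
        for n in names:
            if isinstance(n.ctx, ast.Store):
                model.bound.add(n.id)
    return model


def emit_diagnostics(model: ToyModel) -> list[str]:
    """Deferred pass: turn tracked unresolved references into messages."""
    return [f"{line}: undefined name `{name}`" for line, name in sorted(model.unresolved)]


print(emit_diagnostics(build_model("x = 1\nprint(x)\nprint(y)\n")))
```

Keeping the unresolved references on the model means the reporting step needs no access to the traversal state.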
Charlie Marsh	9834c69c98	Remove `__all__` enforcement rules out of binding phase (#5897 ) ## Summary This PR moves two rules (`invalid-all-format` and `invalid-all-object`) out of the name-binding phase, and into the dedicated pass over all bindings that occurs at the end of the `Checker`. This is part of my continued quest to separate the semantic model-building logic from the actual rule enforcement.	2023-07-19 21:18:47 +00:00
Charlie Marsh	a75a6de577	Use a boxed slice for `Export` struct (#5887 ) ## Summary The vector of names here is immutable -- we never push to it after initialization. Boxing reduces the size of the variant from 32 bytes to 24 bytes. (See: https://nnethercote.github.io/perf-book/type-sizes.html#boxed-slices.) It doesn't make a difference here, since it's not the largest variant, but it still seems like a prudent change (and I was considering adding another field to this variant, though I may no longer do so).	2023-07-19 11:45:04 -04:00
Charlie Marsh	1181d25e5a	Move a few more candidate rules to the deferred `Binding`-only pass (#5853 ) ## Summary No behavior change, but this is in theory more efficient, since we can just iterate over the flat `Binding` vector rather than having to iterate over binding chains via the `Scope`.	2023-07-19 00:59:02 +00:00
Charlie Marsh	7ffcd93afd	Move unused deletion tracking to deferred analysis (#5852 ) ## Summary This PR moves the "unused exception" rule out of the visitor and into a deferred check. When we can base rules solely on the semantic model, we probably should, as it greatly simplifies the `Checker` itself.	2023-07-18 20:43:12 -04:00
Charlie Marsh	9e1039f823	Enable attribute lookups via semantic model (#5536 ) ## Summary This PR enables us to resolve attribute accesses within files, at least for static and class methods. For example, we can now detect that this is a function access (and avoid a false-positive): ```python class Class: @staticmethod def error(): return ValueError("Something") # OK raise Class.error() ``` Closes #5487. Closes #5416.	2023-07-05 15:19:14 -04:00
Charlie Marsh	ecf61d49fa	Restore existing bindings when unbinding caught exceptions (#5256 ) ## Summary In the latest release, we made some improvements to the semantic model, but our modifications to exception-unbinding are causing some false-positives. For example: ```py try: v = 3 except ImportError as v: print(v) else: print(v) ``` In the latest release, we started unbinding `v` after the `except` handler. (We used to restore the existing binding, the `v = 3`, but this was quite complicated.) Because we don't have full branch analysis, we can't then know that `v` is still bound in the `else` branch. The solution here modifies `resolve_read` to skip-lookup when hitting unbound exceptions. So when we store the "unbind" for `except ImportError as v`, we save the binding that it shadowed (`v = 3`), and skip to that. Closes #5249. Closes #5250.	2023-06-21 12:53:58 -04:00
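The skip-lookup idea can be sketched in Python as follows. This is a toy model under stated assumptions: the `Binding` class, the string-valued `kind`, and the standalone `resolve_read` function are illustrative only and do not mirror Ruff's actual data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Binding:
    kind: str                        # e.g. "assignment" or "unbound_exception"
    shadowed: Optional[int] = None   # index of the binding this one replaced, if any


def resolve_read(name: str, scope: dict, bindings: list) -> Optional[int]:
    """Resolve `name`, skipping over "unbound exception" markers to what they shadowed."""
    binding_id = scope.get(name)
    while binding_id is not None and bindings[binding_id].kind == "unbound_exception":
        binding_id = bindings[binding_id].shadowed
    return binding_id


# `v = 3` inside the `try` body.
bindings = [Binding("assignment")]
scope = {"v": 0}

# After `except ImportError as v:` the name is unbound, but we remember `v = 3`.
bindings.append(Binding("unbound_exception", shadowed=scope.get("v")))
scope["v"] = len(bindings) - 1

print(resolve_read("v", scope, bindings))  # 0, i.e. the `v = 3` binding
```

If the exception name shadowed nothing, `shadowed` stays `None` and the read resolves to nothing, which matches the name being unbound after the handler.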
Charlie Marsh	310abc769d	Move `StarImport` to its own module (#5186 )	2023-06-20 13:12:46 -04:00
Charlie Marsh	94abf7f088	Rename `Importation` structs to `Import` (#5185 ) ## Summary I find "Importation" a bit awkward, it may not even be grammatically correct here.	2023-06-19 12:09:10 -04:00
Charlie Marsh	b3240dbfa2	Avoid propagating `BindingKind::Global` and `BindingKind::Nonlocal` (#5136 ) ## Summary This PR fixes a small quirk in the semantic model. Typically, when we see an import, like `import foo`, we create a `BindingKind::Importation` for it. However, if `foo` has been declared as a `global`, then we propagate the kind forward. So given: ```python global foo import foo ``` We'd create two bindings for `foo`, both with type `global`. This was originally borrowed from Pyflakes, and it exists to help avoid false-positives like: ```python def f(): global foo # Don't mark `foo` as "assigned but unused"! It's a global! foo = 1 ``` This PR removes that behavior, and instead tracks "Does this binding refer to a global?" as a flag. This is much cleaner, since it means we don't "lose" the identity of various bindings. As a very strange example of why this matters, consider: ```python def foo(): global Member from module import Member x: Member = 1 ``` `Member` is only used in a typing context, so we should flag it and say "move it to a `TYPE_CHECKING` block". However, when we go to analyze `from module import Member`, it has `BindingKind::Global`. So we don't even know that it's an import!	2023-06-16 11:06:59 -04:00
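A minimal Python sketch of "kind plus flag" rather than "kind overwritten by `global`" is shown below. The enum members, the `bind` helper, and its `declared_global` parameter are assumptions made for illustration; only the idea of tracking globality as a flag comes from the commit message.

```python
from dataclasses import dataclass
from enum import Enum, Flag, auto


class BindingKind(Enum):
    IMPORT = auto()
    ASSIGNMENT = auto()


class BindingFlags(Flag):
    NONE = 0
    GLOBAL = 1


@dataclass
class Binding:
    name: str
    kind: BindingKind
    flags: BindingFlags = BindingFlags.NONE


def bind(name: str, kind: BindingKind, declared_global: set) -> Binding:
    """Keep the binding's real kind and record globality separately as a flag."""
    flags = BindingFlags.GLOBAL if name in declared_global else BindingFlags.NONE
    return Binding(name, kind, flags)


# def foo():
#     global Member
#     from module import Member
member = bind("Member", BindingKind.IMPORT, declared_global={"Member"})
print(member.kind, bool(member.flags & BindingFlags.GLOBAL))
```

Because the kind is still `IMPORT`, import-specific rules can match on it while other rules consult the flag.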
Charlie Marsh	fd1dfc3bfa	Add support for global and nonlocal symbol renames (#5134 ) ## Summary In #5074, we introduced an abstraction to support local symbol renames ("local" here refers to "within a module"). However, that abstraction didn't support `global` and `nonlocal` symbols. This PR extends it to those cases. Broadly, there are two considerations. First, we may be renaming a symbol in a scope in which it is declared `global` or `nonlocal`. For example, given: ```python x = 1 def foo(): global x ``` Then when renaming `x` in `foo`, we need to detect that it's `global` and instead perform the rename starting from the module scope. Second, when renaming a symbol, we need to determine the scopes in which it is declared `global` or `nonlocal`. This is effectively the inverse of the above: when renaming `x` in the module scope, we need to detect that we should _also_ rename `x` in `foo`. To support these cases, the renaming algorithm was adjusted as follows: - When we start a rename in a scope, determine whether the symbol is declared `global` or `nonlocal` by looking for a `global` or `nonlocal` binding. If it is, start the rename in the defining scope. (This requires storing the defining scope on the `nonlocal` binding, which is new.) - We then perform the rename in the defining scope. - We then check whether the symbol was declared as `global` or `nonlocal` in any scopes, and perform the rename in those scopes too. (Thankfully, this doesn't need to be done recursively.) Closes #5092. ## Test Plan Added some additional snapshot tests.	2023-06-16 14:35:10 +00:00
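The three steps above can be sketched in Python for the `global` case only. The `Scope` class and the `scopes_to_rename` function are hypothetical; `nonlocal` handling, which requires knowing the defining enclosing scope, is left out of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    name: str
    declared_global: set = field(default_factory=set)  # names declared `global` here


def scopes_to_rename(symbol: str, start: Scope, module: Scope, all_scopes: list) -> list:
    """Return the names of the scopes in which `symbol` must be renamed."""
    # Step 1: if the symbol is declared `global` where we start, jump to the module scope.
    defining = module if symbol in start.declared_global else start
    # Step 2: the rename always happens in the defining scope.
    targets = [defining]
    # Step 3: also rename in every scope that declares the symbol `global`.
    if defining is module:
        targets += [s for s in all_scopes if s is not module and symbol in s.declared_global]
    return [s.name for s in targets]


# x = 1
# def foo(): global x
# def bar(): x = 2   (a separate, local `x`)
module = Scope("<module>")
foo = Scope("foo", declared_global={"x"})
bar = Scope("bar")
scopes = [module, foo, bar]

print(scopes_to_rename("x", foo, module, scopes))     # ['<module>', 'foo']
print(scopes_to_rename("x", module, module, scopes))  # ['<module>', 'foo']
print(scopes_to_rename("x", bar, module, scopes))     # ['bar']
```

Starting in `foo` or in the module gives the same set of scopes, which is the "inverse" relationship the summary describes.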
Charlie Marsh	b9754bd5c5	Add autofix for `Set`-to-`AbstractSet` rewrite using reference tracking (#5074 ) ## Summary This PR enables autofix behavior for the `flake8-pyi` rule that asks you to alias `Set` to `AbstractSet` when importing `collections.abc.Set`. It's not the most important rule, but it's a good isolated test-case for local symbol renaming. The renaming algorithm is outlined in-detail in the `renamer.rs` module. But to demonstrate the behavior, here's the diff when running this fix over a complex file that exercises a few edge cases: ```diff --- a/foo.pyi +++ b/foo.pyi @@ -1,16 +1,16 @@ if True: - from collections.abc import Set + from collections.abc import Set as AbstractSet else: - Set = 1 + AbstractSet = 1 -x: Set = set() +x: AbstractSet = set() -x: Set +x: AbstractSet -del Set +del AbstractSet def f(): - print(Set) + print(AbstractSet) def Set(): pass ``` Making this work required resolving a bunch of edge cases in the semantic model that were causing us to "lose track" of references. For example, the above wasn't possible with our previous approach to handling deletions (#5071). Similarly, the `x: Set` "delayed annotation" tracking was enabled via #5070. And many of these edits would've failed if we hadn't changed `BindingKind` to always match the identifier range (#5090). So it's really the culmination of a bunch of changes over the course of the week. The main outstanding TODO is that this doesn't support `global` or `nonlocal` usages. I'm going to take a look at that tonight, but I'm comfortable merging this as-is. Closes #1106. Closes #5091.	2023-06-16 14:12:33 +00:00
Charlie Marsh	5ea3e42513	Always use identifier ranges to store bindings (#5110 ) ## Summary At present, when we store a binding, we include a `TextRange` alongside it. The `TextRange` _sometimes_ matches the exact range of the identifier to which the `Binding` is linked, but... not always. For example, given: ```python x = 1 ``` The binding we create _will_ use the range of `x`, because the left-hand side is an `Expr::Name`, which has a valid range on it. However, given: ```python try: pass except ValueError as e: pass ``` When we create a binding for `e`, we don't have a `TextRange`... The AST doesn't give us one. So we end up extracting it via lexing. This PR extends that pattern to the rest of the binding kinds, to ensure that whenever we create a binding, we always use the range of the bound name. This leads to better diagnostics in cases like pattern matching, whereby the diagnostic for "unused variable `x`" here used to include `x`, instead of just `x`: ```python def f(provided: int) -> int: match provided: case [_, x]: pass ``` This is _also_ required for symbol renames, since we track writes as bindings -- so we need to know the ranges of the bound symbols. By storing these bindings precisely, we can also remove the `binding.trimmed_range` abstraction -- since bindings already use the "trimmed range". To implement this behavior, I took some of our existing utilities (like the code we had for `except ValueError as e` above), migrated them from a full lexer to a zero-allocation lexer that _only_ identifies "identifiers", and moved the behavior into a trait, so we can now do `stmt.identifier(locator)` to get the range for the identifier. Honestly, we might end up discarding much of this if we decide to put ranges on all identifiers (https://github.com/astral-sh/RustPython-Parser/pull/8). 
But even if we do, this will _still_ be a good change, because the lexer introduced here is useful beyond names (e.g., we use it to find the `except` keyword in an exception handler, to find the `else` after a `for` loop, and so on). So, I'm fine committing this even if we end up changing our minds about the right approach. Closes #5090. ## Benchmarks No significant change, with one statistically significant improvement (-2.1654% on `linter/all-rules/large/dataset.py`): ``` linter/default-rules/numpy/globals.py time: [73.922 µs 73.955 µs 73.986 µs] thrpt: [39.882 MiB/s 39.898 MiB/s 39.916 MiB/s] change: time: [-0.5579% -0.4732% -0.3980%] (p = 0.00 < 0.05) thrpt: [+0.3996% +0.4755% +0.5611%] Change within noise threshold. Found 6 outliers among 100 measurements (6.00%) 4 (4.00%) low severe 1 (1.00%) low mild 1 (1.00%) high mild linter/default-rules/pydantic/types.py time: [1.4909 ms 1.4917 ms 1.4926 ms] thrpt: [17.087 MiB/s 17.096 MiB/s 17.106 MiB/s] change: time: [+0.2140% +0.2741% +0.3392%] (p = 0.00 < 0.05) thrpt: [-0.3380% -0.2734% -0.2136%] Change within noise threshold. Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe linter/default-rules/numpy/ctypeslib.py time: [688.97 µs 691.34 µs 694.15 µs] thrpt: [23.988 MiB/s 24.085 MiB/s 24.168 MiB/s] change: time: [-1.3282% -0.7298% -0.1466%] (p = 0.02 < 0.05) thrpt: [+0.1468% +0.7351% +1.3461%] Change within noise threshold. Found 15 outliers among 100 measurements (15.00%) 1 (1.00%) low mild 2 (2.00%) high mild 12 (12.00%) high severe linter/default-rules/large/dataset.py time: [3.3872 ms 3.4032 ms 3.4191 ms] thrpt: [11.899 MiB/s 11.954 MiB/s 12.011 MiB/s] change: time: [-0.6427% -0.2635% +0.0906%] (p = 0.17 > 0.05) thrpt: [-0.0905% +0.2642% +0.6469%] No change in performance detected. 
Found 20 outliers among 100 measurements (20.00%) 1 (1.00%) low severe 2 (2.00%) low mild 4 (4.00%) high mild 13 (13.00%) high severe linter/all-rules/numpy/globals.py time: [148.99 µs 149.21 µs 149.42 µs] thrpt: [19.748 MiB/s 19.776 MiB/s 19.805 MiB/s] change: time: [-0.7340% -0.5068% -0.2778%] (p = 0.00 < 0.05) thrpt: [+0.2785% +0.5094% +0.7395%] Change within noise threshold. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low mild 1 (1.00%) high severe linter/all-rules/pydantic/types.py time: [3.0362 ms 3.0396 ms 3.0441 ms] thrpt: [8.3779 MiB/s 8.3903 MiB/s 8.3997 MiB/s] change: time: [-0.0957% +0.0618% +0.2125%] (p = 0.45 > 0.05) thrpt: [-0.2121% -0.0618% +0.0958%] No change in performance detected. Found 11 outliers among 100 measurements (11.00%) 1 (1.00%) low severe 3 (3.00%) low mild 5 (5.00%) high mild 2 (2.00%) high severe linter/all-rules/numpy/ctypeslib.py time: [1.6879 ms 1.6894 ms 1.6909 ms] thrpt: [9.8478 MiB/s 9.8562 MiB/s 9.8652 MiB/s] change: time: [-0.2279% -0.0888% +0.0436%] (p = 0.18 > 0.05) thrpt: [-0.0435% +0.0889% +0.2284%] No change in performance detected. Found 5 outliers among 100 measurements (5.00%) 4 (4.00%) low mild 1 (1.00%) high severe linter/all-rules/large/dataset.py time: [7.1520 ms 7.1586 ms 7.1654 ms] thrpt: [5.6777 MiB/s 5.6831 MiB/s 5.6883 MiB/s] change: time: [-2.5626% -2.1654% -1.7780%] (p = 0.00 < 0.05) thrpt: [+1.8102% +2.2133% +2.6300%] Performance has improved. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low mild 1 (1.00%) high mild ```	2023-06-15 18:43:19 +00:00
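Python's own `ast` module has the same gap for exception handlers: `ast.ExceptHandler.name` is a plain string with no position. The sketch below recovers the range by lexing with the standard `tokenize` module, as an analogy to the approach described above. The function name `except_name_range` is hypothetical, and it assumes ASCII source and no `:` inside the exception expression.

```python
import ast
import io
import tokenize


def except_name_range(source: str, handler: ast.ExceptHandler):
    """Return ((row, col), (row, col)) for the name bound by `except ... as name`."""
    start = (handler.lineno, handler.col_offset)
    seen_as = False
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.start < start:
            continue  # skip everything before the handler
        if tok.type == tokenize.NAME:
            if seen_as:
                return tok.start, tok.end  # the identifier right after `as`
            seen_as = tok.string == "as"
        elif tok.type == tokenize.OP and tok.string == ":":
            break  # reached the handler body without finding `as name`
    return None


source = "try:\n    pass\nexcept ValueError as e:\n    pass\n"
handler = ast.parse(source).body[0].handlers[0]
(row, col_start), (_, col_end) = except_name_range(source, handler)
print(source.splitlines()[row - 1][col_start:col_end])
```

A dedicated identifier-only lexer, as the commit describes, would avoid tokenizing the parts of the file that are not needed.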
Charlie Marsh	bae183b823	Rename `semantic_model` and `model` usages to `semantic` (#5097 ) ## Summary As discussed in Discord, and similar to oxc, we're going to refer to this as `.semantic()` everywhere. While I was auditing usages of `model: &SemanticModel`, I also changed as many function signatures as I could find to consistently take the model as the _last_ argument, rather than the first.	2023-06-14 15:01:51 -04:00
Charlie Marsh	c74ef77e85	Move binding accesses into `SemanticModel` method (#5084 )	2023-06-14 14:07:46 +00:00
Charlie Marsh	aa41ffcfde	Add `BindingKind` variants to represent deleted bindings (#5071 ) ## Summary Our current mechanism for handling deletions (e.g., `del x`) is to remove the symbol from the scope's `bindings` table. This "does the right thing", in that if we then reference a deleted symbol, we're able to determine that it's unbound -- but it causes a variety of problems, mostly in that it makes certain bindings and references unreachable after-the-fact. Consider: ```python x = 1 print(x) del x ``` If we analyze this code _after_ running the semantic model over the AST, we'll have no way of knowing that `x` was ever introduced in the scope, much less that it was bound to a value, read, and then deleted -- because we effectively erased `x` from the model entirely when we hit the deletion. In practice, this will make it impossible for us to support local symbol renames. It also means that certain rules that we want to move out of the model-building phase and into the "check dead scopes" phase wouldn't work today, since we'll have lost important information about the source code. This PR introduces two new `BindingKind` variants to model deletions: - `BindingKind::Deletion`, which represents `x = 1; del x`. - `BindingKind::UnboundException`, which represents: ```python try: 1 / 0 except Exception as e: pass ``` In the latter case, `e` gets unbound after the exception handler (assuming it's triggered), so we want to handle it similarly to a deletion. The main challenge here is auditing all of our existing `Binding` and `Scope` usages to understand whether they need to accommodate deletions or otherwise behave differently. If you look one commit back on this branch, you'll see that the code is littered with `NOTE(charlie)` comments that describe the reasoning behind changing (or not) each of those call sites. I've also augmented our test suite in preparation for this change over a few prior PRs. 
### Alternatives As an alternative, I considered introducing a flag to `BindingFlags`, like `BindingFlags::UNBOUND`, and setting that at the appropriate time. This turned out to be a much more difficult change, because we tend to match on `BindingKind` all over the place (e.g., we have a bunch of code blocks that only run when a `BindingKind` is `BindingKind::Importation`). As a result, introducing these new `BindingKind` variants requires only a few changes at the client sites. Adding a flag would've required a much wider-reaching change.	2023-06-14 09:27:24 -04:00
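To illustrate why recording a deletion as its own binding kind preserves information, here is a small Python sketch. The `Scope` class with a per-name `history` list is an assumption made for the example and is not how Ruff stores bindings.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class BindingKind(Enum):
    ASSIGNMENT = auto()
    DELETION = auto()


@dataclass
class Scope:
    # Every binding event for a name is kept, instead of erasing the name on `del`.
    history: dict = field(default_factory=dict)

    def bind(self, name: str, kind: BindingKind) -> None:
        self.history.setdefault(name, []).append(kind)

    def is_bound(self, name: str) -> bool:
        kinds = self.history.get(name)
        return bool(kinds) and kinds[-1] is not BindingKind.DELETION


scope = Scope()
scope.bind("x", BindingKind.ASSIGNMENT)  # x = 1
scope.bind("x", BindingKind.DELETION)    # del x

print(scope.is_bound("x"))   # False: `x` is unbound after the deletion
print(scope.history["x"])    # the assignment is still visible to later passes
```

A later pass can still see that `x` was assigned and then deleted, which is the information a rename or a deferred rule would need.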
Charlie Marsh	5c502a3320	Add documentation for `BindingKind` variants (#4989 )	2023-06-09 18:32:50 +00:00
Charlie Marsh	7b0fb1a3b4	Respect noqa directives on `ImportFrom` parents for type-checking rules (#4889 )	2023-06-06 02:37:07 +00:00
Charlie Marsh	8938b2d555	Use `qualified_name` terminology in more structs for consistency (#4873 )	2023-06-05 19:06:48 +00:00
Charlie Marsh	935094c2ff	Move import-name matching into methods on `BindingKind` (#4818 )	2023-06-03 15:01:27 -04:00
Charlie Marsh	26b1dd0ca2	Remove `name` field from import binding kinds (#4817 )	2023-06-02 23:02:47 -04:00
Charlie Marsh	fcdc7bdd33	Remove separate `ReferenceContext` enum (#4631 )	2023-05-24 15:12:38 +00:00
Charlie Marsh	5cedf0f724	Remove `ReferenceContext::Synthetic` (#4612 )	2023-05-24 14:30:35 +00:00
Charlie Marsh	8961d8eb6f	Track all read references in semantic model (#4610 )	2023-05-24 14:14:27 +00:00
Micha Reiser	652c644c2a	Introduce `ruff_index` crate (#4597 )	2023-05-23 17:40:35 +02:00
Micha Reiser	daadd24bde	Include decorators in `Function` and `Class` definition ranges (#4467 )	2023-05-22 17:50:42 +02:00
Jeong, YunWon	be6e00ef6e	Re-integrate RustPython parser repository (#4359 ) Co-authored-by: Micha Reiser <micha@reiser.io>	2023-05-11 07:47:17 +00:00
Charlie Marsh	a9fc648faf	Use `NodeId` for `Binding` source (#4234 )	2023-05-06 16:20:08 +00:00
Charlie Marsh	c1f0661225	Replace `parents` statement stack with a `Nodes` abstraction (#4233 )	2023-05-06 16:12:41 +00:00
Charlie Marsh	64b7280eb8	Respect parent-scoping rules for `NamedExpr` assignments (#4145 )	2023-04-29 22:45:30 +00:00
Micha Reiser	cab65b25da	Replace row/column based `Location` with byte-offsets. (#3931 )	2023-04-26 18:11:02 +00:00
Micha Reiser	ba4f4f4672	Upgrade dependencies (#4064 )	2023-04-22 18:04:01 +01:00
Charlie Marsh	d919adc13c	Introduce a `ruff_python_semantic` crate (#3865 )	2023-04-04 16:50:47 +00:00

37 commits