## Summary
Check for unused private `TypeVar`. See [original
implementation](2a86db8271/pyi.py (L1958)).
```
$ flake8 --select Y018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: Y018 TypeVar "_T" is not used
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: Y018 TypeVar "_P" is not used
```
```
$ ./target/debug/ruff --select PYI018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi --no-cache
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: PYI018 TypeVar `_T` is never used
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: PYI018 TypeVar `_P` is never used
Found 2 errors.
```
In the file `unused_private_type_declaration.rs`, I'm planning to add
other rules that are similar to `PYI018` like the `PYI046`, `PYI047` and
`PYI049`.
ref #848
## Test Plan
Snapshots and manual runs of flake8.
## Summary
This is equivalent for a single flag, but I think it's more likely to be
correct when the bitflags are modified -- the primary reason being that
we sometimes define flags as the union of other flags, e.g.:
```rust
const ANNOTATION = Self::TYPING_ONLY_ANNOTATION.bits() | Self::RUNTIME_ANNOTATION.bits();
```
In this case, `flags.contains(Flag::ANNOTATION)` requires that _both_
flags in the union are set, whereas `flags.intersects(Flag::ANNOTATION)`
requires that _at least one_ flag is set.
## Summary
As part of my continued quest to separate semantic model-building from
diagnostic emission, this PR moves our unresolved-reference rules to a
deferred pass. So, rather than emitting diagnostics as we encounter
unresolved references, we now track those unresolved references on the
semantic model (just like resolved references), and after traversal,
emit the relevant rules for any unresolved references.
## Summary
This PR moves two rules (`invalid-all-format` and `invalid-all-object`)
out of the name-binding phase, and into the dedicated pass over all
bindings that occurs at the end of the `Checker`. This is part of my
continued quest to separate the semantic model-building logic from the
actual rule enforcement.
## Summary
The vector of names here is immutable -- we never push to it after
initialization. Boxing reduces the size of the variant from 32 bytes to
24 bytes. (See:
https://nnethercote.github.io/perf-book/type-sizes.html#boxed-slices.)
It doesn't make a difference here, since it's not the largest variant,
but it still seems like a prudent change (and I was considering adding
another field to this variant, though I may no longer do so).
## Summary
No behavior change, but this is in theory more efficient, since we can
just iterate over the flat `Binding` vector rather than having to
iterate over binding chains via the `Scope`.
## Summary
This PR moves the "unused exception" rule out of the visitor and into a
deferred check. When we can base rules solely on the semantic model, we
probably should, as it greatly simplifies the `Checker` itself.
## Summary
This PR enables us to resolve attribute accesses within files, at least
for static and class methods. For example, we can now detect that this
is a function access (and avoid a false-positive):
```python
class Class:
@staticmethod
def error():
return ValueError("Something")
# OK
raise Class.error()
```
Closes#5487.
Closes#5416.
## Summary
In the latest release, we made some improvements to the semantic model,
but our modifications to exception-unbinding are causing some
false-positives. For example:
```py
try:
v = 3
except ImportError as v:
print(v)
else:
print(v)
```
In the latest release, we started unbinding `v` after the `except`
handler. (We used to restore the existing binding, the `v = 3`, but this
was quite complicated.) Because we don't have full branch analysis, we
can't then know that `v` is still bound in the `else` branch.
The solution here modifies `resolve_read` to skip-lookup when hitting
unbound exceptions. So when store the "unbind" for `except ImportError
as v`, we save the binding that it shadowed `v = 3`, and skip to that.
Closes#5249.
Closes#5250.
## Summary
This PR fixes a small quirk in the semantic model. Typically, when we
see an import, like `import foo`, we create a `BindingKind::Importation`
for it. However, if `foo` has been declared as a `global`, then we
propagate the kind forward. So given:
```python
global foo
import foo
```
We'd create two bindings for `foo`, both with type `global`.
This was originally borrowed from Pyflakes, and it exists to help avoid
false-positives like:
```python
def f():
global foo
# Don't mark `foo` as "assigned but unused"! It's a global!
foo = 1
```
This PR removes that behavior, and instead tracks "Does this binding
refer to a global?" as a flag. This is much cleaner, since it means we
don't "lose" the identity of various bindings.
As a very strange example of why this matters, consider:
```python
def foo():
global Member
from module import Member
x: Member = 1
```
`Member` is only used in a typing context, so we should flag it and say
"move it to a `TYPE_CHECKING` block". However, when we go to analyze
`from module import Member`, it has `BindingKind::Global`. So we don't
even know that it's an import!
## Summary
In #5074, we introduced an abstraction to support local symbol renames
("local" here refers to "within a module"). However, that abstraction
didn't support `global` and `nonlocal` symbols. This PR extends it to
those cases.
Broadly, there are considerations.
First, if we're renaming a symbol in a scope in which it is declared
`global` or `nonlocal`. For example, given:
```python
x = 1
def foo():
global x
```
Then when renaming `x` in `foo`, we need to detect that it's `global`
and instead perform the rename starting from the module scope.
Second, when renaming a symbol, we need to determine the scopes in which
it is declared `global` or `nonlocal`. This is effectively the inverse
of the above: when renaming `x` in the module scope, we need to detect
that we should _also_ rename `x` in `foo`.
To support these cases, the renaming algorithm was adjusted as follows:
- When we start a rename in a scope, determine whether the symbol is
declared `global` or `nonlocal` by looking for a `global` or `nonlocal`
binding. If it is, start the rename in the defining scope. (This
requires storing the defining scope on the `nonlocal` binding, which is
new.)
- We then perform the rename in the defining scope.
- We then check whether the symbol was declared as `global` or
`nonlocal` in any scopes, and perform the rename in those scopes too.
(Thankfully, this doesn't need to be done recursively.)
Closes#5092.
## Test Plan
Added some additional snapshot tests.
## Summary
This PR enables autofix behavior for the `flake8-pyi` rule that asks you
to alias `Set` to `AbstractSet` when importing `collections.abc.Set`.
It's not the most important rule, but it's a good isolated test-case for
local symbol renaming.
The renaming algorithm is outlined in-detail in the `renamer.rs` module.
But to demonstrate the behavior, here's the diff when running this fix
over a complex file that exercises a few edge cases:
```diff
--- a/foo.pyi
+++ b/foo.pyi
@@ -1,16 +1,16 @@
if True:
- from collections.abc import Set
+ from collections.abc import Set as AbstractSet
else:
- Set = 1
+ AbstractSet = 1
-x: Set = set()
+x: AbstractSet = set()
-x: Set
+x: AbstractSet
-del Set
+del AbstractSet
def f():
- print(Set)
+ print(AbstractSet)
def Set():
pass
```
Making this work required resolving a bunch of edge cases in the
semantic model that were causing us to "lose track" of references. For
example, the above wasn't possible with our previous approach to
handling deletions (#5071). Similarly, the `x: Set` "delayed annotation"
tracking was enabled via #5070. And many of these edits would've failed
if we hadn't changed `BindingKind` to always match the identifier range
(#5090). So it's really the culmination of a bunch of changes over the
course of the week.
The main outstanding TODO is that this doesn't support `global` or
`nonlocal` usages. I'm going to take a look at that tonight, but I'm
comfortable merging this as-is.
Closes#1106.
Closes#5091.
## Summary
At present, when we store a binding, we include a `TextRange` alongside
it. The `TextRange` _sometimes_ matches the exact range of the
identifier to which the `Binding` is linked, but... not always.
For example, given:
```python
x = 1
```
The binding we create _will_ use the range of `x`, because the left-hand
side is an `Expr::Name`, which has a valid range on it.
However, given:
```python
try:
pass
except ValueError as e:
pass
```
When we create a binding for `e`, we don't have a `TextRange`... The AST
doesn't give us one. So we end up extracting it via lexing.
This PR extends that pattern to the rest of the binding kinds, to ensure
that whenever we create a binding, we always use the range of the bound
name. This leads to better diagnostics in cases like pattern matching,
whereby the diagnostic for "unused variable `x`" here used to include
`*x`, instead of just `x`:
```python
def f(provided: int) -> int:
match provided:
case [_, *x]:
pass
```
This is _also_ required for symbol renames, since we track writes as
bindings -- so we need to know the ranges of the bound symbols.
By storing these bindings precisely, we can also remove the
`binding.trimmed_range` abstraction -- since bindings already use the
"trimmed range".
To implement this behavior, I took some of our existing utilities (like
the code we had for `except ValueError as e` above), migrated them from
a full lexer to a zero-allocation lexer that _only_ identifies
"identifiers", and moved the behavior into a trait, so we can now do
`stmt.identifier(locator)` to get the range for the identifier.
Honestly, we might end up discarding much of this if we decide to put
ranges on all identifiers
(https://github.com/astral-sh/RustPython-Parser/pull/8). But even if we
do, this will _still_ be a good change, because the lexer introduced
here is useful beyond names (e.g., we use it find the `except` keyword
in an exception handler, to find the `else` after a `for` loop, and so
on). So, I'm fine committing this even if we end up changing our minds
about the right approach.
Closes#5090.
## Benchmarks
No significant change, with one statistically significant improvement
(-2.1654% on `linter/all-rules/large/dataset.py`):
```
linter/default-rules/numpy/globals.py
time: [73.922 µs 73.955 µs 73.986 µs]
thrpt: [39.882 MiB/s 39.898 MiB/s 39.916 MiB/s]
change:
time: [-0.5579% -0.4732% -0.3980%] (p = 0.00 < 0.05)
thrpt: [+0.3996% +0.4755% +0.5611%]
Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) low severe
1 (1.00%) low mild
1 (1.00%) high mild
linter/default-rules/pydantic/types.py
time: [1.4909 ms 1.4917 ms 1.4926 ms]
thrpt: [17.087 MiB/s 17.096 MiB/s 17.106 MiB/s]
change:
time: [+0.2140% +0.2741% +0.3392%] (p = 0.00 < 0.05)
thrpt: [-0.3380% -0.2734% -0.2136%]
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
linter/default-rules/numpy/ctypeslib.py
time: [688.97 µs 691.34 µs 694.15 µs]
thrpt: [23.988 MiB/s 24.085 MiB/s 24.168 MiB/s]
change:
time: [-1.3282% -0.7298% -0.1466%] (p = 0.02 < 0.05)
thrpt: [+0.1468% +0.7351% +1.3461%]
Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low mild
2 (2.00%) high mild
12 (12.00%) high severe
linter/default-rules/large/dataset.py
time: [3.3872 ms 3.4032 ms 3.4191 ms]
thrpt: [11.899 MiB/s 11.954 MiB/s 12.011 MiB/s]
change:
time: [-0.6427% -0.2635% +0.0906%] (p = 0.17 > 0.05)
thrpt: [-0.0905% +0.2642% +0.6469%]
No change in performance detected.
Found 20 outliers among 100 measurements (20.00%)
1 (1.00%) low severe
2 (2.00%) low mild
4 (4.00%) high mild
13 (13.00%) high severe
linter/all-rules/numpy/globals.py
time: [148.99 µs 149.21 µs 149.42 µs]
thrpt: [19.748 MiB/s 19.776 MiB/s 19.805 MiB/s]
change:
time: [-0.7340% -0.5068% -0.2778%] (p = 0.00 < 0.05)
thrpt: [+0.2785% +0.5094% +0.7395%]
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
linter/all-rules/pydantic/types.py
time: [3.0362 ms 3.0396 ms 3.0441 ms]
thrpt: [8.3779 MiB/s 8.3903 MiB/s 8.3997 MiB/s]
change:
time: [-0.0957% +0.0618% +0.2125%] (p = 0.45 > 0.05)
thrpt: [-0.2121% -0.0618% +0.0958%]
No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low severe
3 (3.00%) low mild
5 (5.00%) high mild
2 (2.00%) high severe
linter/all-rules/numpy/ctypeslib.py
time: [1.6879 ms 1.6894 ms 1.6909 ms]
thrpt: [9.8478 MiB/s 9.8562 MiB/s 9.8652 MiB/s]
change:
time: [-0.2279% -0.0888% +0.0436%] (p = 0.18 > 0.05)
thrpt: [-0.0435% +0.0889% +0.2284%]
No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) low mild
1 (1.00%) high severe
linter/all-rules/large/dataset.py
time: [7.1520 ms 7.1586 ms 7.1654 ms]
thrpt: [5.6777 MiB/s 5.6831 MiB/s 5.6883 MiB/s]
change:
time: [-2.5626% -2.1654% -1.7780%] (p = 0.00 < 0.05)
thrpt: [+1.8102% +2.2133% +2.6300%]
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high mild
```
## Summary
As discussed in Discord, and similar to oxc, we're going to refer to
this as `.semantic()` everywhere.
While I was auditing usages of `model: &SemanticModel`, I also changed
as many function signatures as I could find to consistently take the
model as the _last_ argument, rather than the first.
## Summary
Our current mechanism for handling deletions (e.g., `del x`) is to
remove the symbol from the scope's `bindings` table. This "does the
right thing", in that if we then reference a deleted symbol, we're able
to determine that it's unbound -- but it causes a variety of problems,
mostly in that it makes certain bindings and references unreachable
after-the-fact.
Consider:
```python
x = 1
print(x)
del x
```
If we analyze this code _after_ running the semantic model over the AST,
we'll have no way of knowing that `x` was ever introduced in the scope,
much less that it was bound to a value, read, and then deleted --
because we effectively erased `x` from the model entirely when we hit
the deletion.
In practice, this will make it impossible for us to support local symbol
renames. It also means that certain rules that we want to move out of
the model-building phase and into the "check dead scopes" phase wouldn't
work today, since we'll have lost important information about the source
code.
This PR introduces two new `BindingKind` variants to model deletions:
- `BindingKind::Deletion`, which represents `x = 1; del x`.
- `BindingKind::UnboundException`, which represents:
```python
try:
1 / 0
except Exception as e:
pass
```
In the latter case, `e` gets unbound after the exception handler
(assuming it's triggered), so we want to handle it similarly to a
deletion.
The main challenge here is auditing all of our existing `Binding` and
`Scope` usages to understand whether they need to accommodate deletions
or otherwise behave differently. If you look one commit back on this
branch, you'll see that the code is littered with `NOTE(charlie)`
comments that describe the reasoning behind changing (or not) each of
those call sites. I've also augmented our test suite in preparation for
this change over a few prior PRs.
### Alternatives
As an alternative, I considered introducing a flag to `BindingFlags`,
like `BindingFlags::UNBOUND`, and setting that at the appropriate time.
This turned out to be a much more difficult change, because we tend to
match on `BindingKind` all over the place (e.g., we have a bunch of code
blocks that only run when a `BindingKind` is
`BindingKind::Importation`). As a result, introducing these new
`BindingKind` variants requires only a few changes at the client sites.
Adding a flag would've required a much wider-reaching change.