When attempting to determine whether `import foo.bar.baz` is a known
first-party import relative to [user-provided source
paths](https://docs.astral.sh/ruff/settings/#src), when `preview` is
enabled we now check that `SRC/foo/bar/baz` is a directory or
`SRC/foo/bar/baz.py` or `SRC/foo/bar/baz.pyi` exist.
Previously, we just checked the analogous thing for `SRC/foo`, but this
can be misleading in situations with disjoint namespace packages that
share a common base name (e.g. we may be working inside the namespace
package `foo.buzz` and importing `foo.bar` from elsewhere).
Supersedes #12987Closes#12984
## Summary
This PR updates `F811` rule to include assignment as possible shadowed
binding. This will fix issue: #11828 .
## Test Plan
Add a test file, F811_30.py, which includes a redefinition after an
assignment and a verified snapshot file.
* Potentially resolves#11619 (nondeterministic hashmap order across
different architectures) in F401 by replacing a hashmap with
nondeterministic traversal order with an ordered mapping.
I'm not sure how to test this with our CI/CD. I don't have an s390x
machine at home. Should I try it in Qemu?
## Summary
This PR ensures that if a variable is bound via `global`, and then the
`global` is read, the originating variable is also marked as read. It's
not perfect, in that it won't detect _rebindings_, like:
```python
from app import redis_connection
def func():
global redis_connection
redis_connection = 1
redis_connection()
```
So, above, `redis_connection` is still marked as unused.
But it does avoid flagging `redis_connection` as unused in:
```python
from app import redis_connection
def func():
global redis_connection
redis_connection()
```
Closes https://github.com/astral-sh/ruff/issues/11518.
## Summary
Given `del X`, we'll typically add a `BindingKind::Deletion` to `X` to
shadow the current binding. However, if the deletion is inside of a
conditional operation, we _won't_, as in:
```python
def f():
global X
if X > 0:
del X
```
We will, however, track it as a reference to the binding. This PR adds
the expression context to those resolved references, so that we can
detect that the `X` in `global X` was "assigned to".
Closes https://github.com/astral-sh/ruff/issues/10397.
## Summary
When you try to remove an internal representation leaking into another
type and end up rewriting a simple version of `smallvec`.
The goal of this PR is to replace the `Box<[&'a str]>` with
`Box<QualifiedName>` to avoid that the internal `QualifiedName`
representation leaks (and it gives us a nicer API too). However, doing
this when `QualifiedName` uses `SmallVec` internally gives us all sort
of funny lifetime errors. I was lost but @BurntSushi came to rescue me.
He figured out that `smallvec` has a variance problem which is already
tracked in https://github.com/servo/rust-smallvec/issues/146
To fix the variants problem, I could use the smallvec-2-alpha-4 or
implement our own smallvec. I went with implementing our own small vec
for this specific problem. It obviously isn't as sophisticated as
smallvec (only uses safe code), e.g. it doesn't perform any size
optimizations, but it does its job.
Other changes:
* Removed `Imported::qualified_name` (the version that returns a
`String`). This can be replaced by calling `ToString` on the qualified
name.
* Renamed `Imported::call_path` to `qualified_name` and changed its
return type to `&QualifiedName`.
* Renamed `QualifiedName::imported` to `user_defined` which is the more
common term when talking about builtins vs the rest/user defined
functions.
## Test plan
`cargo test`
## Summary
Charlie can probably explain this better than I but it turns out,
`CallPath` is used for two different things:
* To represent unqualified names like `version` where `version` can be a
local variable or imported (e.g. `from sys import version` where the
full qualified name is `sys.version`)
* To represent resolved, full qualified names
This PR splits `CallPath` into two types to make this destinction clear.
> Note: I haven't renamed all `call_path` variables to `qualified_name`
or `unqualified_name`. I can do that if that's welcomed but I first want
to get feedback on the approach and naming overall.
## Test Plan
`cargo test`
## Summary
This PR changes the `CallPath` type alias to a newtype wrapper.
A newtype wrapper allows us to limit the API and to experiment with
alternative ways to implement matching on `CallPath`s.
## Test Plan
`cargo test`
## Summary
I was surprised to learn that we treat `x` in `[_ for x in y]` as an
"assignment" binding kind, rather than a dedicated comprehension
variable.
## Summary
An assignment can be _both_ (e.g.) a loop variable _and_ assigned via
unpacking. In other words, unpacking is a quality of an assignment, not
a _kind_.
## Summary
The motivation here is that this enables us to implement `Ranged` in
crates that don't depend on `ruff_python_ast`.
Largely a mechanical refactor with a lot of regex, Clippy help, and
manual fixups.
## Test Plan
`cargo test`
## Summary
This PR is a follow-up to the suggestion in
https://github.com/astral-sh/ruff/pull/6345#discussion_r1285470953 to
use a single stack to store all statements and expressions, rather than
using separate vectors for each, which gives us something closer to a
full-fidelity chain. (We can then generalize this concept to include all
other AST nodes too.)
This is in part made possible by the removal of the hash map from
`&Stmt` to `StatementId` (#6694), which makes it much cheaper to store
these using a single interface (since doing so no longer introduces the
requirement that we hash all expressions).
I'll follow-up with some profiling, but a few notes on how the data
requirements have changed:
- We now store a `BranchId` for every expression, not just every
statement, so that's an extra `u32`.
- We now store a single `NodeId` on every snapshot, rather than separate
`StatementId` and `ExpressionId` IDs, so that's one fewer `u32` for each
snapshot.
- We're probably doing a few more lookups in general, since any calls to
`current_statement()` etc. now have to iterate up the node hierarchy
until they identify the first statement.
## Test Plan
`cargo test`
## Summary
I noticed some inconsistencies around uses of `.range.start()`, structs
that have a `TextRange` field but don't implement `Ranged`, etc.
## Test Plan
`cargo test`
## Summary
This PR fixes the performance degradation introduced in
https://github.com/astral-sh/ruff/pull/6345. Instead of using the
generic `Nodes` structs, we now use separate `Statement` and
`Expression` structs. Importantly, we can avoid tracking a bunch of
state for expressions that we need for parents: we don't need to track
reference-to-ID pointers (we just have no use-case for this -- I'd
actually like to remove this from statements too, but we need it for
branch detection right now), we don't need to track depth, etc.
In my testing, this entirely removes the regression on all-rules, and
gets us down to 2ms slower on the default rules (as a crude hyperfine
benchmark, so this is within margin of error IMO).
No behavioral changes.
## Summary
This PR attempts to draw a clearer divide between "methods that take
(e.g.) an expression or statement as input" and "methods that rely on
the _current_ expression or statement" in the semantic model, by
renaming methods like `stmt()` to `current_statement()`.
This had led to confusion in the past. For example, prior to this PR, we
had `scope()` (which returns the current scope), and `parent_scope`,
which returns the parent _of a scope that's passed in_. Now, the API is
clearer: `current_scope` returns the current scope, and `parent_scope`
takes a scope as argument and returns its parent.
Per above, I also changed `stmt` to `statement` and `expr` to
`expression`.
## Summary
Historically, we've stored "qualified names" on our
`BindingKind::Import`, `BindingKind::SubmoduleImport`, and
`BindingKind::ImportFrom` structs. In Ruff, a "qualified name" is a
dot-separated path to a symbol. For example, given `import foo.bar`, the
"qualified name" would be `"foo.bar"`; and given `from foo.bar import
baz`, the "qualified name" would be `foo.bar.baz`.
This PR modifies the `BindingKind` structs to instead store _call paths_
rather than qualified names. So in the examples above, we'd store
`["foo", "bar"]` and `["foo", "bar", "baz"]`. It turns out that this
more efficient given our data access patterns. Namely, we frequently
need to convert the qualified name to a call path (whenever we call
`resolve_call_path`), and it turns out that we do this operation enough
that those conversations show up on benchmarks.
There are a few other advantages to using call paths, rather than
qualified names:
1. The size of `BindingKind` is reduced from 32 to 24 bytes, since we no
longer need to store a `String` (only a boxed slice).
2. All three import types are more consistent, since they now all store
a boxed slice, rather than some storing an `&str` and some storing a
`String` (for `BindingKind::ImportFrom`, we needed to allocate a
`String` to create the qualified name, but the call path is a slice of
static elements that don't require that allocation).
3. A lot of code gets simpler, in part because we now do call path
resolution "earlier". Most notably, for relative imports (`from .foo
import bar`), we store the _resolved_ call path rather than the relative
call path, so the semantic model doesn't have to deal with that
resolution. (See that `resolve_call_path` is simpler, fewer branches,
etc.)
In my testing, this change improves the all-rules benchmark by another
4-5% on top of the improvements mentioned in #6047.
Requires https://github.com/astral-sh/RustPython-Parser/pull/42
Related https://github.com/PyCQA/pyflakes/pull/778
[PEP-695](https://peps.python.org/pep-0695)
Part of #5062
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Adds a scope for type parameters, a type parameter binding kind, and
checker visitation of type parameters in type alias statements, function
definitions, and class definitions.
A few changes were necessary to ensure correctness following the
insertion of a new scope between function and class scopes and their
parent.
## Test Plan
<!-- How was it tested? -->
Undefined name snapshots.
Unused type parameter rule will be added as follow-up.
## Summary
Check for unused private `TypeVar`. See [original
implementation](2a86db8271/pyi.py (L1958)).
```
$ flake8 --select Y018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: Y018 TypeVar "_T" is not used
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: Y018 TypeVar "_P" is not used
```
```
$ ./target/debug/ruff --select PYI018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi --no-cache
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: PYI018 TypeVar `_T` is never used
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: PYI018 TypeVar `_P` is never used
Found 2 errors.
```
In the file `unused_private_type_declaration.rs`, I'm planning to add
other rules that are similar to `PYI018` like the `PYI046`, `PYI047` and
`PYI049`.
ref #848
## Test Plan
Snapshots and manual runs of flake8.
## Summary
This is equivalent for a single flag, but I think it's more likely to be
correct when the bitflags are modified -- the primary reason being that
we sometimes define flags as the union of other flags, e.g.:
```rust
const ANNOTATION = Self::TYPING_ONLY_ANNOTATION.bits() | Self::RUNTIME_ANNOTATION.bits();
```
In this case, `flags.contains(Flag::ANNOTATION)` requires that _both_
flags in the union are set, whereas `flags.intersects(Flag::ANNOTATION)`
requires that _at least one_ flag is set.
## Summary
As part of my continued quest to separate semantic model-building from
diagnostic emission, this PR moves our unresolved-reference rules to a
deferred pass. So, rather than emitting diagnostics as we encounter
unresolved references, we now track those unresolved references on the
semantic model (just like resolved references), and after traversal,
emit the relevant rules for any unresolved references.
## Summary
This PR moves two rules (`invalid-all-format` and `invalid-all-object`)
out of the name-binding phase, and into the dedicated pass over all
bindings that occurs at the end of the `Checker`. This is part of my
continued quest to separate the semantic model-building logic from the
actual rule enforcement.
## Summary
The vector of names here is immutable -- we never push to it after
initialization. Boxing reduces the size of the variant from 32 bytes to
24 bytes. (See:
https://nnethercote.github.io/perf-book/type-sizes.html#boxed-slices.)
It doesn't make a difference here, since it's not the largest variant,
but it still seems like a prudent change (and I was considering adding
another field to this variant, though I may no longer do so).
## Summary
No behavior change, but this is in theory more efficient, since we can
just iterate over the flat `Binding` vector rather than having to
iterate over binding chains via the `Scope`.
## Summary
This PR moves the "unused exception" rule out of the visitor and into a
deferred check. When we can base rules solely on the semantic model, we
probably should, as it greatly simplifies the `Checker` itself.
## Summary
This PR enables us to resolve attribute accesses within files, at least
for static and class methods. For example, we can now detect that this
is a function access (and avoid a false-positive):
```python
class Class:
@staticmethod
def error():
return ValueError("Something")
# OK
raise Class.error()
```
Closes#5487.
Closes#5416.
## Summary
In the latest release, we made some improvements to the semantic model,
but our modifications to exception-unbinding are causing some
false-positives. For example:
```py
try:
v = 3
except ImportError as v:
print(v)
else:
print(v)
```
In the latest release, we started unbinding `v` after the `except`
handler. (We used to restore the existing binding, the `v = 3`, but this
was quite complicated.) Because we don't have full branch analysis, we
can't then know that `v` is still bound in the `else` branch.
The solution here modifies `resolve_read` to skip-lookup when hitting
unbound exceptions. So when store the "unbind" for `except ImportError
as v`, we save the binding that it shadowed `v = 3`, and skip to that.
Closes#5249.
Closes#5250.