ruff/crates/ruff_python_semantic
Charlie Marsh a33bbe6335
Track "delayed" annotations in the semantic model (#5070)
## Summary

This PR tackles a corner case that we'll need to support local symbol
renaming. It relates to a nuance in how we want handle annotations
(i.e., `AnnAssign` statements with no value, like `x: int` in a function
body).

When we see a statement like:

```python
x: int
```

We create a `BindingKind::Annotation` for `x`. This is a special
`BindingKind` that the resolver isn't allowed to return. For example,
given:

```python
x: int
print(x)
```

The second line will yield an `undefined-name` error.

So why does this `BindingKind` exist at all? In Pyflakes, to support the
`unused-annotation` lint:

```python
def f():
    x: int  # unused-annotation
```

If we don't track `BindingKind::Annotation`, we can't lint for unused
variables that are only "defined" via annotations.

There are a few other wrinkles to `BindingKind::Annotation`. One is
that, if a binding already exists in the scope, we actually just discard
the `BindingKind`. So in this case:

```python
x = 1
x: int
```

When we go to create the `BindingKind::Annotation` for the second
statement, we notice that (1) we're creating an annotation but (2) the
scope already has binding for the name -- so we just drop the binding on
the floor. This has the nice property that annotations aren't considered
to "shadow" another binding, which is important in a bunch of places
(e.g., if we have `import os; os: int`, we still consider `os` to be an
import, as we should). But it also means that these "delayed"
annotations are one of the few remaining references that we don't track
anywhere in the semantic model.

This PR adds explicit support for these via a new `delayed_annotations`
attribute on the semantic model. These should be extremely rare, but we
do need to track them if we want to support local symbol renaming.

### This isn't the right way to model this

This isn't the right way to model this.

Here's an alternative:

- Remove `BindingKind::Annotation`, and treat annotations as their own,
separate concept.
- Instead of storing a map from name to `BindingId` on each `Scope`,
store a map from name to... `SymbolId`.
- Introduce a `Symbol` abstraction, where a symbol can point to a
current binding, and a list of annotations, like:

```rust
pub struct Symbol {
  binding: Option<BindingId>,
  annotations: Vec<AnnotationId>
}
```

If we did this, we could appropriately model the semantics described
above. When we go to resolve a binding, we ignore annotations (always).
When we try to find unused variables, we look through the list of
symbols, and have sufficient information to discriminate between
annotations and bound variables. Etc.

The main downside of this `Symbol`-based approach is that it's going to
take a lot more work to implement, and it'll be less performant (we'll
be storing more data per symbol, and our binding lookups will have an
added layer of indirection).
2023-06-14 17:54:35 +00:00
..
src Track "delayed" annotations in the semantic model (#5070) 2023-06-14 17:54:35 +00:00
Cargo.toml Use consistent Cargo.toml metadata in all crates (#5015) 2023-06-12 00:02:40 +00:00