ruff/crates/ruff_python_semantic at a33bbe63350791238b3ba1b33dec9467531c4ea1 - mirrors/ruff

mirror of https://github.com/astral-sh/ruff.git synced 2025-10-17 05:47:49 +00:00

History

Charlie Marsh a33bbe6335 Track "delayed" annotations in the semantic model (#5070 ) ## Summary This PR tackles a corner case that we'll need to support local symbol renaming. It relates to a nuance in how we want handle annotations (i.e., `AnnAssign` statements with no value, like `x: int` in a function body). When we see a statement like: ```python x: int ``` We create a `BindingKind::Annotation` for `x`. This is a special `BindingKind` that the resolver isn't allowed to return. For example, given: ```python x: int print(x) ``` The second line will yield an `undefined-name` error. So why does this `BindingKind` exist at all? In Pyflakes, to support the `unused-annotation` lint: ```python def f(): x: int # unused-annotation ``` If we don't track `BindingKind::Annotation`, we can't lint for unused variables that are only "defined" via annotations. There are a few other wrinkles to `BindingKind::Annotation`. One is that, if a binding already exists in the scope, we actually just discard the `BindingKind`. So in this case: ```python x = 1 x: int ``` When we go to create the `BindingKind::Annotation` for the second statement, we notice that (1) we're creating an annotation but (2) the scope already has binding for the name -- so we just drop the binding on the floor. This has the nice property that annotations aren't considered to "shadow" another binding, which is important in a bunch of places (e.g., if we have `import os; os: int`, we still consider `os` to be an import, as we should). But it also means that these "delayed" annotations are one of the few remaining references that we don't track anywhere in the semantic model. This PR adds explicit support for these via a new `delayed_annotations` attribute on the semantic model. These should be extremely rare, but we do need to track them if we want to support local symbol renaming. ### This isn't the right way to model this This isn't the right way to model this. Here's an alternative: - Remove `BindingKind::Annotation`, and treat annotations as their own, separate concept. - Instead of storing a map from name to `BindingId` on each `Scope`, store a map from name to... `SymbolId`. - Introduce a `Symbol` abstraction, where a symbol can point to a current binding, and a list of annotations, like: ```rust pub struct Symbol { binding: Option<BindingId>, annotations: Vec<AnnotationId> } ``` If we did this, we could appropriately model the semantics described above. When we go to resolve a binding, we ignore annotations (always). When we try to find unused variables, we look through the list of symbols, and have sufficient information to discriminate between annotations and bound variables. Etc. The main downside of this `Symbol`-based approach is that it's going to take a lot more work to implement, and it'll be less performant (we'll be storing more data per symbol, and our binding lookups will have an added layer of indirection).	2023-06-14 17:54:35 +00:00
..
src	Track "delayed" annotations in the semantic model (#5070 )	2023-06-14 17:54:35 +00:00
Cargo.toml	Use consistent `Cargo.toml` metadata in all crates (#5015 )	2023-06-12 00:02:40 +00:00

Track "delayed" annotations in the semantic model (#5070 )

## Summary

This PR tackles a corner case that we'll need to support local symbol
renaming. It relates to a nuance in how we want handle annotations
(i.e., `AnnAssign` statements with no value, like `x: int` in a function
body).

When we see a statement like:

```python
x: int
```

We create a `BindingKind::Annotation` for `x`. This is a special
`BindingKind` that the resolver isn't allowed to return. For example,
given:

```python
x: int
print(x)
```

The second line will yield an `undefined-name` error.

So why does this `BindingKind` exist at all? In Pyflakes, to support the
`unused-annotation` lint:

```python
def f():
    x: int  # unused-annotation
```

If we don't track `BindingKind::Annotation`, we can't lint for unused
variables that are only "defined" via annotations.

There are a few other wrinkles to `BindingKind::Annotation`. One is
that, if a binding already exists in the scope, we actually just discard
the `BindingKind`. So in this case:

```python
x = 1
x: int
```

When we go to create the `BindingKind::Annotation` for the second
statement, we notice that (1) we're creating an annotation but (2) the
scope already has binding for the name -- so we just drop the binding on
the floor. This has the nice property that annotations aren't considered
to "shadow" another binding, which is important in a bunch of places
(e.g., if we have `import os; os: int`, we still consider `os` to be an
import, as we should). But it also means that these "delayed"
annotations are one of the few remaining references that we don't track
anywhere in the semantic model.

This PR adds explicit support for these via a new `delayed_annotations`
attribute on the semantic model. These should be extremely rare, but we
do need to track them if we want to support local symbol renaming.

### This isn't the right way to model this

This isn't the right way to model this.

Here's an alternative:

- Remove `BindingKind::Annotation`, and treat annotations as their own,
separate concept.
- Instead of storing a map from name to `BindingId` on each `Scope`,
store a map from name to... `SymbolId`.
- Introduce a `Symbol` abstraction, where a symbol can point to a
current binding, and a list of annotations, like:

```rust
pub struct Symbol {
  binding: Option<BindingId>,
  annotations: Vec<AnnotationId>
}
```

If we did this, we could appropriately model the semantics described
above. When we go to resolve a binding, we ignore annotations (always).
When we try to find unused variables, we look through the list of
symbols, and have sufficient information to discriminate between
annotations and bound variables. Etc.

The main downside of this `Symbol`-based approach is that it's going to
take a lot more work to implement, and it'll be less performant (we'll
be storing more data per symbol, and our binding lookups will have an
added layer of indirection).

2023-06-14 17:54:35 +00:00

src Track "delayed" annotations in the semantic model (#5070 ) 2023-06-14 17:54:35 +00:00

Cargo.toml Use consistent Cargo.toml metadata in all crates (#5015 ) 2023-06-12 00:02:40 +00:00