Commit graph

361 commits

Author SHA1 Message Date
Charlie Marsh
2d505e2b04
Remove suite body tracking from SemanticModel (#5848)
## Summary

The `SemanticModel` currently stores the "body" of a given `Suite`,
along with the current statement index. This is used to support "next
sibling" queries, but we only use this in exactly one place -- the rule
that simplifies constructs like this to `any` or `all`:

```python
for x in y:
    if x == 0:
        return True
return False
```

Instead of tracking the state, we can just do a (slightly more
expensive) traversal, by finding the node within its parent and
returning the next node in the body.

Note that we'll only have to do this extremely rarely -- namely, for
functions that contain something like:

```python
for x in y:
    if x == 0:
        return True
```
2023-07-18 18:58:31 -04:00
konsti
730e6b2b4c
Refactor StmtIf: Formatter and Linter (#5459)
## Summary

Previously, `StmtIf` was defined recursively as
```rust
pub struct StmtIf {
    pub range: TextRange,
    pub test: Box<Expr>,
    pub body: Vec<Stmt>,
    pub orelse: Vec<Stmt>,
}
```
Every `elif` was represented as an `orelse` with a single `StmtIf`. This
means that this representation couldn't differentiate between
```python
if cond1:
    x = 1
else:
    if cond2:
        x = 2
```
and 
```python
if cond1:
    x = 1
elif cond2:
    x = 2
```
It also makes many checks harder than they need to be because we have to
recurse just to iterate over an entire if-elif-else and because we're
lacking nodes and ranges on the `elif` and `else` branches.

We change the representation to a flat

```rust
pub struct StmtIf {
    pub range: TextRange,
    pub test: Box<Expr>,
    pub body: Vec<Stmt>,
    pub elif_else_clauses: Vec<ElifElseClause>,
}

pub struct ElifElseClause {
    pub range: TextRange,
    pub test: Option<Expr>,
    pub body: Vec<Stmt>,
}
```
where `test: Some(_)` represents an `elif` and `test: None` an else.

This representation is different tradeoff, e.g. we need to allocate the
`Vec<ElifElseClause>`, the `elif`s are now different than the `if`s
(which matters in rules where want to check both `if`s and `elif`s) and
the type system doesn't guarantee that the `test: None` else is actually
last. We're also now a bit more inconsistent since all other `else`,
those from `for`, `while` and `try`, still don't have nodes. With the
new representation some things became easier, e.g. finding the `elif`
token (we can use the start of the `ElifElseClause`) and formatting
comments for if-elif-else (no more dangling comments splitting, we only
have to insert the dangling comment after the colon manually and set
`leading_alternate_branch_comments`, everything else is taken of by
having nodes for each branch and the usual placement.rs fixups).

## Merge Plan

This PR requires coordination between the parser repo and the main ruff
repo. I've split the ruff part, into two stacked PRs which have to be
merged together (only the second one fixes all tests), the first for the
formatter to be reviewed by @michareiser and the second for the linter
to be reviewed by @charliermarsh.

* MH: Review and merge
https://github.com/astral-sh/RustPython-Parser/pull/20
* MH: Review and merge or move later in stack
https://github.com/astral-sh/RustPython-Parser/pull/21
* MH: Review and approve
https://github.com/astral-sh/RustPython-Parser/pull/22
* MH: Review and approve formatter PR
https://github.com/astral-sh/ruff/pull/5459
* CM: Review and approve linter PR
https://github.com/astral-sh/ruff/pull/5460
* Merge linter PR in formatter PR, fix ecosystem checks (ecosystem
checks can't run on the formatter PR and won't run on the linter PR, so
we need to merge them first)
 * Merge https://github.com/astral-sh/RustPython-Parser/pull/22
 * Create tag in the parser, update linter+formatter PR
 * Merge linter+formatter PR https://github.com/astral-sh/ruff/pull/5459

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-07-18 13:40:15 +02:00
David Szotten
52aa2fc875
upgrade rustpython to remove tuple-constants (#5840)
c.f. https://github.com/astral-sh/RustPython-Parser/pull/28

Tests: No snapshots changed

---------

Co-authored-by: Zanie <contact@zanie.dev>
2023-07-17 22:50:31 +00:00
Charlie Marsh
be6c744856
Include function name in undocumented-param message (#5818)
Closes #5814.
2023-07-16 22:51:34 -04:00
Charlie Marsh
06b5c6c06f
Use SmallVec#extend_from_slice in lieu of SmallVec#extend (#5793)
## Summary

There's a note in the docs that suggests this can be faster, and in the
benchmarks it... seems like it is? Might just be noise but held up over
a few runs.

Before:

<img width="1792" alt="Screen Shot 2023-07-15 at 9 10 06 PM"
src="973cd955-d4e6-4ae3-898e-90b7eb52ecf2">

After:

<img width="1792" alt="Screen Shot 2023-07-15 at 9 10 09 PM"
src="1491b391-d219-48e9-aa47-110bc7dc7f90">
2023-07-15 21:25:12 -04:00
Charlie Marsh
3dc73395ea
Move Literal flag detection into recurse phase (#5768)
## Summary

The AST pass is broken up into three phases: pre-visit (which includes
analysis), recurse (visit all members), and post-visit (clean-up). We're
not supposed to edit semantic model flags in the pre-visit phase, but it
looks like we were for literal detection. This didn't matter in
practice, but I'm looking into some AST refactors for which this _does_
cause issues.

No behavior changes expected.

## Test Plan

Good test coverage on these.
2023-07-15 02:04:15 +00:00
Charlie Marsh
932c9a4789
Extend PEP 604 rewrites to support some quoted annotations (#5725)
## Summary

Python doesn't allow `"Foo" | None` if the annotation will be evaluated
at runtime (see the comments in the PR, or the semantic model
documentation for more on what this means and when it is true), but it
_does_ allow it if the annotation is typing-only.

This, for example, is invalid, as Python will evaluate `"Foo" | None` at
runtime in order to
populate the function's `__annotations__`:

```python
def f(x: "Foo" | None): ...
```

This, however, is valid:

```python
def f():
    x: "Foo" | None
```

As is this:

```python
from __future__ import annotations

def f(x: "Foo" | None): ...
```

Closes #5706.
2023-07-13 07:34:04 -04:00
Charlie Marsh
bf4b96c5de
Differentiate between runtime and typing-time annotations (#5575)
## Summary

In Python, the annotations on `x` and `y` here have very different
treatment:

```python
def foo(x: int):
  y: int
```

The `int` in `x: int` is a runtime-required annotation, because `x` gets
added to the function's `__annotations__`. You'll notice, for example,
that this fails:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
  from foo import Bar

def f(x: Bar):
  ...
```

Because `Bar` is required to be available at runtime, not just at typing
time. Meanwhile, this succeeds:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
  from foo import Bar

def f():
  x: Bar = 1

f()
```

(Both cases are fine if you use `from __future__ import annotations`.)

Historically, we've tracked those annotations that are _not_
runtime-required via the semantic model's `ANNOTATION` flag. But
annotations that _are_ runtime-required have been treated as "type
definitions" that aren't annotations.

This causes problems for the flake8-future-annotations rules, which try
to detect whether adding `from __future__ import annotations` would
_allow_ you to rewrite a type annotation. We need to know whether we're
in _any_ type annotation, runtime-required or not, since adding `from
__future__ import annotations` will convert any runtime-required
annotation to a typing-only annotation.

This PR adds separate state to track these runtime-required annotations.
The changes in the test fixtures are correct -- these were false
negatives before.

Closes https://github.com/astral-sh/ruff/issues/5574.
2023-07-07 00:21:44 -04:00
Charlie Marsh
9e1039f823
Enable attribute lookups via semantic model (#5536)
## Summary

This PR enables us to resolve attribute accesses within files, at least
for static and class methods. For example, we can now detect that this
is a function access (and avoid a false-positive):

```python
class Class:
    @staticmethod
    def error():
        return ValueError("Something")


# OK
raise Class.error()
```

Closes #5487.

Closes #5416.
2023-07-05 15:19:14 -04:00
Charlie Marsh
00fbbe4223
Remove some additional manual iterator matches (#5482)
## Summary

I've done a few of these PRs, I thought I'd caught them all, but missed
this pattern.
2023-07-03 16:29:59 +00:00
Anders Kaseorg
df13e69c3c
Format let-else with rustfmt nightly (#5461)
Support for `let…else` formatting was just merged to nightly
(rust-lang/rust#113225). Rerun `cargo fmt` with Rust nightly 2023-07-02
to pick this up. Followup to #939.

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2023-07-03 02:13:35 +00:00
Charlie Marsh
0e89c94947
Run shadowed-variable analyses in deferred handlers (#5181)
## Summary

This PR extracts a bunch of complex logic from `add_binding`, instead
running the the shadowing rules in the deferred handler, thereby
decoupling the binding phase (during which we build up the semantic
model) from the analysis phase, and generally making `add_binding` much
more focused.

This was made possible by improving the semantic model to better handle
deletions -- previously, we'd "lose track" of bindings if they were
deleted, which made this kind of refactor impossible.

## Test Plan

We have good automated coverage for this, but I want to benchmark it
separately.
2023-06-29 00:08:18 +00:00
Charlie Marsh
cdbd0bd5cd
Respect abc decorators when classifying function types (#5315)
Closes #5307.
2023-06-22 19:52:36 +00:00
Charlie Marsh
ecf61d49fa
Restore existing bindings when unbinding caught exceptions (#5256)
## Summary

In the latest release, we made some improvements to the semantic model,
but our modifications to exception-unbinding are causing some
false-positives. For example:

```py
try:
    v = 3
except ImportError as v:
    print(v)
else:
    print(v)
```

In the latest release, we started unbinding `v` after the `except`
handler. (We used to restore the existing binding, the `v = 3`, but this
was quite complicated.) Because we don't have full branch analysis, we
can't then know that `v` is still bound in the `else` branch.

The solution here modifies `resolve_read` to skip-lookup when hitting
unbound exceptions. So when store the "unbind" for `except ImportError
as v`, we save the binding that it shadowed `v = 3`, and skip to that.

Closes #5249.

Closes #5250.
2023-06-21 12:53:58 -04:00
Charlie Marsh
310abc769d
Move StarImport to its own module (#5186) 2023-06-20 13:12:46 -04:00
Charlie Marsh
36e01ad6eb
Upgrade RustPython (#5192)
## Summary

This PR upgrade RustPython to pull in the changes to `Arguments` (zip
defaults with their identifiers) and all the renames to `CmpOp` and
friends.
2023-06-19 21:09:53 +00:00
Charlie Marsh
94abf7f088
Rename *Importation structs to *Import (#5185)
## Summary

I find "Importation" a bit awkward, it may not even be grammatically
correct here.
2023-06-19 12:09:10 -04:00
Charlie Marsh
a6cf31cc89
Move dead_scopes to deferred.scopes (#5171)
## Summary

This is more consistent with the rest of the `deferred` patterns.
2023-06-18 15:57:38 +00:00
Charlie Marsh
d0ad1ed0af
Replace static CallPath vectors with matches! macros (#5148)
## Summary

After #5140, I audited the codebase for similar patterns (defining a
list of `CallPath` entities in a static vector, then looping over them
to pattern-match). This PR migrates all other such cases to use `match`
and `matches!` where possible.

There are a few benefits to this:

1. It more clearly denotes the intended semantics (branches are
exclusive).
2. The compiler can help deduplicate the patterns and detect unreachable
branches.
3. Performance: in the benchmark below, the all-rules performance is
increased by nearly 10%...

## Benchmarks

I decided to benchmark against a large file in the Airflow repository
with a lot of type annotations
([`views.py`](https://raw.githubusercontent.com/apache/airflow/f03f73100e8a7d6019249889de567cb00e71e457/airflow/www/views.py)):

```
linter/default-rules/airflow/views.py
                        time:   [10.871 ms 10.882 ms 10.894 ms]
                        thrpt:  [19.739 MiB/s 19.761 MiB/s 19.781 MiB/s]
                 change:
                        time:   [-2.7182% -2.5687% -2.4204%] (p = 0.00 < 0.05)
                        thrpt:  [+2.4805% +2.6364% +2.7942%]
                        Performance has improved.

linter/all-rules/airflow/views.py
                        time:   [24.021 ms 24.038 ms 24.062 ms]
                        thrpt:  [8.9373 MiB/s 8.9461 MiB/s 8.9527 MiB/s]
                 change:
                        time:   [-8.9537% -8.8516% -8.7527%] (p = 0.00 < 0.05)
                        thrpt:  [+9.5923% +9.7112% +9.8342%]
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe
```

The impact is dramatic -- nearly a 10% improvement for `all-rules`.
2023-06-16 17:34:42 +00:00
Charlie Marsh
b3240dbfa2
Avoid propagating BindingKind::Global and BindingKind::Nonlocal (#5136)
## Summary

This PR fixes a small quirk in the semantic model. Typically, when we
see an import, like `import foo`, we create a `BindingKind::Importation`
for it. However, if `foo` has been declared as a `global`, then we
propagate the kind forward. So given:

```python
global foo

import foo
```

We'd create two bindings for `foo`, both with type `global`.

This was originally borrowed from Pyflakes, and it exists to help avoid
false-positives like:

```python
def f():
    global foo

    # Don't mark `foo` as "assigned but unused"! It's a global!
    foo = 1
```

This PR removes that behavior, and instead tracks "Does this binding
refer to a global?" as a flag. This is much cleaner, since it means we
don't "lose" the identity of various bindings.

As a very strange example of why this matters, consider:

```python
def foo():
    global Member

    from module import Member

    x: Member = 1
```

`Member` is only used in a typing context, so we should flag it and say
"move it to a `TYPE_CHECKING` block". However, when we go to analyze
`from module import Member`, it has `BindingKind::Global`. So we don't
even know that it's an import!
2023-06-16 11:06:59 -04:00
Charlie Marsh
fd1dfc3bfa
Add support for global and nonlocal symbol renames (#5134)
## Summary

In #5074, we introduced an abstraction to support local symbol renames
("local" here refers to "within a module"). However, that abstraction
didn't support `global` and `nonlocal` symbols. This PR extends it to
those cases.

Broadly, there are considerations.

First, if we're renaming a symbol in a scope in which it is declared
`global` or `nonlocal`. For example, given:

```python
x = 1

def foo():
    global x
```

Then when renaming `x` in `foo`, we need to detect that it's `global`
and instead perform the rename starting from the module scope.

Second, when renaming a symbol, we need to determine the scopes in which
it is declared `global` or `nonlocal`. This is effectively the inverse
of the above: when renaming `x` in the module scope, we need to detect
that we should _also_ rename `x` in `foo`.

To support these cases, the renaming algorithm was adjusted as follows:

- When we start a rename in a scope, determine whether the symbol is
declared `global` or `nonlocal` by looking for a `global` or `nonlocal`
binding. If it is, start the rename in the defining scope. (This
requires storing the defining scope on the `nonlocal` binding, which is
new.)
- We then perform the rename in the defining scope.
- We then check whether the symbol was declared as `global` or
`nonlocal` in any scopes, and perform the rename in those scopes too.
(Thankfully, this doesn't need to be done recursively.)

Closes #5092.

## Test Plan

Added some additional snapshot tests.
2023-06-16 14:35:10 +00:00
Charlie Marsh
b9754bd5c5
Add autofix for Set-to-AbstractSet rewrite using reference tracking (#5074)
## Summary

This PR enables autofix behavior for the `flake8-pyi` rule that asks you
to alias `Set` to `AbstractSet` when importing `collections.abc.Set`.
It's not the most important rule, but it's a good isolated test-case for
local symbol renaming.

The renaming algorithm is outlined in-detail in the `renamer.rs` module.
But to demonstrate the behavior, here's the diff when running this fix
over a complex file that exercises a few edge cases:

```diff
--- a/foo.pyi
+++ b/foo.pyi
@@ -1,16 +1,16 @@
 if True:
-    from collections.abc import Set
+    from collections.abc import Set as AbstractSet
 else:
-    Set = 1
+    AbstractSet = 1

-x: Set = set()
+x: AbstractSet = set()

-x: Set
+x: AbstractSet

-del Set
+del AbstractSet

 def f():
-    print(Set)
+    print(AbstractSet)

     def Set():
         pass
```

Making this work required resolving a bunch of edge cases in the
semantic model that were causing us to "lose track" of references. For
example, the above wasn't possible with our previous approach to
handling deletions (#5071). Similarly, the `x: Set` "delayed annotation"
tracking was enabled via #5070. And many of these edits would've failed
if we hadn't changed `BindingKind` to always match the identifier range
(#5090). So it's really the culmination of a bunch of changes over the
course of the week.

The main outstanding TODO is that this doesn't support `global` or
`nonlocal` usages. I'm going to take a look at that tonight, but I'm
comfortable merging this as-is.

Closes #1106.

Closes #5091.
2023-06-16 14:12:33 +00:00
Charlie Marsh
5526699535
Use const-singleton helpers in more rules (#5142) 2023-06-16 04:28:35 +00:00
Charlie Marsh
5ea3e42513
Always use identifier ranges to store bindings (#5110)
## Summary

At present, when we store a binding, we include a `TextRange` alongside
it. The `TextRange` _sometimes_ matches the exact range of the
identifier to which the `Binding` is linked, but... not always.

For example, given:

```python
x = 1
```

The binding we create _will_ use the range of `x`, because the left-hand
side is an `Expr::Name`, which has a valid range on it.

However, given:

```python
try:
  pass
except ValueError as e:
  pass
```

When we create a binding for `e`, we don't have a `TextRange`... The AST
doesn't give us one. So we end up extracting it via lexing.

This PR extends that pattern to the rest of the binding kinds, to ensure
that whenever we create a binding, we always use the range of the bound
name. This leads to better diagnostics in cases like pattern matching,
whereby the diagnostic for "unused variable `x`" here used to include
`*x`, instead of just `x`:

```python
def f(provided: int) -> int:
    match provided:
        case [_, *x]:
            pass
```

This is _also_ required for symbol renames, since we track writes as
bindings -- so we need to know the ranges of the bound symbols.

By storing these bindings precisely, we can also remove the
`binding.trimmed_range` abstraction -- since bindings already use the
"trimmed range".

To implement this behavior, I took some of our existing utilities (like
the code we had for `except ValueError as e` above), migrated them from
a full lexer to a zero-allocation lexer that _only_ identifies
"identifiers", and moved the behavior into a trait, so we can now do
`stmt.identifier(locator)` to get the range for the identifier.

Honestly, we might end up discarding much of this if we decide to put
ranges on all identifiers
(https://github.com/astral-sh/RustPython-Parser/pull/8). But even if we
do, this will _still_ be a good change, because the lexer introduced
here is useful beyond names (e.g., we use it find the `except` keyword
in an exception handler, to find the `else` after a `for` loop, and so
on). So, I'm fine committing this even if we end up changing our minds
about the right approach.

Closes #5090.

## Benchmarks

No significant change, with one statistically significant improvement
(-2.1654% on `linter/all-rules/large/dataset.py`):

```
linter/default-rules/numpy/globals.py
                        time:   [73.922 µs 73.955 µs 73.986 µs]
                        thrpt:  [39.882 MiB/s 39.898 MiB/s 39.916 MiB/s]
                 change:
                        time:   [-0.5579% -0.4732% -0.3980%] (p = 0.00 < 0.05)
                        thrpt:  [+0.3996% +0.4755% +0.5611%]
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
linter/default-rules/pydantic/types.py
                        time:   [1.4909 ms 1.4917 ms 1.4926 ms]
                        thrpt:  [17.087 MiB/s 17.096 MiB/s 17.106 MiB/s]
                 change:
                        time:   [+0.2140% +0.2741% +0.3392%] (p = 0.00 < 0.05)
                        thrpt:  [-0.3380% -0.2734% -0.2136%]
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
linter/default-rules/numpy/ctypeslib.py
                        time:   [688.97 µs 691.34 µs 694.15 µs]
                        thrpt:  [23.988 MiB/s 24.085 MiB/s 24.168 MiB/s]
                 change:
                        time:   [-1.3282% -0.7298% -0.1466%] (p = 0.02 < 0.05)
                        thrpt:  [+0.1468% +0.7351% +1.3461%]
                        Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  12 (12.00%) high severe
linter/default-rules/large/dataset.py
                        time:   [3.3872 ms 3.4032 ms 3.4191 ms]
                        thrpt:  [11.899 MiB/s 11.954 MiB/s 12.011 MiB/s]
                 change:
                        time:   [-0.6427% -0.2635% +0.0906%] (p = 0.17 > 0.05)
                        thrpt:  [-0.0905% +0.2642% +0.6469%]
                        No change in performance detected.
Found 20 outliers among 100 measurements (20.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  4 (4.00%) high mild
  13 (13.00%) high severe

linter/all-rules/numpy/globals.py
                        time:   [148.99 µs 149.21 µs 149.42 µs]
                        thrpt:  [19.748 MiB/s 19.776 MiB/s 19.805 MiB/s]
                 change:
                        time:   [-0.7340% -0.5068% -0.2778%] (p = 0.00 < 0.05)
                        thrpt:  [+0.2785% +0.5094% +0.7395%]
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high severe
linter/all-rules/pydantic/types.py
                        time:   [3.0362 ms 3.0396 ms 3.0441 ms]
                        thrpt:  [8.3779 MiB/s 8.3903 MiB/s 8.3997 MiB/s]
                 change:
                        time:   [-0.0957% +0.0618% +0.2125%] (p = 0.45 > 0.05)
                        thrpt:  [-0.2121% -0.0618% +0.0958%]
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low severe
  3 (3.00%) low mild
  5 (5.00%) high mild
  2 (2.00%) high severe
linter/all-rules/numpy/ctypeslib.py
                        time:   [1.6879 ms 1.6894 ms 1.6909 ms]
                        thrpt:  [9.8478 MiB/s 9.8562 MiB/s 9.8652 MiB/s]
                 change:
                        time:   [-0.2279% -0.0888% +0.0436%] (p = 0.18 > 0.05)
                        thrpt:  [-0.0435% +0.0889% +0.2284%]
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) low mild
  1 (1.00%) high severe
linter/all-rules/large/dataset.py
                        time:   [7.1520 ms 7.1586 ms 7.1654 ms]
                        thrpt:  [5.6777 MiB/s 5.6831 MiB/s 5.6883 MiB/s]
                 change:
                        time:   [-2.5626% -2.1654% -1.7780%] (p = 0.00 < 0.05)
                        thrpt:  [+1.8102% +2.2133% +2.6300%]
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
```
2023-06-15 18:43:19 +00:00
Charlie Marsh
716cab2f19
Run rustfmt on nightly to clean up erroneous comments (#5106)
## Summary

This PR runs `rustfmt` with a few nightly options as a one-time fix to
catch some malformatted comments. I ended up just running with:

```toml
condense_wildcard_suffixes = true
edition = "2021"
max_width = 100
normalize_comments = true
normalize_doc_attributes = true
reorder_impl_items = true
unstable_features = true
use_field_init_shorthand = true
```

Since these all seem like reasonable things to fix, so may as well while
I'm here.
2023-06-15 00:19:05 +00:00
Charlie Marsh
56476dfd61
Use matches! for CallPath comparisons (#5099)
## Summary

This PR consistently uses `matches! for static `CallPath` comparisons.
In some cases, we can significantly reduce the number of cases or
checks.

## Test Plan

`cargo test `
2023-06-14 17:06:34 -04:00
Charlie Marsh
bae183b823
Rename semantic_model and model usages to semantic (#5097)
## Summary

As discussed in Discord, and similar to oxc, we're going to refer to
this as `.semantic()` everywhere.

While I was auditing usages of `model: &SemanticModel`, I also changed
as many function signatures as I could find to consistently take the
model as the _last_ argument, rather than the first.
2023-06-14 15:01:51 -04:00
Charlie Marsh
65dbfd2556
Improve names and documentation on scope API (#5095)
## Summary

Just minor improvements to improve consistency of method names and
availability.
2023-06-14 18:28:55 +00:00
Charlie Marsh
86ff1febea
Re-export ruff_python_semantic members (#5094)
## Summary

This PR adds a more unified public API to `ruff_python_semantic`, so
that we don't need to do deeply nested imports all over the place.
2023-06-14 18:23:38 +00:00
Charlie Marsh
a33bbe6335
Track "delayed" annotations in the semantic model (#5070)
## Summary

This PR tackles a corner case that we'll need to support local symbol
renaming. It relates to a nuance in how we want handle annotations
(i.e., `AnnAssign` statements with no value, like `x: int` in a function
body).

When we see a statement like:

```python
x: int
```

We create a `BindingKind::Annotation` for `x`. This is a special
`BindingKind` that the resolver isn't allowed to return. For example,
given:

```python
x: int
print(x)
```

The second line will yield an `undefined-name` error.

So why does this `BindingKind` exist at all? In Pyflakes, to support the
`unused-annotation` lint:

```python
def f():
    x: int  # unused-annotation
```

If we don't track `BindingKind::Annotation`, we can't lint for unused
variables that are only "defined" via annotations.

There are a few other wrinkles to `BindingKind::Annotation`. One is
that, if a binding already exists in the scope, we actually just discard
the `BindingKind`. So in this case:

```python
x = 1
x: int
```

When we go to create the `BindingKind::Annotation` for the second
statement, we notice that (1) we're creating an annotation but (2) the
scope already has binding for the name -- so we just drop the binding on
the floor. This has the nice property that annotations aren't considered
to "shadow" another binding, which is important in a bunch of places
(e.g., if we have `import os; os: int`, we still consider `os` to be an
import, as we should). But it also means that these "delayed"
annotations are one of the few remaining references that we don't track
anywhere in the semantic model.

This PR adds explicit support for these via a new `delayed_annotations`
attribute on the semantic model. These should be extremely rare, but we
do need to track them if we want to support local symbol renaming.

### This isn't the right way to model this

This isn't the right way to model this.

Here's an alternative:

- Remove `BindingKind::Annotation`, and treat annotations as their own,
separate concept.
- Instead of storing a map from name to `BindingId` on each `Scope`,
store a map from name to... `SymbolId`.
- Introduce a `Symbol` abstraction, where a symbol can point to a
current binding, and a list of annotations, like:

```rust
pub struct Symbol {
  binding: Option<BindingId>,
  annotations: Vec<AnnotationId>
}
```

If we did this, we could appropriately model the semantics described
above. When we go to resolve a binding, we ignore annotations (always).
When we try to find unused variables, we look through the list of
symbols, and have sufficient information to discriminate between
annotations and bound variables. Etc.

The main downside of this `Symbol`-based approach is that it's going to
take a lot more work to implement, and it'll be less performant (we'll
be storing more data per symbol, and our binding lookups will have an
added layer of indirection).
2023-06-14 17:54:35 +00:00
Charlie Marsh
c992cfa76e
Make some of ruff_python_semantic pub(crate) (#5093) 2023-06-14 17:49:37 +00:00
Charlie Marsh
6f10aeebaa
Remove unused Scope#delete method (#5085)
## Summary

This is now intentionally unused and is now made impossible (via this
PR).
2023-06-14 14:15:14 +00:00
Charlie Marsh
c74ef77e85
Move binding accesses into SemanticModel method (#5084) 2023-06-14 14:07:46 +00:00
Charlie Marsh
1e497162d1
Add a dedicated read result for unbound locals (#5083)
## Summary

Small follow-up to #4888 to add a dedicated `ResolvedRead` case for
unbound locals, mostly for clarity and documentation purposes (no
behavior changes).

## Test Plan

`cargo test`
2023-06-14 09:58:48 -04:00
Charlie Marsh
aa41ffcfde
Add BindingKind variants to represent deleted bindings (#5071)
## Summary

Our current mechanism for handling deletions (e.g., `del x`) is to
remove the symbol from the scope's `bindings` table. This "does the
right thing", in that if we then reference a deleted symbol, we're able
to determine that it's unbound -- but it causes a variety of problems,
mostly in that it makes certain bindings and references unreachable
after-the-fact.

Consider:

```python
x = 1
print(x)
del x
```

If we analyze this code _after_ running the semantic model over the AST,
we'll have no way of knowing that `x` was ever introduced in the scope,
much less that it was bound to a value, read, and then deleted --
because we effectively erased `x` from the model entirely when we hit
the deletion.

In practice, this will make it impossible for us to support local symbol
renames. It also means that certain rules that we want to move out of
the model-building phase and into the "check dead scopes" phase wouldn't
work today, since we'll have lost important information about the source
code.

This PR introduces two new `BindingKind` variants to model deletions:

- `BindingKind::Deletion`, which represents `x = 1; del x`.
- `BindingKind::UnboundException`, which represents:

```python
try:
  1 / 0
except Exception as e:
  pass
```

In the latter case, `e` gets unbound after the exception handler
(assuming it's triggered), so we want to handle it similarly to a
deletion.

The main challenge here is auditing all of our existing `Binding` and
`Scope` usages to understand whether they need to accommodate deletions
or otherwise behave differently. If you look one commit back on this
branch, you'll see that the code is littered with `NOTE(charlie)`
comments that describe the reasoning behind changing (or not) each of
those call sites. I've also augmented our test suite in preparation for
this change over a few prior PRs.

### Alternatives

As an alternative, I considered introducing a flag to `BindingFlags`,
like `BindingFlags::UNBOUND`, and setting that at the appropriate time.

This turned out to be a much more difficult change, because we tend to
match on `BindingKind` all over the place (e.g., we have a bunch of code
blocks that only run when a `BindingKind` is
`BindingKind::Importation`). As a result, introducing these new
`BindingKind` variants requires only a few changes at the client sites.
Adding a flag would've required a much wider-reaching change.
2023-06-14 09:27:24 -04:00
Charlie Marsh
1895011ac2
Document some attributes on the semantic model (#5064) 2023-06-13 20:45:24 +00:00
Charlie Marsh
364bd82aee
Don't treat annotations as resolved in forward references (#5060)
## Summary

This behavior dates back to a Pyflakes commit (5fc37cbd), which was used
to allow this test to pass:

```py
from __future__ import annotations
T: object
def f(t: T): pass
def g(t: 'T'): pass
```

But, I think this is an error. Mypy and Pyright don't accept it -- you
can only use variables as type annotations if they're type aliases
(i.e., annotated with `TypeAlias`), in which case, there has to be an
assignment on the right-hand side (see: [PEP
613](https://peps.python.org/pep-0613/)).
2023-06-13 14:47:29 -04:00
Charlie Marsh
19f972a305
Use Scope#has in lieu of Scope#get (#5051)
## Summary

These usages don't actually need the `BindingId`.
2023-06-13 15:59:53 +00:00
Charlie Marsh
68b6d30c46
Use consistent Cargo.toml metadata in all crates (#5015) 2023-06-12 00:02:40 +00:00
Charlie Marsh
e86f12a1ec
Rename some methods on SemanticModel (#4990) 2023-06-09 19:36:59 +00:00
Charlie Marsh
5c502a3320
Add documentation for BindingKind variants (#4989) 2023-06-09 18:32:50 +00:00
Davide Canton
63fdcea29e
Handled dict and set inside f-string (#4249) (#4563) 2023-06-09 04:53:13 +00:00
Micha Reiser
39a1f3980f
Upgrade RustPython (#4900) 2023-06-08 05:53:14 +00:00
Charlie Marsh
ae75b303f0
Avoid attributing runtime references to module-level imports (#4942) 2023-06-07 21:56:03 +00:00
Charlie Marsh
780d153ae8
Replace one-off locals property with ScopeFlags (#4912) 2023-06-06 21:22:21 -04:00
Charlie Marsh
8c048b463c
Track symbol deletions separately from bindings (#4888) 2023-06-06 18:49:36 +00:00
Charlie Marsh
7b0fb1a3b4
Respect noqa directives on ImportFrom parents for type-checking rules (#4889) 2023-06-06 02:37:07 +00:00
Charlie Marsh
8938b2d555
Use qualified_name terminology in more structs for consistency (#4873) 2023-06-05 19:06:48 +00:00
Charlie Marsh
d31eb87877
Extract shared simple AST node inference utility (#4871) 2023-06-05 18:23:37 +00:00
Charlie Marsh
a0721912a4
Invert structure of Scope#shadowed_bindings (#4855) 2023-06-05 02:03:21 +00:00