Commit graph

414 commits

Author SHA1 Message Date
Charlie Marsh
aa41ffcfde
Add BindingKind variants to represent deleted bindings (#5071)
## Summary

Our current mechanism for handling deletions (e.g., `del x`) is to
remove the symbol from the scope's `bindings` table. This "does the
right thing", in that if we then reference a deleted symbol, we're able
to determine that it's unbound -- but it causes a variety of problems,
mostly in that it makes certain bindings and references unreachable
after-the-fact.

Consider:

```python
x = 1
print(x)
del x
```

If we analyze this code _after_ running the semantic model over the AST,
we'll have no way of knowing that `x` was ever introduced in the scope,
much less that it was bound to a value, read, and then deleted --
because we effectively erased `x` from the model entirely when we hit
the deletion.

In practice, this will make it impossible for us to support local symbol
renames. It also means that certain rules that we want to move out of
the model-building phase and into the "check dead scopes" phase wouldn't
work today, since we'll have lost important information about the source
code.

This PR introduces two new `BindingKind` variants to model deletions:

- `BindingKind::Deletion`, which represents `x = 1; del x`.
- `BindingKind::UnboundException`, which represents:

```python
try:
  1 / 0
except Exception as e:
  pass
```

In the latter case, `e` gets unbound after the exception handler
(assuming it's triggered), so we want to handle it similarly to a
deletion.

The main challenge here is auditing all of our existing `Binding` and
`Scope` usages to understand whether they need to accommodate deletions
or otherwise behave differently. If you look one commit back on this
branch, you'll see that the code is littered with `NOTE(charlie)`
comments that describe the reasoning behind changing (or not) each of
those call sites. I've also augmented our test suite in preparation for
this change over a few prior PRs.

### Alternatives

As an alternative, I considered introducing a flag to `BindingFlags`,
like `BindingFlags::UNBOUND`, and setting that at the appropriate time.

This turned out to be a much more difficult change, because we tend to
match on `BindingKind` all over the place (e.g., we have a bunch of code
blocks that only run when a `BindingKind` is
`BindingKind::Importation`). As a result, introducing these new
`BindingKind` variants requires only a few changes at the client sites.
Adding a flag would've required a much wider-reaching change.
2023-06-14 09:27:24 -04:00
Charlie Marsh
fc6580592d
Use Expr::is_* methods at more call sites (#5075) 2023-06-14 04:02:39 +00:00
Charlie Marsh
3f6584b74f
Fix erroneous kwarg reference (#5068) 2023-06-14 00:01:52 +00:00
Charlie Marsh
c2fa568b46
Use dedicated structs for excepthandler variants (#5065)
## Summary

Oversight from #5042.
2023-06-13 22:37:06 +00:00
Charlie Marsh
cc44349401
Use dedicated structs in comparable.rs (#5042)
## Summary

Updating to match the updated AST structure, for consistency.
2023-06-13 03:57:34 +00:00
Charlie Marsh
780336db0a
Include f-string prefixes in quote-stripping utilities (#5039)
Mentioned here:
https://github.com/astral-sh/ruff/pull/4853#discussion_r1217560348.

Generated with this hacky script:
https://gist.github.com/charliermarsh/8ecc4e55bc87d51dc27340402f33b348.
2023-06-12 18:25:47 -04:00
Charlie Marsh
7e37d8916c
Remove lexer dependency from identifier_range (#5036)
## Summary

We run this quite a bit -- the new version is zero-allocation, though
it's not quite as nice as the lexer we have in the formatter.
2023-06-12 22:06:03 +00:00
Charlie Marsh
ab11dd08df
Improve TypedDict conversion logic for shadowed builtins and dunder methods (#5038)
## Summary

This PR (1) avoids flagging `TypedDict` and `NamedTuple` conversions
when attributes are dunder methods, like `__dict__`, and (2) avoids
flagging the `A003` shadowed-attribute rule for `TypedDict` classes at
all, where it doesn't really apply (since those attributes are only
accessed via subscripting anyway).

Closes #5027.
2023-06-12 21:23:39 +00:00
Addison Crump
70e6c212d9
Improve ruff_parse_simple to find UTF-8 violations (#5008)
Improves the `ruff_parse_simple` fuzz harness by adding checks for
parsed locations to ensure they all lie on UTF-8 character boundaries.
This will allow for faster identification of issues like #5004.

This also adds additional details for Apple M1 users and clarifies the
importance of using `init-fuzzer.sh` (thanks for the feedback,
@jasikpark 🙂).
2023-06-12 12:10:23 -04:00
Charlie Marsh
445e1723ab
Use Stmt::parse in lieu of Suite unwraps (#5002) 2023-06-10 04:55:31 +00:00
Charlie Marsh
2d597bc1fb
Parenthesize expressions prior to lexing in F632 (#5001) 2023-06-10 04:23:43 +00:00
Charlie Marsh
02b8ce82af
Refactor RET504 to only enforce assignment-then-return pattern (#4997)
## Summary

The `RET504` rule, which looks for unnecessary assignments before return
statements, is a frequent source of issues (#4173, #4236, #4242, #1606,
#2950). Over time, we've tried to refine the logic to handle more cases.
For example, we now avoid analyzing any functions that contain any
function calls or attribute assignments, since those operations can
contain side effects (and so we mark them as a "read" on all variables
in the function -- we could do a better job with code graph analysis to
handle this limitation, but that'd be a more involved change.) We also
avoid flagging any variables that are the target of multiple
assignments. Ultimately, though, I'm not happy with the implementation
-- we just can't do sufficiently reliable analysis of arbitrary code
flow given the limited logic herein, and the existing logic is very hard
to reason about and maintain.

This PR refocuses the rule to only catch cases of the form:

```py
def f():
    x = 1
    return x
```

That is, we now only flag returns that are immediately preceded by an
assignment to the returned variable. While this is more limiting, in
some ways, it lets us flag more cases vis-a-vis the previous
implementation, since we no longer "fully eject" when functions contain
function calls and other effect-ful operations.

Closes #4173.

Closes #4236.

Closes #4242.
2023-06-10 00:05:01 -04:00
Charlie Marsh
f401050878
Introduce PythonWhitespace to confine trim operations to Python whitespace (#4994)
## Summary

We use `.trim()` and friends in a bunch of places, to strip whitespace
from source code. However, not all Unicode whitespace characters are
considered "whitespace" in Python, which only supports the standard
space, tab, and form-feed characters.

This PR audits our usages of `.trim()`, `.trim_start()`, `.trim_end()`,
and `char::is_whitespace`, and replaces them as appropriate with a new
`.trim_whitespace()` analogues, powered by a `PythonWhitespace` trait.

In general, the only place that should continue to use `.trim()` is
content within docstrings, which don't need to adhere to Python's
semantic definitions of whitespace.

Closes #4991.
2023-06-09 21:44:50 -04:00
Charlie Marsh
1d756dc3a7
Move Python whitespace utilities into new ruff_python_whitespace crate (#4993)
## Summary

`ruff_newlines` becomes `ruff_python_whitespace`, and includes the
existing "universal newline" handlers alongside the Python
whitespace-specific utilities.
2023-06-10 00:59:57 +00:00
Charlie Marsh
16d1e63a5e
Respect 'is not' operators split across newlines (#4977) 2023-06-09 05:07:45 +00:00
Charlie Marsh
58d08219e8
Allow re-assignments to __all__ (#4967) 2023-06-08 17:19:56 +00:00
Micha Reiser
68969240c5
Format Function definitions (#4951) 2023-06-08 16:07:33 +00:00
Micha Reiser
39a1f3980f
Upgrade RustPython (#4900) 2023-06-08 05:53:14 +00:00
Micha Reiser
19abee086b
Introduce AnyFunctionDefinition Node (#4898) 2023-06-06 20:37:46 +02:00
Charlie Marsh
d1b8fe6af2
Fix round-tripping of nested functions (#4875) 2023-06-05 16:13:08 -04:00
Charlie Marsh
8938b2d555
Use qualified_name terminology in more structs for consistency (#4873) 2023-06-05 19:06:48 +00:00
Ryan Yang
72245960a1
implement E307 for pylint invalid str return type (#4854) 2023-06-05 17:54:15 +00:00
Micha Reiser
c89d2f835e
Add to AnyNode and AnyNodeRef conversion methods to AstNode (#4783) 2023-06-02 08:10:41 +02:00
qdegraaf
fcbf5c3fae
Add PYI034 for flake8-pyi plugin (#4764) 2023-06-02 02:15:57 +00:00
Charlie Marsh
ab26f2dc9d
Use saturating_sub in more token-walking methods (#4773) 2023-06-01 17:16:32 -04:00
Micha Reiser
be31d71849
Correctly associate own-line comments in bodies (#4671) 2023-06-01 08:12:53 +02:00
Charlie Marsh
9d0ffd33ca
Move universal newline handling into its own crate (#4729) 2023-05-31 12:00:47 -04:00
Micha Reiser
6c1ff6a85f
Upgrade RustPython (#4747) 2023-05-31 08:26:35 +00:00
Charlie Marsh
f47a517e79
Enable callers to specify import-style preferences in Importer (#4717) 2023-05-30 16:46:19 +00:00
Micha Reiser
0cd453bdf0
Generic "comment to node" association logic (#4642) 2023-05-30 09:28:01 +00:00
Charlie Marsh
9741f788c7
Remove globals table from Scope (#4686) 2023-05-27 22:35:20 -04:00
Micha Reiser
33a7ed058f
Create PreorderVisitor trait (#4658) 2023-05-26 06:14:08 +00:00
Charlie Marsh
0f610f2cf7
Remove dedicated ScopeKind structs in favor of AST nodes (#4648) 2023-05-25 19:31:02 +00:00
Micha Reiser
85f094f592
Improve Message sorting performance (#4624) 2023-05-24 16:34:48 +02:00
Micha Reiser
2681c0e633
Add missing nodes to AnyNodeRef and AnyNode (#4608) 2023-05-23 18:30:27 +02:00
Micha Reiser
154439728a
Add AnyNode and AnyNodeRef unions (#4578) 2023-05-23 08:53:22 +02:00
Micha Reiser
daadd24bde
Include decorators in Function and Class definition ranges (#4467) 2023-05-22 17:50:42 +02:00
Charlie Marsh
19c4b7bee6
Rename ruff_python_semantic's Context struct to SemanticModel (#4565) 2023-05-22 02:35:03 +00:00
Ville Skyttä
2e2ba2cb16
Avoid some false positives in dunder variable assigments (#4508) 2023-05-19 02:11:20 +00:00
Charlie Marsh
e9c6f16c56
Move unparse utility methods onto Generator (#4497) 2023-05-18 15:00:46 +00:00
Charlie Marsh
d3b18345c5
Move triple-quoted string detection into Indexer method (#4495) 2023-05-18 14:42:05 +00:00
Charlie Marsh
73efbeb581
Invert quote-style when generating code within f-strings (#4487) 2023-05-18 14:33:33 +00:00
Charlie Marsh
e8e66f3824
Remove unnecessary path prefixes (#4492) 2023-05-18 10:19:09 -04:00
Charlie Marsh
14c6419bc1
Bring pycodestyle rules into full compatibility (on SciPy) (#4472) 2023-05-17 16:51:55 +00:00
Charlie Marsh
d9c3f8e249
Avoid flagging missing whitespace for decorators (#4454) 2023-05-16 13:15:01 -04:00
Charlie Marsh
7e0d018b35
Avoid emitting empty logical lines (#4452) 2023-05-16 16:33:33 +00:00
Jeong, YunWon
4b05ca1198
Specialize ConversionFlag (#4450) 2023-05-16 18:00:13 +02:00
Charlie Marsh
f0465bf106
Emit non-logical newlines for "empty" lines (#4444) 2023-05-16 14:58:56 +00:00
Jeong, YunWon
badade3ccc
Impl Default for SourceLocation (#4328)
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-05-16 07:03:43 +00:00
Micha Reiser
fa26860296
Refactor range from Attributed to Nodes (#4422) 2023-05-16 06:36:32 +00:00