Commit graph

1995 commits

Author SHA1 Message Date
Charlie Marsh
30734f06fd
Support parenthesized expressions when splitting compound assertions (#5219)
## Summary

I'm looking into the Black stability tests, and here's one failing case.

We split `assert a and (b and c)` into:

```python
assert a
assert (b and c)
```

We fail to split `assert (b and c)` due to the parentheses. But Black
then removes them, and when running Ruff again, we get:

```python
assert a
assert b
assert c
```

This PR just enables us to fix this in one pass.
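
As a rough illustration of the one-pass behavior (this is not Ruff's Rust implementation), the sketch below uses Python's standard `ast` module, which keeps no trace of redundant parentheses, to flatten nested `and` operands recursively. The helper name `split_assert` is hypothetical, and the sketch ignores any assertion message.

```python
import ast


def split_assert(source: str) -> list[str]:
    """Split `assert a and (b and c)` into one assert per operand."""
    stmt = ast.parse(source).body[0]
    if not isinstance(stmt, ast.Assert):
        raise ValueError("expected a single assert statement")

    def operands(expr: ast.expr) -> list[ast.expr]:
        # Recurse into nested `and` chains; parentheses do not appear in the AST.
        if isinstance(expr, ast.BoolOp) and isinstance(expr.op, ast.And):
            return [leaf for value in expr.values for leaf in operands(value)]
        return [expr]

    # Assumption: the assertion message (if any) is dropped in this sketch.
    return [f"assert {ast.unparse(op)}" for op in operands(stmt.test)]
```

Because the recursion reaches the parenthesized operand directly, no second pass is needed to split `b and c`.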
2023-06-20 13:47:01 -04:00
Charlie Marsh
4547002eb7
Remove defaults from fixtures/pyproject.toml (#5217)
## Summary

These should be encoded in the tests themselves, rather than here. In
fact, I think they're all unused?
2023-06-20 13:16:00 -04:00
Charlie Marsh
310abc769d
Move StarImport to its own module (#5186) 2023-06-20 13:12:46 -04:00
Micha Reiser
b369288833
Accept any Into<AnyNodeRef> as Comments arguments (#5205) 2023-06-20 16:49:21 +00:00
Dhruv Manilawala
6f7d3cc798
Add option (-o/--output-file) to write output to a file (#4950)
## Summary

A new CLI option (`-o`/`--output-file`) to write output to a file
instead of stdout.

Major change is to remove the lock acquired on stdout. The argument is
that the output is buffered and thus the lock is acquired only when
writing a block (8kb). As per the benchmark below there is a slight
performance penalty.

Reference:
https://rustmagazine.org/issue-3/javascript-compiler/#printing-is-slow

## Benchmarks

_Output is truncated to only contain useful information:_

Command: `check --isolated --no-cache --select=ALL --show-source
./test-repos/cpython`

Latest HEAD (361d45f2b2) with and without
the manual lock on stdout:

```console
Benchmark 1: With lock
  Time (mean ± σ):      5.687 s ±  0.075 s    [User: 17.110 s, System: 0.486 s]
  Range (min … max):    5.615 s …  5.860 s    10 runs

Benchmark 2: Without lock
  Time (mean ± σ):      5.719 s ±  0.064 s    [User: 17.095 s, System: 0.491 s]
  Range (min … max):    5.640 s …  5.865 s    10 runs

Summary
  (1) ran 1.01 ± 0.02 times faster than (2)
```

This PR:

```console
Benchmark 1: This PR
  Time (mean ± σ):      5.855 s ±  0.058 s    [User: 17.197 s, System: 0.491 s]
  Range (min … max):    5.786 s …  5.987 s    10 runs
 
Benchmark 2: Latest HEAD with lock
  Time (mean ± σ):      5.645 s ±  0.033 s    [User: 16.922 s, System: 0.495 s]
  Range (min … max):    5.600 s …  5.712 s    10 runs
 
Summary
  (2) ran 1.04 ± 0.01 times faster than (1)
```

## Test Plan

Run all of the commands which give output with and without the
`--output-file=ruff.out` option:
* `--show-settings`
* `--show-files`
* `--show-fixes`
* `--diff`
* `--select=ALL`
* `--select=All --show-source`
* `--watch` (only stdout allowed)

resolves: #4754
2023-06-20 22:16:49 +05:30
Micha Reiser
d9e59b21cd
Add BestFittingMode (#5184)
## Summary
Black supports four layouts when it comes to breaking binary expressions:

```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum BinaryLayout {
    /// Put each operand on their own line if either side expands
    Default,

    /// Try to expand the left to make it fit. Add parentheses if the left or right don't fit.
    ///
    ///```python
    /// [
    ///     a,
    ///     b
    /// ] & c
    ///```
    ExpandLeft,

    /// Try to expand the right to make it fit. Add parentheses if the left or right don't fit.
    ///
    /// ```python
    /// a & [
    ///     b,
    ///     c
    /// ]
    /// ```
    ExpandRight,

    /// Both the left and right side can be expanded. Try in the following order:
    /// * expand the right side
    /// * expand the left side
    /// * expand both sides
    ///
    /// to make the expression fit
    ///
    /// ```python
    /// [
    ///     a,
    ///     b
    /// ] & [
    ///     c,
    ///     d
    /// ]
    /// ```
    ExpandRightThenLeft,
}
```

Our current implementation only handles `ExpandRight` and `Default` correctly. `ExpandLeft` turns out to be surprisingly hard. This PR adds a new `BestFittingMode` parameter to `BestFitting` to support `ExpandLeft`.

There are 3 variants that `ExpandLeft` must support:

**Variant 1**: Everything fits on the line (easy)

```python
[a, b] + c
```

**Variant 2**: Left breaks, but right fits on the line. Doesn't need parentheses

```python
[
	a,
	b
] + c
```

**Variant 3**: The left breaks, but there's still not enough space for the right hand side. Parenthesize the whole expression:

```python
(
	[
		a, 
		b
	]
	+ ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
)
```

Solving Variant 1 and 2 on their own is straightforward. The printer gives us this behavior by nesting right inside of the group of left:

```
group(&format_args![
	if_group_breaks(&text("(")),
	soft_block_indent(&group(&format_args![
		left, 
		soft_line_break_or_space(), 
		op, 
		space(), 
		group(&right)
	])),
	if_group_breaks(&text(")"))
])
```

The fundamental problem is that the outer group, which adds the parentheses, always breaks if the left side breaks. That means, we end up with

```python
(
	[
		a,
		b
	] + c
)
```

which is not what we want (we only want parentheses if the right side doesn't fit). 

Okay, so nesting groups don't work because of the outer parentheses. Sequencing groups doesn't work because it results in a right-to-left breaking which is the opposite of what we want. 

Could we use best fitting? Almost! 

```
best_fitting![
	// All flat
	format_args![left, space(), op, space(), right],
	// Break left
	format_args![group(&left).should_expand(true), space(), op, space(), right],
	// Break all
	format_args![
		text("("), 
		block_indent!(&format_args![
			left, 
			hard_line_break(), 
			op,
			space(),
			right
		])
	]
]
```

I hope I managed to write this up correctly. The problem is that the printer never reaches the 3rd variant because the second variant always fits:

* The `group(&left).should_expand(true)` changes the group so that all `soft_line_breaks` are turned into hard line breaks. This is necessary because we want to test if the content fits if we break after the `[`. 
* Now, the whole idea of `best_fitting` is that you can pretend that some content fits on the line when it actually does not. The way this works is that the printer **only** tests if all the content of the variant **up to** the first line break fits on the line (we insert that line break by using `should_expand(true)`). The printer doesn't care whether the rest `a\n, b\n ] + c` all fits on (multiple?) lines.

Why does breaking right work but not breaking the left? The difference is that we can make the decision whether to parenthesize the expression based on the left expression. We can't do this for breaking left because the decision whether to insert parentheses or not would depend on a lookahead: will the right side break. We simply don't know this yet when printing the parentheses (it would work for the right parentheses but not for the left and indent).

What we kind of want here is to tell the printer: Look, what comes here may or may not fit on a single line but we don't care. Simply test that what comes **after** fits on a line. 

This PR adds a new `BestFittingMode` that has a new `AllLines` option that gives us the desired behavior of testing all content and not just up to the first line break. 
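
To make the difference concrete, here is a toy model in Python (not the formatter's Rust code). Variants are pre-rendered strings, and `Mode.FIRST_LINE`, `fits`, and `best_fitting` are hypothetical names; only `AllLines` is named in this PR.

```python
from enum import Enum


class Mode(Enum):
    FIRST_LINE = "first_line"  # measure only up to the first line break
    ALL_LINES = "all_lines"    # measure every line of the variant


def fits(variant: str, width: int, mode: Mode) -> bool:
    lines = variant.split("\n")
    if mode is Mode.FIRST_LINE:
        lines = lines[:1]
    return all(len(line) <= width for line in lines)


def best_fitting(variants: list[str], width: int, mode: Mode) -> str:
    # Pick the first variant that fits; fall back to the most expanded one.
    for variant in variants[:-1]:
        if fits(variant, width, mode):
            return variant
    return variants[-1]


right = "c" * 12
flat = f"[a, b] + {right}"
break_left = f"[\n    a,\n    b\n] + {right}"
break_all = f"(\n    [\n        a,\n        b\n    ]\n    + {right}\n)"
variants = [flat, break_left, break_all]
```

With a width of 10, the first-line check accepts `break_left` because only `[` is measured, while the all-lines check rejects it (the last line is too long) and falls through to `break_all`.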

## Test Plan

I added a new example to  `BestFitting::with_mode`
2023-06-20 18:16:01 +02:00
Tom Kuson
6929fcc55f
Complete flake8-bugbear documentation (#5178)
## Summary

Completes the documentation for the `flake8-bugbear` ruleset. Related to
#2646.

## Test Plan

`python scripts/check_docs_formatted.py`

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2023-06-20 12:10:58 -04:00
Charlie Marsh
7bc33a8d5f
Remove identifier lexing in favor of parser ranges (#5195)
## Summary

Now that all identifiers include ranges (#5194), we can remove a ton of
this "custom lexing" code that we have to sketchily extract identifier
ranges from source.

## Test Plan

`cargo test`
2023-06-20 12:07:29 -04:00
Charlie Marsh
6331598511
Upgrade RustPython to access ranged names (#5194)
## Summary

In https://github.com/astral-sh/RustPython-Parser/pull/8, we modified
RustPython to include ranges for any identifiers that aren't
`Expr::Name` (which already has an identifier).

For example, the `e` in `except ValueError as e` was previously
un-ranged. To extract its range, we had to do some lexing of our own.
This change should improve performance and let us remove a bunch of
code.

## Test Plan

`cargo test`
2023-06-20 15:43:38 +00:00
Thomas de Zeeuw
17f1ecd56e
Open cache files in parallel (#5120)
## Summary

Open cache files in parallel (again), which brings the performance back to roughly that of the old implementation.

## Test Plan

Existing tests should keep working.
2023-06-20 17:43:09 +02:00
Dhruv Manilawala
062b6e5c2b
Handle trailing newline in Jupyter notebook JSON string (#5202)
## Summary

Handle trailing newline in Jupyter Notebook JSON string similar to how
`black`
does it.

## Test Plan

Add test cases when the JSON string for notebook ends with and without a
newline.

resolves: #5190
2023-06-20 10:19:11 +00:00
David Szotten
773e79b481
basic formatting for ExprDict (#5167) 2023-06-20 09:25:08 +00:00
Charlie Marsh
4cc3cdba16
Use some more wildcard imports in rules (#5201) 2023-06-20 03:21:08 +00:00
Charlie Marsh
a797e05602
Use a consistent argument ordering for Indexer (#5200) 2023-06-20 02:59:51 +00:00
Evan Rittenhouse
62aa77df31
Fix corner case involving terminal backslash after fixing W293 (#5172)
## Summary

Fixes #4404. 

Consider this file:
```python
if True:
    x = 1; \
<space><space><space>
```

The current implementation of W293 removes the 3 spaces on line 2. This
fix changes the file to:
```python
if True:
    x = 1; \
```
A file can't end in a `\`, according to Python's [lexical
analysis](https://docs.python.org/3/reference/lexical_analysis.html), so
subsequent iterations of the autofixer fail (the AST-based ones
specifically, since they depend on a valid syntax tree and get
re-parsed).

This patch examines the line before the line checked in `W293`. If its
first non-whitespace character is a `\`, the patch will extend the
diagnostic's fix range to all whitespace up until the previous line's
*second* non-whitespace character; that is, it deletes all spaces and
potential `\`s up until the next non-whitespace character on the
previous line.

## Test Plan
Ran `cargo run -p ruff_cli -- ~/Downloads/aa.py --fix --select W293,D100
--no-cache` against the above file. This resulted in:
```
/Users/evan/Downloads/aa.py:1:1: D100 Missing docstring in public module
Found 2 errors (1 fixed, 1 remaining).
```
The file's contents, after the fix:
```python
if True:
    x = 1;<space>
```
The `\` was removed, leaving the terminal space. The space should be
handled by `Rule::TrailingWhitespace`, not `BlankLineWithWhitespace`.
2023-06-20 02:57:24 +00:00
Charlie Marsh
64bd955c58
Remove continuations before trailing semicolons (#5199)
## Summary

Closes #4828.
2023-06-20 02:22:32 +00:00
Charlie Marsh
8e06140d1d
Remove continuations when deleting statements (#5198)
## Summary

This PR modifies our statement deletion logic to delete any preceding
continuation lines.

For example, given:

```py
x = 1; \
  import os
```

We'll now rewrite to:

```py
x = 1;
```

In addition, the logic can now handle multiple preceding continuations
(which is unlikely, but valid).
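
A minimal line-based sketch of the idea in Python (the real logic is in Rust and works on source ranges). The function name `delete_statement` is hypothetical, and the sketch assumes a trailing backslash is always a continuation, which does not hold inside comments or strings.

```python
def delete_statement(lines: list[str], index: int) -> list[str]:
    """Delete lines[index] and any continuation backslashes that led into it."""
    kept = lines[:index]
    # Walk backwards over preceding lines that end in a continuation.
    while kept and kept[-1].rstrip().endswith("\\"):
        stripped = kept[-1].rstrip()[:-1].rstrip()
        if stripped:
            # Real code precedes the backslash: keep it, drop the continuation.
            kept[-1] = stripped
            break
        # The line held only a backslash: drop it and keep looking.
        kept.pop()
    return kept + lines[index + 1:]
```

The loop is what handles multiple preceding continuations: lines holding only a backslash are removed one by one until a line with real code is reached.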
2023-06-19 22:04:28 -04:00
Charlie Marsh
015895bcae
Move copyright rule to nursery (#5197)
## Summary

I want this to be explicitly opted-into.
2023-06-19 21:41:47 -04:00
Charlie Marsh
36e01ad6eb
Upgrade RustPython (#5192)
## Summary

This PR upgrades RustPython to pull in the changes to `Arguments` (zip
defaults with their identifiers) and all the renames to `CmpOp` and
friends.
2023-06-19 21:09:53 +00:00
Charlie Marsh
ddfdc3bb01
Add rule documentation URL to JSON output (#5187)
## Summary

I want to include URLs to the rule documentation in the LSP (the LSP has
a native `code_description` field for this, which, if specified, causes
the source to be rendered as a link to the docs). This PR exposes the
URL to the documentation in the Ruff JSON output.
2023-06-19 21:09:15 +00:00
Dhruv Manilawala
48f4f2d63d
Maintain consistency when deserializing to JSON (#5114)
## Summary

Maintain consistency while deserializing Jupyter notebook to JSON. The
following changes were made:

1. Use string array to store the source value as that's the default
(5781720423/nbformat/v4/nbjson.py (L56-L57))
2. Remove unused structs and enums
3. Reorder the keys in alphabetical order as that's the default.
(5781720423/nbformat/v4/nbjson.py (L51))

### Side effect

Removing the `preserve_order` feature means that the order of keys in
JSON output (`--format json`) will be in alphabetical order. This is
because the value is represented using `serde_json::Value` which
internally is a `BTreeMap`, thus sorting it as per the string key. For
posterity if this turns out to be not ideal, then we could define a
struct representing the JSON object and the order of struct fields will
determine the order in the JSON string.
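
The same effect can be reproduced with Python's `json` module, as an analogy only: `sort_keys=True` behaves like a sorted map, while the default keeps insertion order (comparable to `preserve_order`). The diagnostic keys below are made up for illustration.

```python
import json

# Hypothetical diagnostic; the keys are illustrative, not Ruff's exact schema.
diagnostic = {"code": "F401", "message": "unused import", "filename": "a.py"}

# Insertion order is kept by default.
insertion_order = json.dumps(diagnostic)

# Sorting by key mirrors what a sorted map such as `BTreeMap` produces.
alphabetical = json.dumps(diagnostic, sort_keys=True)
```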

## Test Plan

Add a test case to assert the raw JSON string.
2023-06-19 23:47:56 +05:30
Charlie Marsh
94abf7f088
Rename *Importation structs to *Import (#5185)
## Summary

I find "Importation" a bit awkward, it may not even be grammatically
correct here.
2023-06-19 12:09:10 -04:00
Thomas de Zeeuw
e3c12764f8
Only use a single cache file per Python package (#5117)
## Summary

This changes the caching design from one cache file per source file, to
one cache file per package. This greatly reduces the amount of cache
files that are opened and written, while maintaining roughly the same
(combined) size as bincode is very compact.

Below are some very much not scientific performance tests. It uses
projects/sources to check:

* small.py: single, 31 bytes Python file with 2 errors.
* test.py: single, 43k Python file with 8 errors.
* fastapi: FastAPI repo, 1134 files checked, 0 errors.

Source   | Before # files | After # files | Before size | After size
-------|-------|-------|-------|-------
small.py | 1              | 1             | 20 K        | 20 K
test.py  | 1              | 1             | 60 K        | 60 K
fastapi  | 1134           | 518           | 4.5 M       | 2.3 M

One question that might come up is why fastapi still has 518 cache files
and not 1? That is because this is using the existing package
resolution, which sees examples, docs, etc. as separate from the "main"
source code (in the fastapi directory in the repo). In the future it
might be worth considering switching to a one cache file per repo strategy.

This new design is not perfect and does have a number of known issues.
First, like the old design it doesn't remove the cache for a source file
that has been (re)moved until `ruff clean` is called.

Second, this currently uses a large mutex around the mutation of the
package cache (e.g. inserting result). This could be (or become) a
bottleneck. It's future work to test and improve this (if needed).

Third, currently the packages are opened and stored in a sequential
loop; this could be done in parallel. This is also future work.


## Test Plan

Run `ruff check` (with caching enabled) twice on any Python source code
and it should produce the same results.
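
For intuition, the grouping behind "one cache file per package" can be sketched in Python as below. This is not the Rust implementation: `group_by_package` is a hypothetical helper, package roots are assumed to be already resolved, and files outside every root are simply skipped.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_package(files: list[str], package_roots: list[str]) -> dict[str, list[str]]:
    """Map each package root to the relative paths that would share its cache file."""
    caches: dict[str, list[str]] = defaultdict(list)
    # Prefer the deepest root so nested packages get their own cache.
    roots = sorted(
        (PurePosixPath(root) for root in package_roots),
        key=lambda root: len(root.parts),
        reverse=True,
    )
    for name in files:
        path = PurePosixPath(name)
        for root in roots:
            if path.is_relative_to(root):
                caches[str(root)].append(str(path.relative_to(root)))
                break
    return dict(caches)
```

This also shows why a repo like fastapi ends up with more than one cache file: every separately resolved root (docs, examples, the main package) gets its own entry.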
2023-06-19 17:46:13 +02:00
konstin
b8d378b0a3
Add a script that tests formatter stability on repositories (#5055)
## Summary

We want to ensure that once formatted content stays the same when
formatted again, which is known as formatter stability or formatter
idempotency, and that the formatter prints syntactically valid code. As
our test cases cover only a limited amount of code, this allows checking
entire repositories.

This adds a new subcommand to `ruff_dev` which can be invoked as `cargo
run --bin ruff_dev -- check-formatter-stability <repo>`. While initially
only intended to check stability, it has also found cases where the
formatter printed invalid syntax or panicked.
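
The two properties being checked can be sketched in a few lines of Python. This is an illustration, not the `ruff_dev` code: `check_stability` is a hypothetical name and `format_source` stands in for any formatter callable.

```python
import ast
from typing import Callable, Optional


def check_stability(source: str, format_source: Callable[[str], str]) -> Optional[str]:
    """Return a description of the first problem found, or None if stable."""
    first = format_source(source)
    try:
        # The formatted output must still be syntactically valid Python.
        ast.parse(first)
    except SyntaxError:
        return "invalid syntax after formatting"
    # Formatting already-formatted code must be a no-op.
    if format_source(first) != first:
        return "unstable: a second pass changed the output"
    return None
```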

## Test Plan

Running this on cpython is already identifying bugs
(https://github.com/astral-sh/ruff/pull/5089)
2023-06-19 14:13:38 +00:00
konstin
0e028142f4
Explain dangling comments in the formatter (#5170)
This documentation change improves the section on dangling comments in
the formatter.

---------

Co-authored-by: David Szotten <davidszotten@gmail.com>
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-06-19 14:24:45 +02:00
konstin
361d45f2b2
Add cargo dev repeat for profiling (#5144)
## Summary

This adds a new subcommand that can be used as

```shell
cargo build --bin ruff_dev --profile=release-debug
perf record -g -F 999 target/release-debug/ruff_dev repeat --repeat 30 --exit-zero --no-cache path/to/cpython > /dev/null
flamegraph --perfdata perf.data
```

## Test Plan

This is a ruff internal script. I successfully used it to profile
cpython with the instructions above
2023-06-19 11:40:09 +02:00
Charlie Marsh
be11cae619
Fix allowed-ellipsis detection (#5174)
## Summary

We weren't resetting the `allow_ellipsis` flag properly, which
ultimately caused us to treat the semicolon as "unnecessary" rather than
"creating a multi-statement line".

Closes #5154.
2023-06-19 04:19:41 +00:00
Charlie Marsh
2b82caa163
Detect continuations at start-of-file (#5173)
## Summary

Given:

```python
\
import os
```

Deleting `import os` leaves a syntax error: a file can't end in a
continuation. We have code to handle this case, but it failed to pick up
continuations at the _very start_ of a file.

Closes #5156.
2023-06-19 00:09:02 -04:00
Charlie Marsh
a6cf31cc89
Move dead_scopes to deferred.scopes (#5171)
## Summary

This is more consistent with the rest of the `deferred` patterns.
2023-06-18 15:57:38 +00:00
Charlie Marsh
524a2045ba
Enable autofix for unconventional imports rule (#5152)
## Summary

We can now automatically rewrite `import pandas` to `import pandas as
pd`, with minimal changes needed.
2023-06-18 15:56:42 +00:00
Charlie Marsh
a0b750f74b
Move unconventional import rule to post-binding phase (#5151)
## Summary

This PR moves the "unconventional import alias" rule (which enforces,
e.g., that `pandas` is imported as `pd`) to the "dead scopes" phase,
after the main linter pass. This (1) avoids an allocation since we no
longer need to create the qualified name in the linter pass; and (2)
will allow us to autofix it, since we'll have access to all references.
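
For readers unfamiliar with the rule, here is a small Python sketch of the detection step using the standard `ast` module. It is not the Rust rule: `CONVENTIONS` and `unconventional_imports` are hypothetical, and the sketch only looks at plain `import` statements.

```python
import ast

# Hypothetical convention table used only for this illustration.
CONVENTIONS = {"pandas": "pd", "numpy": "np"}


def unconventional_imports(source: str) -> list[tuple[int, str, str]]:
    """Return (line, module, expected alias) for imports that break convention."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                expected = CONVENTIONS.get(alias.name)
                if expected is not None and alias.asname != expected:
                    found.append((node.lineno, alias.name, expected))
    return found
```

Fixing the import safely also means renaming every reference to the old name, which is why access to all references matters.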

## Test Plan

`cargo test` -- all changes are to ranges (which are improvements IMO).
2023-06-18 15:23:40 +00:00
Chris Pryer
195b36c429
Format continue statement (#5165)
## Summary

Format `continue` statement.

## Test Plan

`continue` is used already in some tests, but if a new test is needed I
could add it.

---------

Co-authored-by: konstin <konstin@mailbox.org>
2023-06-18 11:25:59 +00:00
konstin
5c416e4d9b
Pre commit without cargo and other pre-PR improvements (#5146)
This tackles three problems:
* pre-commit was slow because it ran cargo commands
* Improve the clarity on what you need to run to get your PR to pass on CI
(and make those fast)
* You had to compile and run `cargo dev generate-all` separately, which
was slow

The first change is to remove all cargo commands except running ruff
itself from pre-commit. With `cargo run --bin ruff` already compiled it
takes about 7s on my machine. It would make sense to also use the ruff
pre-commit action here even if we're then lagging a release behind for
checking ruff on ruff.

The contributing guide is now clear about what you need to run:

```shell
cargo clippy --workspace --all-targets --all-features -- -D warnings  # Linting...
RUFF_UPDATE_SCHEMA=1 cargo test  # Testing and updating ruff.schema.json
pre-commit run --all-files  # rust and python formatting, markdown and python linting, etc.
```

Example timings from my machine:

`cargo clippy --workspace --all-targets --all-features -- -D warnings`:
23s
`RUFF_UPDATE_SCHEMA=1 cargo test`: 2min (recompiling), 1min (no code
changes, this is mainly doc tests)
`pre-commit run --all-files`: 7s

The exact numbers don't matter so much as the approximate experience (6s
is easier to just wait than 1min, esp if you need to fix and rerun). The
biggest remaining block seems to be doc tests, i'm surprised i didn't
find any solution to speeding them up (nextest simply doesn't run them
at all). Also note that the formatter has its own tests which are much
faster since they avoid linking ruff (`cargo test
ruff_python_formatter`).

The third change is to enable `cargo test` to update the schema. Similar
to `INSTA_UPDATE=always`, i've added `RUFF_UPDATE_SCHEMA=1` (name open
to bikeshedding), so `RUFF_UPDATE_SCHEMA=1 cargo test` updates the
schema, while `cargo test` still fails as expected if the repo isn't
up-to-date.
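
The pattern is the usual "snapshot test with an opt-in update switch". A minimal Python sketch of it follows; the real test is written in Rust, and `check_schema` is a hypothetical helper.

```python
import os
from pathlib import Path


def check_schema(path: Path, generated: str) -> None:
    """Fail if the checked-in schema is stale, unless asked to update it."""
    current = path.read_text() if path.exists() else None
    if current == generated:
        return
    if os.environ.get("RUFF_UPDATE_SCHEMA") == "1":
        # Opt-in: overwrite the checked-in file with the fresh output.
        path.write_text(generated)
        return
    raise AssertionError(f"{path} is out of date; rerun with RUFF_UPDATE_SCHEMA=1")
```

Keeping the default strict means CI still catches a stale schema, while the environment variable saves a separate generate step locally.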

---------

Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2023-06-18 11:00:42 +00:00
konstin
763d38cafb
Refactor top llvm-lines entry (#5147)
## Summary

This refactors the top entry in terms of llvm lines,
`RuleCodePrefix::iter()`. It's only used for generating the schema and
the clap completion so no effect on performance.

I've confirmed with
```
CARGO_TARGET_DIR=target-llvm-lines RUSTFLAGS="-Csymbol-mangling-version=v0" cargo llvm-lines -p ruff --lib | head -n 20
```
that this indeed removes the method from the list of heaviest symbols in
terms of llvm-lines

Before:
```
  Lines                  Copies               Function name
  -----                  ------               -------------
  1768469                40538                (TOTAL)
    10391 (0.6%,  0.6%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::codes::RuleCodePrefix>::iter
     8250 (0.5%,  1.1%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::codes::Rule>::noqa_code
     7427 (0.4%,  1.5%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::checkers::ast::Checker as ruff_python_ast[c4c9eadfa5741dd4]::visitor::Visitor>::visit_stmt
     6536 (0.4%,  1.8%)      1 (0.0%,  0.0%)  <<ruff[fa0f2e8ef07114da]::settings::options::Options as serde[1a28808d63625aed]::de::Deserialize>::deserialize::__Visitor as serde[1a28808d63625aed]::de::Visitor>::visit_map::<toml_edit[de4ca26332d39787]::de::spanned::SpannedDeserializer<toml_edit[de4ca26332d39787]::de::value::ValueDeserializer>>
     6536 (0.4%,  2.2%)      1 (0.0%,  0.0%)  <<ruff[fa0f2e8ef07114da]::settings::options::Options as serde[1a28808d63625aed]::de::Deserialize>::deserialize::__Visitor as serde[1a28808d63625aed]::de::Visitor>::visit_map::<toml_edit[de4ca26332d39787]::de::table::TableMapAccess>
     6533 (0.4%,  2.6%)      1 (0.0%,  0.0%)  <<ruff[fa0f2e8ef07114da]::settings::options::Options as serde[1a28808d63625aed]::de::Deserialize>::deserialize::__Visitor as serde[1a28808d63625aed]::de::Visitor>::visit_map::<toml_edit[de4ca26332d39787]::de::datetime::DatetimeDeserializer>
     5727 (0.3%,  2.9%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::checkers::ast::Checker as ruff_python_ast[c4c9eadfa5741dd4]::visitor::Visitor>::visit_expr
     4453 (0.3%,  3.2%)      1 (0.0%,  0.0%)  ruff[fa0f2e8ef07114da]::flake8_to_ruff::converter::convert
     3790 (0.2%,  3.4%)      1 (0.0%,  0.0%)  <&ruff[fa0f2e8ef07114da]::registry::Linter as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
     3416 (0.2%,  3.6%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::registry::Linter>::code_for_rule
     3187 (0.2%,  3.7%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::codes::Rule as core[da82827a87f140f9]::fmt::Debug>::fmt
     3185 (0.2%,  3.9%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<&ruff[fa0f2e8ef07114da]::codes::Rule>>::from
     3185 (0.2%,  4.1%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<ruff[fa0f2e8ef07114da]::codes::Rule>>::from
     3185 (0.2%,  4.3%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::codes::Rule as core[da82827a87f140f9]::convert::AsRef<str>>::as_ref
     3183 (0.2%,  4.5%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::codes::RuleIter>::get
     2718 (0.2%,  4.6%)      1 (0.0%,  0.0%)  <<ruff[fa0f2e8ef07114da]::settings::options::Options as serde[1a28808d63625aed]::de::Deserialize>::deserialize::__Visitor as serde[1a28808d63625aed]::de::Visitor>::visit_seq::<toml_edit[de4ca26332d39787]::de::array::ArraySeqAccess>
     2706 (0.2%,  4.8%)      1 (0.0%,  0.0%)  <&ruff[fa0f2e8ef07114da]::codes::Pylint as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
```
After:
```
  Lines                  Copies               Function name
  -----                  ------               -------------
  1763380                40806                (TOTAL)
     8250 (0.5%,  0.5%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::codes::Rule>::noqa_code
     7427 (0.4%,  0.9%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::checkers::ast::Checker as ruff_python_ast[c4c9eadfa5741dd4]::visitor::Visitor>::visit_stmt
     6536 (0.4%,  1.3%)      1 (0.0%,  0.0%)  <<ruff[fa0f2e8ef07114da]::settings::options::Options as serde[1a28808d63625aed]::de::Deserialize>::deserialize::__Visitor as serde[1a28808d63625aed]::de::Visitor>::visit_map::<toml_edit[de4ca26332d39787]::de::spanned::SpannedDeserializer<toml_edit[de4ca26332d39787]::de::value::ValueDeserializer>>
     6536 (0.4%,  1.6%)      1 (0.0%,  0.0%)  <<ruff[fa0f2e8ef07114da]::settings::options::Options as serde[1a28808d63625aed]::de::Deserialize>::deserialize::__Visitor as serde[1a28808d63625aed]::de::Visitor>::visit_map::<toml_edit[de4ca26332d39787]::de::table::TableMapAccess>
     6533 (0.4%,  2.0%)      1 (0.0%,  0.0%)  <<ruff[fa0f2e8ef07114da]::settings::options::Options as serde[1a28808d63625aed]::de::Deserialize>::deserialize::__Visitor as serde[1a28808d63625aed]::de::Visitor>::visit_map::<toml_edit[de4ca26332d39787]::de::datetime::DatetimeDeserializer>
     5727 (0.3%,  2.3%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::checkers::ast::Checker as ruff_python_ast[c4c9eadfa5741dd4]::visitor::Visitor>::visit_expr
     4453 (0.3%,  2.6%)      1 (0.0%,  0.0%)  ruff[fa0f2e8ef07114da]::flake8_to_ruff::converter::convert
     3790 (0.2%,  2.8%)      1 (0.0%,  0.0%)  <&ruff[fa0f2e8ef07114da]::registry::Linter as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
     3416 (0.2%,  3.0%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::registry::Linter>::code_for_rule
     3187 (0.2%,  3.2%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::codes::Rule as core[da82827a87f140f9]::fmt::Debug>::fmt
     3185 (0.2%,  3.3%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<&ruff[fa0f2e8ef07114da]::codes::Rule>>::from
     3185 (0.2%,  3.5%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<ruff[fa0f2e8ef07114da]::codes::Rule>>::from
     3185 (0.2%,  3.7%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::codes::Rule as core[da82827a87f140f9]::convert::AsRef<str>>::as_ref
     3183 (0.2%,  3.9%)      1 (0.0%,  0.0%)  <ruff[fa0f2e8ef07114da]::codes::RuleIter>::get
     2718 (0.2%,  4.0%)      1 (0.0%,  0.0%)  <<ruff[fa0f2e8ef07114da]::settings::options::Options as serde[1a28808d63625aed]::de::Deserialize>::deserialize::__Visitor as serde[1a28808d63625aed]::de::Visitor>::visit_seq::<toml_edit[de4ca26332d39787]::de::array::ArraySeqAccess>
     2706 (0.2%,  4.2%)      1 (0.0%,  0.0%)  <&ruff[fa0f2e8ef07114da]::codes::Pylint as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
     2573 (0.1%,  4.3%)      1 (0.0%,  0.0%)  <<ruff[fa0f2e8ef07114da]::rules::isort::settings::Options as serde[1a28808d63625aed]::de::Deserialize>::deserialize::__Visitor as serde[1a28808d63625aed]::de::Visitor>::visit_map::<toml_edit[de4ca26332d39787]::de::spanned::SpannedDeserializer<toml_edit[de4ca26332d39787]::de::value::ValueDeserializer>>
```
I didn't measure the effect on binary size this time.

## Testing

`cargo test` which uses this to generate the schema didn't change
2023-06-18 12:39:06 +02:00
Evan Rittenhouse
653a0ebf2d
Add Applicability to pyupgrade (#5162)
## Summary

Fixes some of #4184.
2023-06-17 19:33:11 +00:00
Evan Rittenhouse
95448ba669
Add Applicability to isort (#5161)
## Summary

Fixes some of #4184.
2023-06-17 19:08:11 +00:00
Charlie Marsh
f18e10183f
Add some minor tweaks to latest docs (#5164) 2023-06-17 17:04:50 +00:00
Tom Kuson
98920909c6
Complete documentation for flake8-blind-except and flake8-raise rules (#5143)
## Summary

Completes the documentation for the `flake8-blind-except` and
`flake8-raise` rules.

Related to #2646.

## Test Plan

`python scripts/check_docs_formatted.py`
2023-06-17 12:56:27 -04:00
Evan Rittenhouse
e1e1d2d341
Add Applicability to flynt (#5160)
## Summary

Fixes some of #4184.
2023-06-17 12:05:43 -04:00
David Szotten
4b9b6829dc
format StmtBreak (#5158)
## Summary

format `StmtBreak`

trying to learn how to help out with the formatter. starting simple

## Test Plan

new snapshot test
2023-06-17 10:31:29 +02:00
Charlie Marsh
d0ad1ed0af
Replace static CallPath vectors with matches! macros (#5148)
## Summary

After #5140, I audited the codebase for similar patterns (defining a
list of `CallPath` entities in a static vector, then looping over them
to pattern-match). This PR migrates all other such cases to use `match`
and `matches!` where possible.

There are a few benefits to this:

1. It more clearly denotes the intended semantics (branches are
exclusive).
2. The compiler can help deduplicate the patterns and detect unreachable
branches.
3. Performance: in the benchmark below, the all-rules performance is
increased by nearly 10%...

## Benchmarks

I decided to benchmark against a large file in the Airflow repository
with a lot of type annotations
([`views.py`](https://raw.githubusercontent.com/apache/airflow/f03f73100e8a7d6019249889de567cb00e71e457/airflow/www/views.py)):

```
linter/default-rules/airflow/views.py
                        time:   [10.871 ms 10.882 ms 10.894 ms]
                        thrpt:  [19.739 MiB/s 19.761 MiB/s 19.781 MiB/s]
                 change:
                        time:   [-2.7182% -2.5687% -2.4204%] (p = 0.00 < 0.05)
                        thrpt:  [+2.4805% +2.6364% +2.7942%]
                        Performance has improved.

linter/all-rules/airflow/views.py
                        time:   [24.021 ms 24.038 ms 24.062 ms]
                        thrpt:  [8.9373 MiB/s 8.9461 MiB/s 8.9527 MiB/s]
                 change:
                        time:   [-8.9537% -8.8516% -8.7527%] (p = 0.00 < 0.05)
                        thrpt:  [+9.5923% +9.7112% +9.8342%]
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe
```

The impact is dramatic -- nearly a 10% improvement for `all-rules`.
2023-06-16 17:34:42 +00:00
Charlie Marsh
b3240dbfa2
Avoid propagating BindingKind::Global and BindingKind::Nonlocal (#5136)
## Summary

This PR fixes a small quirk in the semantic model. Typically, when we
see an import, like `import foo`, we create a `BindingKind::Importation`
for it. However, if `foo` has been declared as a `global`, then we
propagate the kind forward. So given:

```python
global foo

import foo
```

We'd create two bindings for `foo`, both with type `global`.

This was originally borrowed from Pyflakes, and it exists to help avoid
false-positives like:

```python
def f():
    global foo

    # Don't mark `foo` as "assigned but unused"! It's a global!
    foo = 1
```

This PR removes that behavior, and instead tracks "Does this binding
refer to a global?" as a flag. This is much cleaner, since it means we
don't "lose" the identity of various bindings.

As a very strange example of why this matters, consider:

```python
def foo():
    global Member

    from module import Member

    x: Member = 1
```

`Member` is only used in a typing context, so we should flag it and say
"move it to a `TYPE_CHECKING` block". However, when we go to analyze
`from module import Member`, it has `BindingKind::Global`. So we don't
even know that it's an import!
2023-06-16 11:06:59 -04:00
Charlie Marsh
fd1dfc3bfa
Add support for global and nonlocal symbol renames (#5134)
## Summary

In #5074, we introduced an abstraction to support local symbol renames
("local" here refers to "within a module"). However, that abstraction
didn't support `global` and `nonlocal` symbols. This PR extends it to
those cases.

Broadly, there are two considerations.

First, we may be renaming a symbol in a scope in which it is declared
`global` or `nonlocal`. For example, given:

```python
x = 1

def foo():
    global x
```

Then when renaming `x` in `foo`, we need to detect that it's `global`
and instead perform the rename starting from the module scope.

Second, when renaming a symbol, we need to determine the scopes in which
it is declared `global` or `nonlocal`. This is effectively the inverse
of the above: when renaming `x` in the module scope, we need to detect
that we should _also_ rename `x` in `foo`.

To support these cases, the renaming algorithm was adjusted as follows:

- When we start a rename in a scope, determine whether the symbol is
declared `global` or `nonlocal` by looking for a `global` or `nonlocal`
binding. If it is, start the rename in the defining scope. (This
requires storing the defining scope on the `nonlocal` binding, which is
new.)
- We then perform the rename in the defining scope.
- We then check whether the symbol was declared as `global` or
`nonlocal` in any scopes, and perform the rename in those scopes too.
(Thankfully, this doesn't need to be done recursively.)
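
The three steps above can be sketched on a toy model. This is only an illustration under stated assumptions: scopes are plain hash maps indexed by position, and `Binding`, `Scope`, and `rename` are hypothetical names for this example, not the contents of `renamer.rs`:

```rust
use std::collections::HashMap;

// Hypothetical toy model: each scope maps a name to how it is bound there.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Binding {
    /// The name is defined in this scope.
    Local,
    /// The name was declared `global` or `nonlocal` here; the payload is
    /// the index of the defining scope.
    Declared(usize),
}

type Scope = HashMap<String, Binding>;

/// Rename `old` to `new`, starting from `scope_id`.
/// Returns the indices of the scopes that were touched.
fn rename(scopes: &mut [Scope], scope_id: usize, old: &str, new: &str) -> Vec<usize> {
    // Step 1: if the name is declared `global`/`nonlocal` in the starting
    // scope, begin from the defining scope instead.
    let defining = match scopes[scope_id].get(old) {
        Some(Binding::Declared(target)) => *target,
        _ => scope_id,
    };

    // Steps 2 and 3: rename in the defining scope, then in every scope
    // that declares the name as referring to that defining scope.
    let mut touched = Vec::new();
    for (id, scope) in scopes.iter_mut().enumerate() {
        let in_rename_set =
            id == defining || scope.get(old) == Some(&Binding::Declared(defining));
        if in_rename_set {
            if let Some(binding) = scope.remove(old) {
                scope.insert(new.to_string(), binding);
                touched.push(id);
            }
        }
    }
    touched
}
```

A single pass over the scopes is enough in this sketch because every declaring scope stores the defining scope directly, which mirrors the note that the last step does not need to recurse. A scope that binds an unrelated local of the same name is left alone.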

Closes #5092.

## Test Plan

Added some additional snapshot tests.
2023-06-16 14:35:10 +00:00
Charlie Marsh
b9754bd5c5
Add autofix for Set-to-AbstractSet rewrite using reference tracking (#5074)
## Summary

This PR enables autofix behavior for the `flake8-pyi` rule that asks you
to alias `Set` to `AbstractSet` when importing `collections.abc.Set`.
It's not the most important rule, but it's a good isolated test-case for
local symbol renaming.

The renaming algorithm is outlined in detail in the `renamer.rs` module.
But to demonstrate the behavior, here's the diff when running this fix
over a complex file that exercises a few edge cases:

```diff
--- a/foo.pyi
+++ b/foo.pyi
@@ -1,16 +1,16 @@
 if True:
-    from collections.abc import Set
+    from collections.abc import Set as AbstractSet
 else:
-    Set = 1
+    AbstractSet = 1

-x: Set = set()
+x: AbstractSet = set()

-x: Set
+x: AbstractSet

-del Set
+del AbstractSet

 def f():
-    print(Set)
+    print(AbstractSet)

     def Set():
         pass
```

Making this work required resolving a bunch of edge cases in the
semantic model that were causing us to "lose track" of references. For
example, the above wasn't possible with our previous approach to
handling deletions (#5071). Similarly, the `x: Set` "delayed annotation"
tracking was enabled via #5070. And many of these edits would've failed
if we hadn't changed `BindingKind` to always match the identifier range
(#5090). So it's really the culmination of a bunch of changes over the
course of the week.

The main outstanding TODO is that this doesn't support `global` or
`nonlocal` usages. I'm going to take a look at that tonight, but I'm
comfortable merging this as-is.

Closes #1106.

Closes #5091.
2023-06-16 14:12:33 +00:00
Charlie Marsh
307f7a735c
Avoid allocations in lowercase comparisons (#5137)
## Summary

I noticed that we have a few hot comparisons that involve calling
`s.to_lowercase()`. We can avoid an allocation by comparing characters
directly.
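
For illustration, a minimal sketch of the idea. The target string `"select"` and the function names are assumptions made up for this example, not the comparisons changed in this PR:

```rust
/// Allocating version: builds a new `String` on every call.
fn is_select_allocating(s: &str) -> bool {
    s.to_lowercase() == "select"
}

/// Allocation-free version: lowercases one character at a time and
/// compares lazily against the (already lowercase) target.
fn is_select(s: &str) -> bool {
    s.chars().flat_map(char::to_lowercase).eq("select".chars())
}

/// ASCII-only alternative from the standard library; also allocation-free.
fn is_select_ascii(s: &str) -> bool {
    s.eq_ignore_ascii_case("select")
}
```

The per-character version can differ from `str::to_lowercase` for context-sensitive mappings (such as a word-final Greek sigma), which does not matter when the target is plain ASCII as it is here.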
2023-06-16 08:57:43 -04:00
Charlie Marsh
3af9dfeb0a
Rewrite suspicious_function_call as a match statement (#5140)
## Summary

@konstin mentioned that in profiling, this function accounted for a
non-trivial amount of time (0.33% of total execution, the most of any
rule). This PR attempts to rewrite it as a match statement for better
performance over a looping comparison.
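
To make the difference concrete, here is a hedged sketch of the two shapes. The table entries, labels, and function names are hypothetical and are not the rule's real list of calls:

```rust
// Hypothetical table of qualified names and labels; illustrative only.
const TABLE: &[(&[&str], &str)] = &[
    (&["pickle", "loads"], "pickle"),
    (&["marshal", "loads"], "marshal"),
    (&["os", "system"], "shell"),
];

/// Looping comparison: walks the table until a row matches.
fn classify_loop(call_path: &[&str]) -> Option<&'static str> {
    TABLE
        .iter()
        .find(|(candidate, _)| *candidate == call_path)
        .map(|(_, label)| *label)
}

/// Match statement: the candidates are slice patterns, so the compiler
/// is free to choose how to dispatch on them.
fn classify_match(call_path: &[&str]) -> Option<&'static str> {
    match call_path {
        ["pickle", "loads"] => Some("pickle"),
        ["marshal", "loads"] => Some("marshal"),
        ["os", "system"] => Some("shell"),
        _ => None,
    }
}
```

Both functions return the same answers; whether the `match` form is actually faster depends on the generated code, so it is worth confirming with a profiler or benchmark.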

## Test Plan

`cargo test`
2023-06-16 08:57:20 -04:00
Charlie Marsh
5526699535
Use const-singleton helpers in more rules (#5142) 2023-06-16 04:28:35 +00:00
Charlie Marsh
fab2a4adf7
Use matches! for insecure hash rule (#5141) 2023-06-16 04:18:32 +00:00
Charlie Marsh
13813dc1b1
Skip DJ008 enforcement in stub files (#5139)
Closes #5138.
2023-06-16 03:49:40 +00:00
Charlie Marsh
70c01257ca
Minor formatting changes to Checker (#5135) 2023-06-15 22:42:21 -04:00