<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds basic formatting for unary expressions.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a new `unary.py` with custom test cases
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Black supports for layouts when it comes to breaking binary expressions:
```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum BinaryLayout {
/// Put each operand on their own line if either side expands
Default,
/// Try to expand the left to make it fit. Add parentheses if the left or right don't fit.
///
///```python
/// [
/// a,
/// b
/// ] & c
///```
ExpandLeft,
/// Try to expand the right to make it fix. Add parentheses if the left or right don't fit.
///
/// ```python
/// a & [
/// b,
/// c
/// ]
/// ```
ExpandRight,
/// Both the left and right side can be expanded. Try in the following order:
/// * expand the right side
/// * expand the left side
/// * expand both sides
///
/// to make the expression fit
///
/// ```python
/// [
/// a,
/// b
/// ] & [
/// c,
/// d
/// ]
/// ```
ExpandRightThenLeft,
}
```
Our current implementation only handles `ExpandRight` and `Default` correctly. This PR adds support for `ExpandRightThenLeft` and `ExpandLeft`.
## Test Plan
I added tests that play through all 4 binary expression layouts.
## Summary
The
[release](9655224008)
failed due to an inability to find `pyproject.toml`. This PR moves that
validation into its own step (so we can fail fast) and ensures we clone
the repo.
## Summary
I initially wanted this category to be more general and decoupled from
the plugin, but I got some feedback that the titling felt inconsistent
with others.
## Summary
This is a proper fix for the issue patched-over in
https://github.com/astral-sh/ruff/pull/5229, thanks to an extremely
helpful repro from @tlambert03 in that thread. It looks like we were
using the keys of `package_roots` rather than the values to initialize
the cache -- but it's a map from package to package root.
## Test Plan
Reverted #5229, then ran through the plan that @tlambert03 included in
https://github.com/astral-sh/ruff/pull/5229#issuecomment-1599723226.
Verified the panic before but not after this change.
## Summary
This PR reverts #4971 (aba073a791). It
turns out that `f"{str(x)}"` and `f"{x}"` are often but not exactly
equivalent, and performing that conversion automatically can lead to
subtle bugs, See the discussion in
https://github.com/astral-sh/ruff/issues/4958.
## Summary
I haven't been able to determine why / when this is happening, but in
some cases, users are reporting that this `unwrap()` is causing a panic.
It's fine to just return `None` here and fallback to "No cache",
certainly better than panicking (while we figure out the edge case).
Closes#5225.
Closes#5228.
## Summary
Currently, it is possible to create a tag and then have the release
fail, which is a problem since we can't edit the tag
(https://github.com/charliermarsh/ruff/issues/4468). This change the
release process so that the tag is created inside the release workflow.
This leaves as a failure mode that we have published to pypi but then
creating the tag or GitHub release doesn't work, but in this case we can
restart and the pypi upload is just skipped because we use the skip
existing option.
The release workflow is started by a workflow dispatch with the tag
instead of creating the tag yourself. You can start the release workflow
without a tag to do a dry run which does not publish an artifacts. You
can optionally add a git sha to the workflow run and it will verify that
the release runs on the mentioned commit.
This also adds docs on how to release and a small style improvement for
the maturin integration.
## Test Plan
Testing is hard since we can't do real releases, i've tested a minimized
workflow in a separate dummy repository.
## Summary
From the [subprocess
docs](https://docs.python.org/3/library/subprocess.html#subprocess.Popen):
> Changed in version 3.6: args parameter accepts a path-like object if
shell is False and a sequence containing path-like objects on POSIX.
>
> Changed in version 3.8: args parameter accepts a path-like object if
shell is False and a sequence containing bytes and path-like objects on
Windows.
We want to support python 3.7 on windows, so we need to convert the
`Path` into a `str`
## Summary
I'm looking into the Black stability tests, and here's one failing case.
We split `assert a and (b and c)` into:
```python
assert a
assert (b and c)
```
We fail to split `assert (b and c)` due to the parentheses. But Black
then removes then, and when running Ruff again, we get:
```python
assert a
assert b
assert c
```
This PR just enables us to fix to this in one pass.
## Summary
A new CLI option (`-o`/`--output-file`) to write output to a file
instead of stdout.
Major change is to remove the lock acquired on stdout. The argument is
that the output is buffered and thus the lock is acquired only when
writing a block (8kb). As per the benchmark below there is a slight
performance penalty.
Reference:
https://rustmagazine.org/issue-3/javascript-compiler/#printing-is-slow
## Benchmarks
_Output is truncated to only contain useful information:_
Command: `check --isolated --no-cache --select=ALL --show-source
./test-repos/cpython"`
Latest HEAD (361d45f2b2) with and without
the manual lock on stdout:
```console
Benchmark 1: With lock
Time (mean ± σ): 5.687 s ± 0.075 s [User: 17.110 s, System: 0.486 s]
Range (min … max): 5.615 s … 5.860 s 10 runs
Benchmark 2: Without lock
Time (mean ± σ): 5.719 s ± 0.064 s [User: 17.095 s, System: 0.491 s]
Range (min … max): 5.640 s … 5.865 s 10 runs
Summary
(1) ran 1.01 ± 0.02 times faster than (2)
```
This PR:
```console
Benchmark 1: This PR
Time (mean ± σ): 5.855 s ± 0.058 s [User: 17.197 s, System: 0.491 s]
Range (min … max): 5.786 s … 5.987 s 10 runs
Benchmark 2: Latest HEAD with lock
Time (mean ± σ): 5.645 s ± 0.033 s [User: 16.922 s, System: 0.495 s]
Range (min … max): 5.600 s … 5.712 s 10 runs
Summary
(2) ran 1.04 ± 0.01 times faster than (1)
```
## Test Plan
Run all of the commands which gives output with and without the
`--output-file=ruff.out` option:
* `--show-settings`
* `--show-files`
* `--show-fixes`
* `--diff`
* `--select=ALL`
* `--select=All --show-source`
* `--watch` (only stdout allowed)
resolves: #4754
## Summary
Black supports for layouts when it comes to breaking binary expressions:
```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum BinaryLayout {
/// Put each operand on their own line if either side expands
Default,
/// Try to expand the left to make it fit. Add parentheses if the left or right don't fit.
///
///```python
/// [
/// a,
/// b
/// ] & c
///```
ExpandLeft,
/// Try to expand the right to make it fix. Add parentheses if the left or right don't fit.
///
/// ```python
/// a & [
/// b,
/// c
/// ]
/// ```
ExpandRight,
/// Both the left and right side can be expanded. Try in the following order:
/// * expand the right side
/// * expand the left side
/// * expand both sides
///
/// to make the expression fit
///
/// ```python
/// [
/// a,
/// b
/// ] & [
/// c,
/// d
/// ]
/// ```
ExpandRightThenLeft,
}
```
Our current implementation only handles `ExpandRight` and `Default` correctly. `ExpandLeft` turns out to be surprisingly hard. This PR adds a new `BestFittingMode` parameter to `BestFitting` to support `ExpandLeft`.
There are 3 variants that `ExpandLeft` must support:
**Variant 1**: Everything fits on the line (easy)
```python
[a, b] + c
```
**Variant 2**: Left breaks, but right fits on the line. Doesn't need parentheses
```python
[
a,
b
] + c
```
**Variant 3**: The left breaks, but there's still not enough space for the right hand side. Parenthesize the whole expression:
```python
(
[
a,
b
]
+ ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
)
```
Solving Variant 1 and 2 on their own is straightforward The printer gives us this behavior by nesting right inside of the group of left:
```
group(&format_args![
if_group_breaks(&text("(")),
soft_block_indent(&group(&format_args![
left,
soft_line_break_or_space(),
op,
space(),
group(&right)
])),
if_group_breaks(&text(")"))
])
```
The fundamental problem is that the outer group, which adds the parentheses, always breaks if the left side breaks. That means, we end up with
```python
(
[
a,
b
] + c
)
```
which is not what we want (we only want parentheses if the right side doesn't fit).
Okay, so nesting groups don't work because of the outer parentheses. Sequencing groups doesn't work because it results in a right-to-left breaking which is the opposite of what we want.
Could we use best fitting? Almost!
```
best_fitting![
// All flat
format_args![left, space(), op, space(), right],
// Break left
format_args!(group(&left).should_expand(true), space(), op, space(), right],
// Break all
format_args![
text("("),
block_indent!(&format_args![
left,
hard_line_break(),
op,
space()
right
])
]
]
```
I hope I managed to write this up correctly. The problem is that the printer never reaches the 3rd variant because the second variant always fits:
* The `group(&left).should_expand(true)` changes the group so that all `soft_line_breaks` are turned into hard line breaks. This is necessary because we want to test if the content fits if we break after the `[`.
* Now, the whole idea of `best_fitting` is that you can pretend that some content fits on the line when it actually does not. The way this works is that the printer **only** tests if all the content of the variant **up to** the first line break fits on the line (we insert that line break by using `should_expand(true))`. The printer doesn't care whether the rest `a\n, b\n ] + c` all fits on (multiple?) lines.
Why does breaking right work but not breaking the left? The difference is that we can make the decision whether to parenthesis the expression based on the left expression. We can't do this for breaking left because the decision whether to insert parentheses or not would depend on a lookahead: will the right side break. We simply don't know this yet when printing the parentheses (it would work for the right parentheses but not for the left and indent).
What we kind of want here is to tell the printer: Look, what comes here may or may not fit on a single line but we don't care. Simply test that what comes **after** fits on a line.
This PR adds a new `BestFittingMode` that has a new `AllLines` option that gives us the desired behavior of testing all content and not just up to the first line break.
## Test Plan
I added a new example to `BestFitting::with_mode`
## Summary
Completes the documentation for the `flake8-bugbear` ruleset. Related to
#2646.
## Test Plan
`python scripts/check_docs_formatted.py`
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Now that all identifiers include ranges (#5194), we can remove a ton of
this "custom lexing" code that we have to sketchily extract identifier
ranges from source.
## Test Plan
`cargo test`
## Summary
In https://github.com/astral-sh/RustPython-Parser/pull/8, we modified
RustPython to include ranges for any identifiers that aren't
`Expr::Name` (which already has an identifier).
For example, the `e` in `except ValueError as e` was previously
un-ranged. To extract its range, we had to do some lexing of our own.
This change should improve performance and let us remove a bunch of
code.
## Test Plan
`cargo test`
## Summary
Open cache files in parallel (again), brings the performance back to be roughly equal to the old implementation.
## Test Plan
Existing tests should keep working.
## Summary
Handle trailing newline in Jupyter Notebook JSON string similar to how
`black`
does it.
## Test Plan
Add test cases when the JSON string for notebook ends with and without a
newline.
resolves: #5190
## Summary
Fixes#4404.
Consider this file:
```python
if True:
x = 1; \
<space><space><space>
```
The current implementation of W293 removes the 3 spaces on line 2. This
fix changes the file to:
```python
if True:
x = 1; \
```
A file can't end in a `\`, according to Python's [lexical
analysis](https://docs.python.org/3/reference/lexical_analysis.html), so
subsequent iterations of the autofixer fail (the AST-based ones
specifically, since they depend on a valid syntax tree and get
re-parsed).
This patch examines the line before the line checked in `W293`. If its
first non-whitespace character is a `\`, the patch will extend the
diagnostic's fix range to all whitespace up until the previous line's
*second* non-whitespace character; that is, it deletes all spaces and
potential `\`s up until the next non-whitespace character on the
previous line.
## Test Plan
Ran `cargo run -p ruff_cli -- ~/Downloads/aa.py --fix --select W293,D100
--no-cache` against the above file. This resulted in:
```
/Users/evan/Downloads/aa.py:1:1: D100 Missing docstring in public module
Found 2 errors (1 fixed, 1 remaining).
```
The file's contents, after the fix:
```python
if True:
x = 1;<space>
```
The `\` was removed, leaving the terminal space. The space should be
handled by `Rule::TrailingWhitespace`, not `BlankLineWithWhitespace`.
## Summary
This PR modifies our statement deletion logic to delete any preceding
continuation lines.
For example, given:
```py
x = 1; \
import os
```
We'll now rewrite to:
```py
x = 1;
```
In addition, the logic can now handle multiple preceding continuations
(which is unlikely, but valid).
## Summary
This PR upgrade RustPython to pull in the changes to `Arguments` (zip
defaults with their identifiers) and all the renames to `CmpOp` and
friends.
## Summary
I want to include URLs to the rule documentation in the LSP (the LSP has
a native `code_description` field for this, which, if specified, causes
the source to be rendered as a link to the docs). This PR exposes the
URL to the documentation in the Ruff JSON output.
## Summary
Maintain consistency while deserializing Jupyter notebook to JSON. The
following changes were made:
1. Use string array to store the source value as that's the default
(5781720423/nbformat/v4/nbjson.py (L56-L57))
2. Remove unused structs and enums
3. Reorder the keys in alphabetical order as that's the default.
(5781720423/nbformat/v4/nbjson.py (L51))
### Side effect
Removing the `preserve_order` feature means that the order of keys in
JSON output (`--format json`) will be in alphabetical order. This is
because the value is represented using `serde_json::Value` which
internally is a `BTreeMap`, thus sorting it as per the string key. For
posterity if this turns out to be not ideal, then we could define a
struct representing the JSON object and the order of struct fields will
determine the order in the JSON string.
## Test Plan
Add a test case to assert the raw JSON string.
## Summary
This changes the caching design from one cache file per source file, to
one cache file per package. This greatly reduces the amount of cache
files that are opened and written, while maintaining roughly the same
(combined) size as bincode is very compact.
Below are some very much not scientific performance tests. It uses
projects/sources to check:
* small.py: single, 31 bytes Python file with 2 errors.
* test.py: single, 43k Python file with 8 errors.
* fastapi: FastAPI repo, 1134 files checked, 0 errors.
Source | Before # files | After # files | Before size | After size
-------|-------|-------|-------|-------
small.py | 1 | 1 | 20 K | 20 K
test.py | 1 | 1 | 60 K | 60 K
fastapi | 1134 | 518 | 4.5 M | 2.3 M
One question that might come up is why fastapi still has 518 cache files
and not 1? That is because this is using the existing package
resolution, which sees examples, docs, etc. as separate from the "main"
source code (in the fastapi directory in the repo). In this future it
might be worth consider switching to a one cache file per repo strategy.
This new design is not perfect and does have a number of known issues.
First, like the old design it doesn't remove the cache for a source file
that has been (re)moved until `ruff clean` is called.
Second, this currently uses a large mutex around the mutation of the
package cache (e.g. inserting result). This could be (or become) a
bottleneck. It's future work to test and improve this (if needed).
Third, currently the packages and opened and stored in a sequential
loop, this could be done parallel. This is also future work.
## Test Plan
Run `ruff check` (with caching enabled) twice on any Python source code
and it should produce the same results.
## Summary
We want to ensure that once formatted content stays the same when
formatted again, which is known as formatter stability or formatter
idempotency, and that the formatter prints syntactically valid code. As
our test cases cover only a limited amount of code, this allows checking
entire repositories.
This adds a new subcommand to `ruff_dev` which can be invoked as `cargo
run --bin ruff_dev -- check-formatter-stability <repo>`. While initially
only intended to check stability, it has also found cases where the
formatter printed invalid syntax or panicked.
## Test Plan
Running this on cpython is already identifying bugs
(https://github.com/astral-sh/ruff/pull/5089)
This documentation change improves the section on dangling comments in
the formatter.
---------
Co-authored-by: David Szotten <davidszotten@gmail.com>
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This adds a new subcommand that can be used as
```shell
cargo build --bin ruff_dev --profile=release-debug
perf record -g -F 999 target/release-debug/ruff_dev repeat --repeat 30 --exit-zero --no-cache path/to/cpython > /dev/null
flamegraph --perfdata perf.data
```
## Test Plan
This is a ruff internal script. I successfully used it to profile
cpython with the instructions above
## Summary
We weren't resetting the `allow_ellipsis` flag properly, which
ultimately caused us to treat the semicolon as "unnecessary" rather than
"creating a multi-statement line".
Closes#5154.
## Summary
Given:
```python
\
import os
```
Deleting `import os` leaves a syntax error: a file can't end in a
continuation. We have code to handle this case, but it failed to pick up
continuations at the _very start_ of a file.
Closes#5156.
## Summary
This PR moves the "unconventional import alias" rule (which enforces,
e.g., that `pandas` is imported as `pd`) to the "dead scopes" phase,
after the main linter pass. This (1) avoids an allocation since we no
longer need to create the qualified name in the linter pass; and (2)
will allow us to autofix it, since we'll have access to all references.
## Test Plan
`cargo test` -- all changes are to ranges (which are improvements IMO).
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Format `continue` statement.
## Test Plan
`continue` is used already in some tests, but if a new test is needed I
could add it.
---------
Co-authored-by: konstin <konstin@mailbox.org>