This solves an instability when formatting cpython. It also introduces
another one, but i think it's still a worthwhile change for now.
There's no proper testing since this is just a dummy.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
format StmtFor
still trying to learn how to help out with the formatter. trying
something slightly more advanced than [break](#5158)
mostly copied form StmtWhile
## Test Plan
snapshots
## Summary
Ensures that `--select PL` and `--select PLC` don't include `PLC1901`.
Previously, `--select PL` _did_, because it's a "linter-level selector"
(`--select PLC` is viewed as selecting the `C` prefix from `PL`), and we
were missing this filtering path.
## Summary
This is a follow up to #5221. Turns out it was easy to restructure the
visitor to get the right order, I'm just dumb 🤷♂️ I've
removed `visit_arg_with_default` entirely from the `Visitor`, although
it still exists as part of `preorder::Visitor`.
## Motivation
While black keeps parentheses nearly everywhere, the notable exception
is in the body of for loops:
```python
for (a, b) in x:
pass
```
becomes
```python
for a, b in x:
pass
```
This currently blocks #5163, which this PR should unblock.
## Solution
This changes the `ExprTuple` formatting option to include one additional
option that removes the parentheses when not using magic trailing comma
and not breaking. It is supposed to be used through
```rust
#[derive(Debug)]
struct ExprTupleWithoutParentheses<'a>(&'a Expr);
impl Format<PyFormatContext<'_>> for ExprTupleWithoutParentheses<'_> {
fn fmt(&self, f: &mut Formatter<PyFormatContext<'_>>) -> FormatResult<()> {
match self.0 {
Expr::Tuple(expr_tuple) => expr_tuple
.format()
.with_options(TupleParentheses::StripInsideForLoop)
.fmt(f),
other => other.format().with_options(Parenthesize::IfBreaks).fmt(f),
}
}
}
```
## Testing
The for loop formatting isn't merged due to missing this (and i didn't
want to create more git weirdness across two people), but I've confirmed
that when applying this to while loops instead of for loops, then
```rust
write!(
f,
[
text("while"),
space(),
ExprTupleWithoutParentheses(test.as_ref()),
text(":"),
trailing_comments(trailing_condition_comments),
block_indent(&body.format())
]
)?;
```
makes
```python
while (a, b):
pass
while (
ajssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssa,
b,
):
pass
while (a,b,):
pass
```
formatted as
```python
while a, b:
pass
while (
ajssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssa,
b,
):
pass
while (
a,
b,
):
pass
```
## Summary
According to the AST visitor documentation, the AST visitor "visits all
nodes in the AST recursively in evaluation-order". However, the current
traversal fails to meet this specification in a few places.
### Function traversal
```python
order = []
@(order.append("decorator") or (lambda x: x))
def f(
posonly: order.append("posonly annotation") = order.append("posonly default"),
/,
arg: order.append("arg annotation") = order.append("arg default"),
*args: order.append("vararg annotation"),
kwarg: order.append("kwarg annotation") = order.append("kwarg default"),
**kwargs: order.append("kwarg annotation")
) -> order.append("return annotation"):
pass
print(order)
```
Executing the above snippet using CPython 3.10.6 prints the following
result (formatted for readability):
```python
[
'decorator',
'posonly default',
'arg default',
'kwarg default',
'arg annotation',
'posonly annotation',
'vararg annotation',
'kwarg annotation',
'kwarg annotation',
'return annotation',
]
```
Here we can see that decorators are evaluated first, followed by
argument defaults, and annotations are last. The current traversal of a
function's AST does not align with this order.
### Annotated assignment traversal
```python
order = []
x: order.append("annotation") = order.append("expression")
print(order)
```
Executing the above snippet using CPython 3.10.6 prints the following
result:
```python
['expression', 'annotation']
```
Here we can see that an annotated assignments annotation gets evaluated
after the assignment's expression. The current traversal of an annotated
assignment's AST does not align with this order.
## Why?
I'm slowly working on #3946 and porting over some of the logic and tests
from ssort. ssort is very sensitive to AST traversal order, so ensuring
the utmost correctness here is important.
## Test Plan
There doesn't seem to be existing tests for the AST visitor, so I didn't
bother adding tests for these very subtle changes. However, this
behavior will be captured in the tests for the PR which addresses #3946.
## Summary
This is a complete rewrite of the handling of `/` and `*` comment
handling in function signatures. The key problem is that slash and star
don't have a note. We now parse out the positions of slash and star and
their respective preceding and following note. I've left code comments
for each possible case of function signature structure and comment
placement
## Test Plan
I extended the function statement fixtures with cases that i found. If
you have more weird edge cases your input would be appreciated.
## Summary
This fixes two problems discovered when trying to format the cpython
repo with `cargo run --bin ruff_dev -- check-formatter-stability
projects/cpython`:
The first is to ignore try/except trailing comments for now since they
lead to unstable formatting on the dummy.
The second is to avoid dropping trailing if comments through placement:
This changes the placement to keep a comment trailing an if-elif or
if-elif-else to keep the comment a trailing comment on the entire if.
Previously the last comment would have been lost.
```python
if "first if":
pass
elif "first elif":
pass
```
The last remaining problem in cpython so far is function signature
argument separator comment placement which is its own PR on top of this.
## Test Plan
I added test fixtures of minimized examples with links back to the
original cpython location
## Summary
While fixing https://github.com/astral-sh/ruff/pull/5233, I noticed that
in FastAPI, 343 out of 823 files weren't hitting the cache. It turns out
these are standalone files in the documentation that lack a "package
root". Later, when looking up the cache entries, we fallback to the
package directory.
This PR ensures that we initialize the cache for both kinds of files:
those that are in a package, and those that aren't.
The total size of the FastAPI cache for me is now 388K. I also suspect
that this approach is much faster than as initially written, since
before, we were probably initializing one cache per _directory_.
## Test Plan
Ran `cargo run -p ruff_cli -- check ../fastapi --verbose`; verified
that, on second execution, there were no "Checking" entries in the logs.
## Summary
In the latest release, we made some improvements to the semantic model,
but our modifications to exception-unbinding are causing some
false-positives. For example:
```py
try:
v = 3
except ImportError as v:
print(v)
else:
print(v)
```
In the latest release, we started unbinding `v` after the `except`
handler. (We used to restore the existing binding, the `v = 3`, but this
was quite complicated.) Because we don't have full branch analysis, we
can't then know that `v` is still bound in the `else` branch.
The solution here modifies `resolve_read` to skip-lookup when hitting
unbound exceptions. So when store the "unbind" for `except ImportError
as v`, we save the binding that it shadowed `v = 3`, and skip to that.
Closes#5249.
Closes#5250.
This formats slice expressions and subscript expressions.
Spaces around the colons follows the same rules as black
(https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#slices):
```python
e00 = "e"[:]
e01 = "e"[:1]
e02 = "e"[: a()]
e10 = "e"[1:]
e11 = "e"[1:1]
e12 = "e"[1 : a()]
e20 = "e"[a() :]
e21 = "e"[a() : 1]
e22 = "e"[a() : a()]
e200 = "e"[a() : :]
e201 = "e"[a() :: 1]
e202 = "e"[a() :: a()]
e210 = "e"[a() : 1 :]
```
Comment placement is different due to our very different infrastructure.
If we have explicit bounds (e.g. `x[1:2]`) all comments get assigned as
leading or trailing to the bound expression. If a bound is missing
`[:]`, comments get marked as dangling and placed in the same section as
they were originally in:
```python
x = "x"[ # a
# b
: # c
# d
]
```
to
```python
x = "x"[
# a
# b
:
# c
# d
]
```
Except for the potential trailing end-of-line comments, all comments get
formatted on their own line. This can be improved by keeping end-of-line
comments after the opening bracket or after a colon as such but the
changes were already complex enough.
I added tests for comment placement and spaces.
## Summary
I found it hard to figure out which function decides placement for a
specific comment. An explicit loop makes this easier to debug
## Test Plan
There should be no functional changes, no changes to the formatting of
the fixtures.
This moves all docs about benchmarking and profiling into
CONTRIBUTING.md by moving the readme of `ruff_benchmark` and adding more
information on profiling.
We need to somehow consolidate that documentation, but i'm not convinced
that this is the best way (i tried subpages in mkdocs, but that didn't
seem good either), so i'm happy to take suggestions.
## Summary
Previously, `DecoratedComment` used `text_position()` and
`SourceComment` used `position()`. This PR unifies this to
`line_position` everywhere.
## Test Plan
This is a rename refactoring.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds basic formatting for unary expressions.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a new `unary.py` with custom test cases
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Black supports for layouts when it comes to breaking binary expressions:
```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum BinaryLayout {
/// Put each operand on their own line if either side expands
Default,
/// Try to expand the left to make it fit. Add parentheses if the left or right don't fit.
///
///```python
/// [
/// a,
/// b
/// ] & c
///```
ExpandLeft,
/// Try to expand the right to make it fix. Add parentheses if the left or right don't fit.
///
/// ```python
/// a & [
/// b,
/// c
/// ]
/// ```
ExpandRight,
/// Both the left and right side can be expanded. Try in the following order:
/// * expand the right side
/// * expand the left side
/// * expand both sides
///
/// to make the expression fit
///
/// ```python
/// [
/// a,
/// b
/// ] & [
/// c,
/// d
/// ]
/// ```
ExpandRightThenLeft,
}
```
Our current implementation only handles `ExpandRight` and `Default` correctly. This PR adds support for `ExpandRightThenLeft` and `ExpandLeft`.
## Test Plan
I added tests that play through all 4 binary expression layouts.
## Summary
I initially wanted this category to be more general and decoupled from
the plugin, but I got some feedback that the titling felt inconsistent
with others.
## Summary
This is a proper fix for the issue patched-over in
https://github.com/astral-sh/ruff/pull/5229, thanks to an extremely
helpful repro from @tlambert03 in that thread. It looks like we were
using the keys of `package_roots` rather than the values to initialize
the cache -- but it's a map from package to package root.
## Test Plan
Reverted #5229, then ran through the plan that @tlambert03 included in
https://github.com/astral-sh/ruff/pull/5229#issuecomment-1599723226.
Verified the panic before but not after this change.
## Summary
This PR reverts #4971 (aba073a791). It
turns out that `f"{str(x)}"` and `f"{x}"` are often but not exactly
equivalent, and performing that conversion automatically can lead to
subtle bugs, See the discussion in
https://github.com/astral-sh/ruff/issues/4958.
## Summary
I haven't been able to determine why / when this is happening, but in
some cases, users are reporting that this `unwrap()` is causing a panic.
It's fine to just return `None` here and fallback to "No cache",
certainly better than panicking (while we figure out the edge case).
Closes#5225.
Closes#5228.
## Summary
I'm looking into the Black stability tests, and here's one failing case.
We split `assert a and (b and c)` into:
```python
assert a
assert (b and c)
```
We fail to split `assert (b and c)` due to the parentheses. But Black
then removes then, and when running Ruff again, we get:
```python
assert a
assert b
assert c
```
This PR just enables us to fix to this in one pass.
## Summary
A new CLI option (`-o`/`--output-file`) to write output to a file
instead of stdout.
Major change is to remove the lock acquired on stdout. The argument is
that the output is buffered and thus the lock is acquired only when
writing a block (8kb). As per the benchmark below there is a slight
performance penalty.
Reference:
https://rustmagazine.org/issue-3/javascript-compiler/#printing-is-slow
## Benchmarks
_Output is truncated to only contain useful information:_
Command: `check --isolated --no-cache --select=ALL --show-source
./test-repos/cpython"`
Latest HEAD (361d45f2b2) with and without
the manual lock on stdout:
```console
Benchmark 1: With lock
Time (mean ± σ): 5.687 s ± 0.075 s [User: 17.110 s, System: 0.486 s]
Range (min … max): 5.615 s … 5.860 s 10 runs
Benchmark 2: Without lock
Time (mean ± σ): 5.719 s ± 0.064 s [User: 17.095 s, System: 0.491 s]
Range (min … max): 5.640 s … 5.865 s 10 runs
Summary
(1) ran 1.01 ± 0.02 times faster than (2)
```
This PR:
```console
Benchmark 1: This PR
Time (mean ± σ): 5.855 s ± 0.058 s [User: 17.197 s, System: 0.491 s]
Range (min … max): 5.786 s … 5.987 s 10 runs
Benchmark 2: Latest HEAD with lock
Time (mean ± σ): 5.645 s ± 0.033 s [User: 16.922 s, System: 0.495 s]
Range (min … max): 5.600 s … 5.712 s 10 runs
Summary
(2) ran 1.04 ± 0.01 times faster than (1)
```
## Test Plan
Run all of the commands which gives output with and without the
`--output-file=ruff.out` option:
* `--show-settings`
* `--show-files`
* `--show-fixes`
* `--diff`
* `--select=ALL`
* `--select=All --show-source`
* `--watch` (only stdout allowed)
resolves: #4754
## Summary
Black supports for layouts when it comes to breaking binary expressions:
```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum BinaryLayout {
/// Put each operand on their own line if either side expands
Default,
/// Try to expand the left to make it fit. Add parentheses if the left or right don't fit.
///
///```python
/// [
/// a,
/// b
/// ] & c
///```
ExpandLeft,
/// Try to expand the right to make it fix. Add parentheses if the left or right don't fit.
///
/// ```python
/// a & [
/// b,
/// c
/// ]
/// ```
ExpandRight,
/// Both the left and right side can be expanded. Try in the following order:
/// * expand the right side
/// * expand the left side
/// * expand both sides
///
/// to make the expression fit
///
/// ```python
/// [
/// a,
/// b
/// ] & [
/// c,
/// d
/// ]
/// ```
ExpandRightThenLeft,
}
```
Our current implementation only handles `ExpandRight` and `Default` correctly. `ExpandLeft` turns out to be surprisingly hard. This PR adds a new `BestFittingMode` parameter to `BestFitting` to support `ExpandLeft`.
There are 3 variants that `ExpandLeft` must support:
**Variant 1**: Everything fits on the line (easy)
```python
[a, b] + c
```
**Variant 2**: Left breaks, but right fits on the line. Doesn't need parentheses
```python
[
a,
b
] + c
```
**Variant 3**: The left breaks, but there's still not enough space for the right hand side. Parenthesize the whole expression:
```python
(
[
a,
b
]
+ ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
)
```
Solving Variant 1 and 2 on their own is straightforward The printer gives us this behavior by nesting right inside of the group of left:
```
group(&format_args![
if_group_breaks(&text("(")),
soft_block_indent(&group(&format_args![
left,
soft_line_break_or_space(),
op,
space(),
group(&right)
])),
if_group_breaks(&text(")"))
])
```
The fundamental problem is that the outer group, which adds the parentheses, always breaks if the left side breaks. That means, we end up with
```python
(
[
a,
b
] + c
)
```
which is not what we want (we only want parentheses if the right side doesn't fit).
Okay, so nesting groups don't work because of the outer parentheses. Sequencing groups doesn't work because it results in a right-to-left breaking which is the opposite of what we want.
Could we use best fitting? Almost!
```
best_fitting![
// All flat
format_args![left, space(), op, space(), right],
// Break left
format_args!(group(&left).should_expand(true), space(), op, space(), right],
// Break all
format_args![
text("("),
block_indent!(&format_args![
left,
hard_line_break(),
op,
space()
right
])
]
]
```
I hope I managed to write this up correctly. The problem is that the printer never reaches the 3rd variant because the second variant always fits:
* The `group(&left).should_expand(true)` changes the group so that all `soft_line_breaks` are turned into hard line breaks. This is necessary because we want to test if the content fits if we break after the `[`.
* Now, the whole idea of `best_fitting` is that you can pretend that some content fits on the line when it actually does not. The way this works is that the printer **only** tests if all the content of the variant **up to** the first line break fits on the line (we insert that line break by using `should_expand(true))`. The printer doesn't care whether the rest `a\n, b\n ] + c` all fits on (multiple?) lines.
Why does breaking right work but not breaking the left? The difference is that we can make the decision whether to parenthesis the expression based on the left expression. We can't do this for breaking left because the decision whether to insert parentheses or not would depend on a lookahead: will the right side break. We simply don't know this yet when printing the parentheses (it would work for the right parentheses but not for the left and indent).
What we kind of want here is to tell the printer: Look, what comes here may or may not fit on a single line but we don't care. Simply test that what comes **after** fits on a line.
This PR adds a new `BestFittingMode` that has a new `AllLines` option that gives us the desired behavior of testing all content and not just up to the first line break.
## Test Plan
I added a new example to `BestFitting::with_mode`
## Summary
Completes the documentation for the `flake8-bugbear` ruleset. Related to
#2646.
## Test Plan
`python scripts/check_docs_formatted.py`
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>