Show header for formatter comment decoration info
**Summary** Show a header in the formatter comment decoration debug
output that shows which node is preceding/following/enclosing
(https://github.com/astral-sh/ruff/pull/6813#issuecomment-1708119550). I
kept this intentionally condensed to make it easy to use this is a small
sidebar without vertical scrolling.
```console
$ cargo run --bin ruff_python_formatter -- --emit stdout --print-comments scratch.py
# Comment decoration: Range, Preceding, Following, Enclosing, Comment
17..20, Some((ParameterWithDefault, 6..10)), None, (Parameters, 5..22), "# a"
44..47, Some((StmtExpr, 28..39)), Some((StmtExpr, 52..60)), (StmtFunctionDef, 0..60), "# b"
77..80, None, None, (ExprList, 71..82), "# c"
{
Node {
kind: ParameterWithDefault,
range: 6..10,
source: `x=[]`,
}: {
...
```
**Test Plan** It's debug output.
**Summary** The comment visitor used to rebuild the locator for every
comment. Instead, we now keep the locator on the builder. Follow-up to
#6813.
**Test Plan** No formatting changes.
## Summary
This PR modifies our between-statement comment handling such that
comments that are not separated by a statement by any newlines continue
to be treated as leading comments on the statement, but comments that
_are_ separated are instead formatted as trailing comments on the
preceding statement.
See, e.g., the originating snippet:
```python
DEFAULT_TEMPLATE = "flatpages/default.html"
# This view is called from FlatpageFallbackMiddleware.process_response
# when a 404 is raised, which often means CsrfViewMiddleware.process_view
# has not been called even if CsrfViewMiddleware is installed. So we need
# to use @csrf_protect, in case the template needs {% csrf_token %}.
# However, we can't just wrap this view; if no matching flatpage exists,
# or a redirect is required for authentication, the 404 needs to be returned
# without any CSRF checks. Therefore, we only
# CSRF protect the internal implementation.
def flatpage(request, url):
pass
```
Here, we need to ensure that the `def flatpage` is precede by two empty
lines. However, we want those two empty lines to be enforced from the
_end_ of the comment block, _unless_ the comments are directly atop the
`def flatpage`.
I played with this a bit, and I think the simplest conceptual model and
implementation is to instead treat those as trailing comments on the
preceding node. The main difficulty with this approach is that, in order
to be fully compatible with Black, we'd sometimes need to insert
newlines _between_ the preceding node and its trailing comments. See,
e.g.:
```python
def func():
...
# comment
x = 1
```
In this case, we'd need to insert two blank lines between `def func():
...` and `# comment`, but `# comment` is trailing comment on `def
func(): ...`. So, we'd need to take this case into account in the
various nodes that _require_ newlines after them: functions, classes,
and imports. After some discussion, we've opted _not_ to support this,
and just treat these as trailing comments -- so we won't insert newlines
there. This means our handling is still identical to Black's on
Black-formatted code, but avoids moving such trailing comments on
unformatted code.
I dislike that the empty handling is so complex, and that it's split
between so many different nodes, but this is really tricky. Continuing
to treat these as leading comments is very difficult too, since we'd
need to do similar tricks for the leading comment handling in those
nodes, and influencing leading comments is even harder, since they're
all formatted _before_ the node itself.
Closes https://github.com/astral-sh/ruff/issues/6761.
## Test Plan
`cargo test`
Surprisingly, it doesn't change the similarity at all (apart from a
0.00001 change in CPython), but I manually confirmed that it did fix the
originating issue in Django.
Before:
| project | similarity index |
|--------------|------------------|
| cpython | 0.76082 |
| django | 0.99921 |
| transformers | 0.99854 |
| twine | 0.99982 |
| typeshed | 0.99953 |
| warehouse | 0.99648 |
| zulip | 0.99928 |
After:
| project | similarity index |
|--------------|------------------|
| cpython | 0.76081 |
| django | 0.99921 |
| transformers | 0.99854 |
| twine | 0.99982 |
| typeshed | 0.99953 |
| warehouse | 0.99648 |
| zulip | 0.99928 |
## Summary
This PR adds comment handling for comments between the `=` and the
`value` for keywords, as in the following cases:
```python
func(
x # dangling
= # dangling
# dangling
1,
** # dangling
y
)
```
(Comments after the `**` were already handled in some cases, but I've
unified the handling with the `=` handling.)
Note that, previously, comments between the `**` and its value were
rendered as trailing comments on the value (so they'd appear after `y`).
This struck me as odd since it effectively re-ordered the comment with
respect to its closest AST node (the value). I've made them leading
comments, though I don't know that that's a significant improvement. I
could also imagine us leaving them where they are.
## Summary
As a small quality-of-life improvement, the locator can now slice like
`locator.slice(stmt)` instead of requiring
`locator.slice(stmt.range())`.
## Test Plan
`cargo test`
**Summary** Add recursive formatting based on `ruff check` file
discovery for `ruff format`, as a prototype for the formatter alpha.
This allows e.g. `format ../projects/django/`. It's still lacking
support for any settings except line length.
Note just like the existing `ruff format` this will become part of the
production build, i.e. you'll be able to use it - hidden by default and
with a prominent warning - with `ruff format .` after the next release.
Error handling works in my manual tests (the colors do also work):
```
$ target/debug/ruff format scripts/
warning: `ruff format` is a work-in-progress, subject to change at any time, and intended for internal use only.
```
(the above changes `add_rule.py` where we have the wrong bin op
breaking)
```
$ target/debug/ruff format ../projects/django/
warning: `ruff format` is a work-in-progress, subject to change at any time, and intended for internal use only.
Failed to format /home/konsti/projects/django/tests/test_runner_apps/tagged/tests_syntax_error.py: source contains syntax errors: ParseError { error: UnrecognizedToken(Name { name: "syntax_error" }, None), offset: 131, source_path: "<filename>" }
```
```
$ target/debug/ruff format a
warning: `ruff format` is a work-in-progress, subject to change at any time, and intended for internal use only.
Failed to read /home/konsti/ruff/a/d.py: Permission denied (os error 13)
```
**Test Plan** Missing! I'm not sure if it's worth building tests at this
stage or how they should look like.
## Summary
The motivation here is that this enables us to implement `Ranged` in
crates that don't depend on `ruff_python_ast`.
Largely a mechanical refactor with a lot of regex, Clippy help, and
manual fixups.
## Test Plan
`cargo test`
## Summary
This PR introduces two new AST nodes to improve the representation of
`PatternMatchClass`. As a reminder, `PatternMatchClass` looks like this:
```python
case Point2D(0, 0, x=1, y=2):
...
```
Historically, this was represented as a vector of patterns (for the `0,
0` portion) and parallel vectors of keyword names (for `x` and `y`) and
values (for `1` and `2`). This introduces a bunch of challenges for the
formatter, but importantly, it's also really different from how we
represent similar nodes, like arguments (`func(0, 0, x=1, y=2)`) or
parameters (`def func(x, y)`).
So, firstly, we now use a single node (`PatternArguments`) for the
entire parenthesized region, making it much more consistent with our
other nodes. So, above, `PatternArguments` would be `(0, 0, x=1, y=2)`.
Secondly, we now have a `PatternKeyword` node for `x=1` and `y=2`. This
is much more similar to the how `Keyword` is represented within
`Arguments` for call expressions.
Closes https://github.com/astral-sh/ruff/issues/6866.
Closes https://github.com/astral-sh/ruff/issues/6880.