Treat empty-line separated comments as trailing statement comments (#6999)

## Summary

This PR modifies our between-statement comment handling such that
comments that are not separated from the following statement by any
empty lines continue to be treated as leading comments on that
statement, but comments that _are_ separated are instead formatted as
trailing comments on the preceding statement.
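As a rough illustration of the rule above (a minimal Python sketch, not the formatter's actual Rust implementation), the lines found between two statements can be split by walking backwards from the following statement. The function name `split_between_statement_lines` is hypothetical, and the sketch assumes that all comments sit at the same indentation as the surrounding statements.

```python
def split_between_statement_lines(lines: list[str]) -> tuple[list[str], list[str]]:
    """Split the lines between two statements into (trailing, leading) comments.

    `lines` holds only the empty lines and own-line comments that sit between
    the preceding and the following statement (hypothetical input shape).
    """
    # Walk backwards from the following statement: comments that touch it
    # with no empty line in between stay leading comments.
    split = len(lines)
    while split > 0 and lines[split - 1].strip().startswith("#"):
        split -= 1

    leading = lines[split:]
    # Every other comment is separated from the following statement by at
    # least one empty line, so it becomes a trailing comment on the
    # preceding statement.
    trailing = [line for line in lines[:split] if line.strip().startswith("#")]
    return trailing, leading


# A comment separated from the next statement by empty lines is trailing.
print(split_between_statement_lines(["", "# a", "", ""]))
# A comment directly atop the next statement is leading.
print(split_between_statement_lines(["", "# a", "# b"]))
```

Only the contiguous run of comments that touches the following statement stays leading; a block split by an empty line is divided between the two statements.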

See, e.g., the originating snippet:

```python
DEFAULT_TEMPLATE = "flatpages/default.html"

# This view is called from FlatpageFallbackMiddleware.process_response
# when a 404 is raised, which often means CsrfViewMiddleware.process_view
# has not been called even if CsrfViewMiddleware is installed. So we need
# to use @csrf_protect, in case the template needs {% csrf_token %}.
# However, we can't just wrap this view; if no matching flatpage exists,
# or a redirect is required for authentication, the 404 needs to be returned
# without any CSRF checks. Therefore, we only
# CSRF protect the internal implementation.


def flatpage(request, url):
    pass
```

Here, we need to ensure that the `def flatpage` is preceded by two empty
lines. However, we want those two empty lines to be enforced from the
_end_ of the comment block, _unless_ the comments are directly atop the
`def flatpage`.
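To make the two layouts concrete, here is a small Python sketch that measures where the empty lines fall relative to the comment block. The helper `blank_lines_around_comments` and the shortened sample sources are illustrative assumptions, not part of the formatter.

```python
def blank_lines_around_comments(source: str) -> tuple[int, int]:
    """Return (empty lines before the comment block, empty lines after it)
    for the comment block preceding the first top-level `def` in `source`."""
    lines = source.splitlines()
    def_index = next(i for i, line in enumerate(lines) if line.startswith("def "))

    # Count the empty lines directly above the `def`.
    i = def_index - 1
    after = 0
    while i >= 0 and not lines[i].strip():
        after += 1
        i -= 1

    # Skip over the comment block itself.
    while i >= 0 and lines[i].lstrip().startswith("#"):
        i -= 1

    # Count the empty lines directly above the comment block.
    before = 0
    while i >= 0 and not lines[i].strip():
        before += 1
        i -= 1
    return before, after


# Comment block separated from the function: the two empty lines are
# enforced from the end of the comment block.
SEPARATED = (
    'DEFAULT_TEMPLATE = "flatpages/default.html"\n'
    "\n"
    "# comment\n"
    "\n"
    "\n"
    "def flatpage(request, url):\n"
    "    pass\n"
)

# Comment block directly atop the function: the two empty lines go
# before the comment block instead.
ATTACHED = (
    'DEFAULT_TEMPLATE = "flatpages/default.html"\n'
    "\n"
    "\n"
    "# comment\n"
    "def flatpage(request, url):\n"
    "    pass\n"
)

print(blank_lines_around_comments(SEPARATED))
print(blank_lines_around_comments(ATTACHED))
```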

I played with this a bit, and I think the simplest conceptual model and
implementation is to instead treat those as trailing comments on the
preceding node. The main difficulty with this approach is that, in order
to be fully compatible with Black, we'd sometimes need to insert
newlines _between_ the preceding node and its trailing comments. See,
e.g.:

```python
def func():
    ...
# comment

x = 1
```

In this case, we'd need to insert two blank lines between `def func():
...` and `# comment`, but `# comment` is a trailing comment on `def
func(): ...`. So, we'd need to take this case into account in the
various nodes that _require_ newlines after them: functions, classes,
and imports. After some discussion, we've opted _not_ to support this,
and just treat these as trailing comments -- so we won't insert newlines
there. This means our handling is still identical to Black's on
Black-formatted code, but avoids moving such trailing comments on
unformatted code.
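The unsupported case can be spotted mechanically. The following is a hedged Python sketch using the standard `ast` module; the helper name `nodes_hugged_by_comment` and the set of node kinds are assumptions drawn from the description above, not code from this PR. It reports top-level functions, classes, and imports that are immediately followed by an own-line comment with no empty line in between.

```python
import ast

# Node kinds that require empty lines after them, per the description above.
NEEDS_BLANK_LINES_AFTER = (
    ast.FunctionDef,
    ast.AsyncFunctionDef,
    ast.ClassDef,
    ast.Import,
    ast.ImportFrom,
)


def nodes_hugged_by_comment(source: str) -> list[str]:
    """Return the kinds of top-level nodes directly followed by a comment line."""
    lines = source.splitlines()
    hits = []
    for node in ast.parse(source).body:
        if not isinstance(node, NEEDS_BLANK_LINES_AFTER):
            continue
        # `end_lineno` is 1-based, so it is also the 0-based index of the
        # line that follows the node.
        next_index = node.end_lineno
        if next_index < len(lines) and lines[next_index].lstrip().startswith("#"):
            hits.append(type(node).__name__)
    return hits


SOURCE = """\
def func():
    ...
# comment

x = 1
"""

print(nodes_hugged_by_comment(SOURCE))
```

Code that has already been formatted by Black should not contain this pattern, which is why leaving such comments in place keeps the output identical on Black-formatted input.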

I dislike that the empty-line handling is so complex, and that it's split
between so many different nodes, but this is really tricky. Continuing
to treat these as leading comments is very difficult too, since we'd
need to do similar tricks for the leading comment handling in those
nodes, and influencing leading comments is even harder, since they're
all formatted _before_ the node itself.

Closes https://github.com/astral-sh/ruff/issues/6761.

## Test Plan

`cargo test`

Surprisingly, it doesn't change the similarity at all (apart from a
0.00001 change in CPython), but I manually confirmed that it did fix the
originating issue in Django.

Before:

| project      | similarity index |
|--------------|------------------|
| cpython      | 0.76082          |
| django       | 0.99921          |
| transformers | 0.99854          |
| twine        | 0.99982          |
| typeshed     | 0.99953          |
| warehouse    | 0.99648          |
| zulip        | 0.99928          |


After:

| project      | similarity index |
|--------------|------------------|
| cpython      | 0.76081          |
| django       | 0.99921          |
| transformers | 0.99854          |
| twine        | 0.99982          |
| typeshed     | 0.99953          |
| warehouse    | 0.99648          |
| zulip        | 0.99928          |
Charlie Marsh 2023-08-31 21:55:05 +01:00 committed by GitHub
parent 51d69b448c
commit 376d3caf47
21 changed files with 1060 additions and 695 deletions


@@ -1,55 +0,0 @@
def test():
# fmt: off
a_very_small_indent
(
not_fixed
)
if True:
pass
more
# fmt: on
formatted
def test():
a_small_indent
# fmt: off
# fix under-indented comments
(or_the_inner_expression +
expressions
)
if True:
pass
# fmt: on
# fmt: off
def test():
pass
# It is necessary to indent comments because the following fmt: on comment because it otherwise becomes a trailing comment
# of the `test` function if the "proper" indentation is larger than 2 spaces.
# fmt: on
disabled + formatting;
# fmt: on
formatted;
def test():
pass
# fmt: off
"""A multiline strings
that should not get formatted"""
"A single quoted multiline \
string"
disabled + formatting;
# fmt: on
formatted;


@@ -0,0 +1,36 @@
def func():
pass
# fmt: off
x = 1
# fmt: on
# fmt: off
def func():
pass
# fmt: on
x = 1
# fmt: off
def func():
pass
# fmt: on
def func():
pass
# fmt: off
def func():
pass
# fmt: off
def func():
pass
# fmt: on
def func():
pass
# fmt: on
def func():
pass


@@ -0,0 +1,161 @@
###
# Blank lines around functions
###
x = 1
# comment
def f():
pass
if True:
x = 1
# comment
def f():
pass
x = 1
# comment
def f():
pass
x = 1
# comment
def f():
pass
x = 1
# comment
# comment
def f():
pass
x = 1
# comment
# comment
def f():
pass
x = 1
# comment
# comment
def f():
pass
x = 1
# comment
# comment
def f():
pass
# comment
def f():
pass
# comment
def f():
pass
# comment
###
# Blank lines around imports.
###
def f():
import x
# comment
import y
def f():
import x
# comment
import y
def f():
import x
# comment
import y
def f():
import x
# comment
import y
def f():
import x
# comment
import y
def f():
import x
# comment
import y
def f():
import x # comment
# comment
import y
def f(): pass # comment
# comment
x = 1
def f():
pass
# comment
x = 1