Treat empty-line separated comments as trailing statement comments (#6999)

## Summary

This PR modifies our between-statement comment handling such that
comments that are not separated by a statement by any newlines continue
to be treated as leading comments on the statement, but comments that
_are_ separated are instead formatted as trailing comments on the
preceding statement.

See, e.g., the originating snippet:

```python
DEFAULT_TEMPLATE = "flatpages/default.html"

# This view is called from FlatpageFallbackMiddleware.process_response
# when a 404 is raised, which often means CsrfViewMiddleware.process_view
# has not been called even if CsrfViewMiddleware is installed. So we need
# to use @csrf_protect, in case the template needs {% csrf_token %}.
# However, we can't just wrap this view; if no matching flatpage exists,
# or a redirect is required for authentication, the 404 needs to be returned
# without any CSRF checks. Therefore, we only
# CSRF protect the internal implementation.


def flatpage(request, url):
    pass
```

Here, we need to ensure that the `def flatpage` is precede by two empty
lines. However, we want those two empty lines to be enforced from the
_end_ of the comment block, _unless_ the comments are directly atop the
`def flatpage`.

I played with this a bit, and I think the simplest conceptual model and
implementation is to instead treat those as trailing comments on the
preceding node. The main difficulty with this approach is that, in order
to be fully compatible with Black, we'd sometimes need to insert
newlines _between_ the preceding node and its trailing comments. See,
e.g.:

```python
def func():
    ...
# comment

x = 1
```

In this case, we'd need to insert two blank lines between `def func():
...` and `# comment`, but `# comment` is trailing comment on `def
func(): ...`. So, we'd need to take this case into account in the
various nodes that _require_ newlines after them: functions, classes,
and imports. After some discussion, we've opted _not_ to support this,
and just treat these as trailing comments -- so we won't insert newlines
there. This means our handling is still identical to Black's on
Black-formatted code, but avoids moving such trailing comments on
unformatted code.

I dislike that the empty handling is so complex, and that it's split
between so many different nodes, but this is really tricky. Continuing
to treat these as leading comments is very difficult too, since we'd
need to do similar tricks for the leading comment handling in those
nodes, and influencing leading comments is even harder, since they're
all formatted _before_ the node itself.

Closes https://github.com/astral-sh/ruff/issues/6761.

## Test Plan

`cargo test`

Surprisingly, it doesn't change the similarity at all (apart from a
0.00001 change in CPython), but I manually confirmed that it did fix the
originating issue in Django.

Before:

| project      | similarity index |
|--------------|------------------|
| cpython      | 0.76082          |
| django       | 0.99921          |
| transformers | 0.99854          |
| twine        | 0.99982          |
| typeshed     | 0.99953          |
| warehouse    | 0.99648          |
| zulip        | 0.99928          |


After:

| project      | similarity index |
|--------------|------------------|
| cpython      | 0.76081          |
| django       | 0.99921          |
| transformers | 0.99854          |
| twine        | 0.99982          |
| typeshed     | 0.99953          |
| warehouse    | 0.99648          |
| zulip        | 0.99928          |
This commit is contained in:
Charlie Marsh 2023-08-31 21:55:05 +01:00 committed by GitHub
parent 51d69b448c
commit 376d3caf47
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
21 changed files with 1060 additions and 695 deletions

View file

@ -3,7 +3,9 @@ use ruff_python_ast::{Decorator, StmtClassDef};
use ruff_python_trivia::lines_after_ignoring_trivia;
use ruff_text_size::Ranged;
use crate::comments::format::empty_lines_before_trailing_comments;
use crate::comments::{leading_comments, trailing_comments, SourceComment};
use crate::context::NodeLevel;
use crate::prelude::*;
use crate::statement::clause::{clause_body, clause_header, ClauseHeader};
use crate::statement::suite::SuiteKind;
@ -108,7 +110,33 @@ impl FormatNodeRule<StmtClassDef> for FormatStmtClassDef {
),
clause_body(body, trailing_definition_comments).with_kind(SuiteKind::Class),
]
)?;
// If the class contains trailing comments, insert newlines before them.
// For example, given:
// ```python
// class Class:
// ...
// # comment
// ```
//
// At the top-level, reformat as:
// ```python
// class Class:
// ...
//
//
// # comment
// ```
empty_lines_before_trailing_comments(
comments.trailing(item),
if f.context().node_level() == NodeLevel::TopLevel {
2
} else {
1
},
)
.fmt(f)
}
fn fmt_dangling_comments(

View file

@ -1,9 +1,11 @@
use crate::comments::format::empty_lines_before_trailing_comments;
use ruff_formatter::write;
use ruff_python_ast::{Parameters, StmtFunctionDef};
use ruff_python_trivia::{SimpleTokenKind, SimpleTokenizer};
use ruff_text_size::Ranged;
use crate::comments::SourceComment;
use crate::context::NodeLevel;
use crate::expression::maybe_parenthesize_expression;
use crate::expression::parentheses::{Parentheses, Parenthesize};
use crate::prelude::*;
@ -144,7 +146,33 @@ impl FormatNodeRule<StmtFunctionDef> for FormatStmtFunctionDef {
),
clause_body(body, trailing_definition_comments).with_kind(SuiteKind::Function),
]
)?;
// If the function contains trailing comments, insert newlines before them.
// For example, given:
// ```python
// def func():
// ...
// # comment
// ```
//
// At the top-level, reformat as:
// ```python
// def func():
// ...
//
//
// # comment
// ```
empty_lines_before_trailing_comments(
comments.trailing(item),
if f.context().node_level() == NodeLevel::TopLevel {
2
} else {
1
},
)
.fmt(f)
}
fn fmt_dangling_comments(

View file

@ -2,7 +2,7 @@ use ruff_formatter::{write, FormatOwnedWithRule, FormatRefWithRule, FormatRuleWi
use ruff_python_ast::helpers::is_compound_statement;
use ruff_python_ast::node::AnyNodeRef;
use ruff_python_ast::{self as ast, Constant, Expr, ExprConstant, Stmt, Suite};
use ruff_python_trivia::{lines_after_ignoring_trivia, lines_before};
use ruff_python_trivia::{lines_after, lines_after_ignoring_trivia, lines_before};
use ruff_text_size::{Ranged, TextRange};
use crate::comments::{leading_comments, trailing_comments, Comments};
@ -143,7 +143,11 @@ impl FormatRule<Suite, PyFormatContext<'_>> for FormatSuite {
};
while let Some(following) = iter.next() {
if is_class_or_function_definition(preceding)
// Add empty lines before and after a function or class definition. If the preceding
// node is a function or class, and contains trailing comments, then the statement
// itself will add the requisite empty lines when formatting its comments.
if (is_class_or_function_definition(preceding)
&& !comments.has_trailing_own_line(preceding))
|| is_class_or_function_definition(following)
{
match self.kind {
@ -191,9 +195,13 @@ impl FormatRule<Suite, PyFormatContext<'_>> for FormatSuite {
empty_line().fmt(f)?;
}
}
} else if is_import_definition(preceding) && !is_import_definition(following) {
} else if is_import_definition(preceding)
&& (!is_import_definition(following) || comments.has_leading(following))
{
// Enforce _at least_ one empty line after an import statement (but allow up to
// two at the top-level).
// two at the top-level). In this context, "after an import statement" means that
// that the previous node is an import, and the following node is an import _or_ has
// a leading comment.
match self.kind {
SuiteKind::TopLevel => {
match lines_after_ignoring_trivia(preceding.end(), source) {
@ -274,16 +282,21 @@ impl FormatRule<Suite, PyFormatContext<'_>> for FormatSuite {
// it then counts the lines between the statement and the trailing comment, which is
// always 0. This is why it skips any trailing trivia (trivia that's on the same line)
// and counts the lines after.
lines_after_ignoring_trivia(offset, source)
lines_after(offset, source)
};
let end = comments
.trailing(preceding)
.last()
.map_or(preceding.end(), |comment| comment.slice().end());
match node_level {
NodeLevel::TopLevel => match count_lines(preceding.end()) {
NodeLevel::TopLevel => match count_lines(end) {
0 | 1 => hard_line_break().fmt(f)?,
2 => empty_line().fmt(f)?,
_ => write!(f, [empty_line(), empty_line()])?,
},
NodeLevel::CompoundStatement => match count_lines(preceding.end()) {
NodeLevel::CompoundStatement => match count_lines(end) {
0 | 1 => hard_line_break().fmt(f)?,
_ => empty_line().fmt(f)?,
},