Preserve newlines after nested compound statements (#7608)

## Summary

Given:
```python
if True:
    if True:
        pass
    else:
        pass
        # a

        # b
        # c

else:
    pass
```

We want to preserve the newline after the `# c` (before the `else`).
However, the `last_node` ends at the `pass`, and the comments are
trailing comments on the `pass`, not trailing comments on the
`last_node` (the `if`). As such, when counting the trailing newlines on
the outer `if`, we abort as soon as we see the comment (`# a`).

This PR changes the logic to skip _all_ comments (even those with
newlines between them). This is safe as we know that there are no
"leading" comments on the `else`, so there's no risk of skipping those
accidentally.

Closes https://github.com/astral-sh/ruff/issues/7602.

## Test Plan

No change in compatibility.

Before:

| project | similarity index | total files | changed files |

|--------------|------------------:|------------------:|------------------:|
| cpython | 0.76083 | 1789 | 1631 |
| django | 0.99983 | 2760 | 36 |
| transformers | 0.99963 | 2587 | 319 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99979 | 3496 | 22 |
| warehouse | 0.99967 | 648 | 15 |
| zulip | 0.99972 | 1437 | 21 |

After:

| project | similarity index | total files | changed files |

|--------------|------------------:|------------------:|------------------:|
| cpython | 0.76083 | 1789 | 1631 |
| django | 0.99983 | 2760 | 36 |
| transformers | 0.99963 | 2587 | 319 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99983 | 3496 | 18 |
| warehouse | 0.99967 | 648 | 15 |
| zulip | 0.99972 | 1437 | 21 |
This commit is contained in:
Charlie Marsh 2023-09-25 10:21:44 -04:00 committed by GitHub
parent 8ce138760a
commit 17ceb5dcb3
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
8 changed files with 182 additions and 42 deletions

View file

@ -88,10 +88,33 @@ pub fn lines_after(offset: TextSize, code: &str) -> u32 {
newlines
}
/// Counts the empty lines after `offset`, ignoring any trailing trivia: end-of-line comments,
/// own-line comments, and any intermediary newlines.
pub fn lines_after_ignoring_trivia(offset: TextSize, code: &str) -> u32 {
let mut newlines = 0u32;
for token in SimpleTokenizer::starts_at(offset, code) {
match token.kind() {
SimpleTokenKind::Newline => {
newlines += 1;
}
SimpleTokenKind::Whitespace => {}
// If we see a comment, reset the newlines counter.
SimpleTokenKind::Comment => {
newlines = 0;
}
// As soon as we see a non-trivia token, we're done.
_ => {
break;
}
}
}
newlines
}
/// Counts the empty lines after `offset`, ignoring any trailing trivia on the same line as
/// `offset`.
#[allow(clippy::cast_possible_truncation)]
pub fn lines_after_ignoring_trivia(offset: TextSize, code: &str) -> u32 {
pub fn lines_after_ignoring_end_of_line_trivia(offset: TextSize, code: &str) -> u32 {
// SAFETY: We don't support files greater than 4GB, so casting to u32 is safe.
SimpleTokenizer::starts_at(offset, code)
.skip_while(|token| token.kind != SimpleTokenKind::Newline && token.kind.is_trivia())