Commit graph

32 commits

Author SHA1 Message Date
Charlie Marsh
4c53bfe896
Add formatter support for call and class definition Arguments (#6274)
## Summary

This PR leverages the `Arguments` AST node introduced in #6259 in the
formatter, which ensures that we correctly handle trailing comments in
calls, like:

```python
f(
  1,
  # comment
)

pass
```

(Previously, this was treated as a leading comment on `pass`.)

This also allows us to unify the argument handling across calls and
class definitions.

## Test Plan

A bunch of new fixture tests, plus improved Black compatibility.
2023-08-02 11:54:22 -04:00
Micha Reiser
38b5726948
formatter: WithNodeLevel helper (#6212) 2023-07-31 21:22:17 +00:00
Charlie Marsh
615337a54d
Remove newline-insertion logic from JoinNodesBuilder (#6205)
## Summary

This PR moves the "insert empty lines" behavior out of
`JoinNodesBuilder` and into the `Suite` formatter. I find it a little
confusing that the logic is split between those two formatters right
now, and since this is _only_ used in that one place, IMO it is a bit
simpler to just inline it and use a single approach to tracking state
(right now, both are stateful).

The only other place this was used was for decorators. As a side effect,
we now remove blank lines in both of these cases, which is a known but
intentional deviation from Black (which preserves the empty line before
the comment in the first case):

```python
@foo

# Hello
@bar
def baz():
    pass

@foo

@bar
def baz():
    pass
```
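
The effect on decorator lists can be sketched as a small line filter. This is only an illustration of the described behaviour, not the `Suite` formatter itself; the function name and the line-based approach are assumptions made for the example.

```python
def drop_blank_lines_in_decorator_list(lines: list[str]) -> list[str]:
    """Remove empty lines between a decorator and the definition it decorates.

    Hypothetical helper: a line-based approximation of the behaviour described
    above, not the real (AST-based) formatter logic.
    """
    result = []
    in_decorators = False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("@"):
            in_decorators = True
        elif stripped.startswith(("def ", "class ", "async def ")):
            in_decorators = False
        # Empty lines are only dropped while we are inside a decorator list.
        if in_decorators and not stripped:
            continue
        result.append(line)
    return result


source = [
    "@foo",
    "",
    "# Hello",
    "@bar",
    "def baz():",
    "    pass",
    "",
    "@foo",
    "",
    "@bar",
    "def baz():",
    "    pass",
]
formatted = drop_blank_lines_in_decorator_list(source)
```

The empty line between the two function definitions is kept because it is not inside a decorator list.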
2023-07-31 16:58:15 -04:00
Micha Reiser
311a1f9ec4
Remove len from JoinCommaSeparatedBuilder (#6185) 2023-07-31 12:19:47 +00:00
Micha Reiser
40f54375cb
Pull in RustPython parser (#6099) 2023-07-27 09:29:11 +00:00
Micha Reiser
2cf00fee96
Remove parser dependency from ruff-python-ast (#6096) 2023-07-26 17:47:22 +02:00
konsti
46f8961292
Formatter: Add EmptyWithDanglingComments helper (#5951)
**Summary** Add an `EmptyWithDanglingComments` format helper that formats
comments inside empty parentheses, brackets or curly braces. Previously,
this was implemented separately, and partially incorrectly, for each use
case.

Empty `()`, `[]` and `{}` are special because there can be dangling
comments, and they can be in
two positions:
```python
x = [  # end-of-line
    # own line
]
```
These comments are dangling because they can't be assigned to any
element inside as they would
in all other cases.
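
A rough way to see the two positions is to classify each comment by whether it shares a line with code. The sketch below uses Python's `tokenize` module purely for illustration; the function name is hypothetical and this is not how the Rust helper is implemented.

```python
import io
import tokenize

# Token types that do not count as "code" on a line.
_TRIVIA = {
    tokenize.COMMENT,
    tokenize.NL,
    tokenize.NEWLINE,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def classify_comments(source: str) -> list[tuple[str, str]]:
    """Label every comment as 'end-of-line' or 'own-line' (hypothetical helper)."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Lines on which at least one non-trivia token starts.
    code_lines = {tok.start[0] for tok in tokens if tok.type not in _TRIVIA}
    result = []
    for tok in tokens:
        if tok.type == tokenize.COMMENT:
            kind = "end-of-line" if tok.start[0] in code_lines else "own-line"
            result.append((kind, tok.string))
    return result


comments = classify_comments("x = [  # end-of-line\n    # own line\n]\n")
```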

**Test Plan** Added a regression test.

145 (from previously 149) instances of unstable formatting remaining.

```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
145
```
2023-07-23 14:32:16 +02:00
Charlie Marsh
5f3da9955a
Rename ruff_python_whitespace to ruff_python_trivia (#5886)
## Summary

This crate now contains utilities for dealing with trivia more broadly:
whitespace, newlines, "simple" trivia lexing, etc. So renaming it to
reflect its increased responsibilities.

To avoid conflicts, I've also renamed `Token` and `TokenKind` to
`SimpleToken` and `SimpleTokenKind`.
2023-07-19 11:48:27 -04:00
Charlie Marsh
4204fc002d
Remove exception-handler lexing from unused-bound-exception fix (#5851)
## Summary

The motivation here is that it will make this rule easier to rewrite as
a deferred check. Right now, we can't run this rule in the deferred
phase, because it depends on the `except_handler` to power its autofix.
Instead of lexing the `except_handler`, we can use the `SimpleTokenizer`
from the formatter, and just lex forwards and backwards.

For context, this rule detects the unused `e` in:

```python
try:
  pass
except ValueError as e:
  pass
```
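
As a rough illustration of lexing backwards from the bound name, the sketch below removes the ` as e` part using only the name's source range. It is a simplified stand-in (whitespace and the `as` keyword only, no comments or line continuations), not the `SimpleTokenizer` API or the actual fix; the function name is hypothetical.

```python
def remove_exception_binding(source: str, name_start: int, name_end: int) -> str:
    """Delete ` as <name>` given the source range of the bound name."""
    i = name_start
    # Lex backwards over the whitespace before the name.
    while i > 0 and source[i - 1] in " \t":
        i -= 1
    # The preceding token is expected to be the `as` keyword.
    if source[max(i - 2, 0):i] != "as":
        raise ValueError("expected `as` before the bound name")
    i -= 2
    # Skip the whitespace before `as` so the exception type stays untouched.
    while i > 0 and source[i - 1] in " \t":
        i -= 1
    return source[:i] + source[name_end:]


src = "try:\n  pass\nexcept ValueError as e:\n  pass\n"
start = src.index(" e:") + 1
fixed = remove_exception_binding(src, start, start + 1)
```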
2023-07-18 18:27:46 +00:00
Micha Reiser
3cda89ecaf
Parenthesize with statements (#5758)
## Summary

This PR improves the parentheses handling for with items to get closer
to black's formatting.

### Case 1:

```python
# Black / Input
with (
    [
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "bbbbbbbbbb",
        "cccccccccccccccccccccccccccccccccccccccccc",
        dddddddddddddddddddddddddddddddd,
    ] as example1,
    aaaaaaaaaaaaaaaaaaaaaaaaaa
    + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
    + cccccccccccccccccccccccccccc
    + ddddddddddddddddd as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
):
    ...

# Before
with (
    [
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "bbbbbbbbbb",
        "cccccccccccccccccccccccccccccccccccccccccc",
        dddddddddddddddddddddddddddddddd,
    ] as example1,
    (
        aaaaaaaaaaaaaaaaaaaaaaaaaa
        + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
        + cccccccccccccccccccccccccccc
        + ddddddddddddddddd
    ) as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
):
    ...
```

Notice how Ruff wraps the binary expression in an extra set of
parentheses.


### Case 2:
Black does not expand the with-items if the with has no parentheses:

```python
# Black / Input
with aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c:
    ...

# Before
with (
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c
):
    ...
```

Or:

```python
# Black / Input
with [
    "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "bbbbbbbbbb",
    "cccccccccccccccccccccccccccccccccccccccccc",
    dddddddddddddddddddddddddddddddd,
] as example1, aaaaaaaaaaaaaaaaaaaaaaaaaa * bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb * cccccccccccccccccccccccccccc + ddddddddddddddddd as example2, CtxManager222222222222222() as example2:
    ...

# Before (Same as Case 1)
with (
    [
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "bbbbbbbbbb",
        "cccccccccccccccccccccccccccccccccccccccccc",
        dddddddddddddddddddddddddddddddd,
    ] as example1,
    (
        aaaaaaaaaaaaaaaaaaaaaaaaaa
        * bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
        * cccccccccccccccccccccccccccc
        + ddddddddddddddddd
    ) as example2,
    CtxManager222222222222222() as example2,
):
    ...

```
## Test Plan

I added new snapshot tests.

Improves the Django similarity index from 0.973 to 0.977.
2023-07-15 16:03:09 +01:00
Micha Reiser
653429bef9
Handle right parens in join comma builder (#5711) 2023-07-12 18:21:28 +02:00
Micha Reiser
f1d367655b
Format target: annotation = value? expressions (#5661) 2023-07-11 16:40:28 +02:00
Micha Reiser
8665a1a19d
Pass FormatContext to NeedsParentheses
## Summary

I started working on this because I assumed that I would need access to options inside of `NeedsParentheses`, but it then turned out that I won't.
Anyway, it kind of felt nice to pass fewer arguments. So I'm gonna put this out here to get your feedback on whether you prefer this over passing individual fields.

Oh, I sneaked in another change. I renamed `context.contents` to `source`. `contents` is too generic and doesn't tell you anything.

## Test Plan

It compiles
2023-07-11 14:28:50 +02:00
Micha Reiser
715250a179
Prefer expanding parenthesized expressions before operands
## Summary

This PR implements Black's behavior where it first splits off parenthesized expressions before splitting before operands to avoid unnecessary parentheses:

```python
# We want
if a + [
    b,
    c
]:
    pass

# Rather than
if (
    a
    + [b, c]
):
    pass
```

This is implemented by using the new IR elements introduced in #5596. 

* We give the group wrapping the optional parentheses an ID (`parentheses_id`)
* We use `conditional_group` for the lower priority groups  (all non-parenthesized expressions) with the condition that the `parentheses_id` group breaks (we want to split before operands only if the parentheses are necessary)
* We use `fits_expanded` to wrap all other parenthesized expressions (lists, dicts, sets), so that expanding e.g. a list does not also expand the `parentheses_id` group. We gate the `fits_expanded` to only apply if the `parentheses_id` group fits (because we prefer `a\n+[b, c]` over expanding `[b, c]` if the whole expression gets parenthesized).

We limit using `fits_expanded` and `conditional_group` to expressions that are not themselves in parentheses (checking the conditions isn't free).
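
The resulting priority order can be approximated with plain strings. The sketch below is not the IR-based implementation (it knows nothing about `conditional_group` or `fits_expanded`); it only assumes a single binary expression whose right operand is a list, and the function name is made up for the example.

```python
def format_if(left: str, op: str, items: list[str], width: int) -> str:
    """Pick a layout for `if <left> <op> [<items>]:` in priority order."""
    # 1. Everything fits on one line.
    flat = f"if {left} {op} [{', '.join(items)}]:"
    if len(flat) <= width:
        return flat
    # 2. Prefer expanding the parenthesized operand (the list) if the head fits.
    head = f"if {left} {op} ["
    if len(head) <= width:
        body = "".join(f"    {item},\n" for item in items)
        return f"{head}\n{body}]:"
    # 3. Only then add optional parentheses and split before the operator.
    return f"if (\n    {left}\n    {op} [{', '.join(items)}]\n):"


fits = format_if("a", "+", ["b", "c"], width=80)
expanded_list = format_if("a", "+", ["b", "c"], width=10)
parenthesized = format_if("a", "+", ["b", "c"], width=5)
```

With a width of 10 the list is expanded and no extra parentheses are added; only when even the head `if a + [` does not fit does the sketch fall back to splitting before the operator.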

## Test Plan

It increases the Jaccard index for Django from 0.915 to 0.917

## Incompatibilities

There are two incompatibilities left that I'm aware of (there may be more, I didn't go through all snapshot differences). 

### Long string literals
I commented on the regression. The issue is that a very long string (or any content without a split point) may not fit when only breaking the right side. The formatter then inserts the optional parentheses. But this is kind of useless because the overlong string will still not fit, since there are no new split points.

I think we should ignore this incompatibility for now.


### Expressions on statement level

I don't fully understand the logic behind this yet, but Black doesn't break before the operators for the following example even though the expression exceeds the configured line width:

```python
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa < bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb > ccccccccccccccccccccccccccccc == ddddddddddddddddddddd
```

But it would if the expression is used inside of a condition. 

What I understand so far is that Black doesn't insert optional parentheses on the expression statement level (and a few other places) and, therefore, only breaks after opening parentheses. I propose to keep this deviation for now to avoid overlong lines and use the compatibility report to decide whether we should implement the same behavior.
2023-07-11 14:07:39 +02:00
konsti
a647f31600
Don't add a magic trailing comma for a single entry (#5463)
## Summary

If a comma separated list has only one entry, black will respect the
magic trailing comma, but it will not add a new one.

The following code will remain as is:

```python
b1 = [
    aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
]
b2 = [
    aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa,
]
b3 = [
    aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa,
    aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
]
```

## Test Plan

This was first discovered in
7eeadc82c2/django/contrib/admin/checks.py (L674-L681),
which I've minimized into a call test.

I've added tests for the three cases (one entry + no comma, one entry +
comma, more than one entry) to the list tests.

The diffs from the black tests get smaller.
2023-07-03 21:48:44 +02:00
Micha Reiser
38189ed913
Fix invalid printer IR error (#5422) 2023-06-29 08:09:13 +02:00
Micha Reiser
49cabca3e7
Format implicit string continuation (#5328) 2023-06-26 12:41:47 +00:00
Micha Reiser
f18a1f70de
Add tests for skip magic trailing comma
## Summary

This PR adds tests that verify that the magic trailing comma is not respected if disabled in the formatter options. 

Our test setup now allows creating a `<fixture-name>.options.json` file that contains an array of configurations that should be tested.

## Test Plan

It's all about tests :) 

2023-06-26 14:15:55 +02:00
Micha Reiser
dd0d1afb66
Create PyFormatOptions
## Summary

This PR adds a new `PyFormatOptions` struct that stores the python formatter options. 
The new options aren't used yet, with the exception of magical trailing commas and the options passed to the printer. 
I'll follow up with more PRs that use the new options (e.g. `QuoteStyle`).

## Test Plan

`cargo test`. I'll follow up with a new PR that adds support for overriding the options in our fixture tests.
2023-06-26 14:02:17 +02:00
Micha Reiser
d3d69a031e
Add JoinCommaSeparatedBuilder (#5342) 2023-06-23 22:03:05 +01:00
konstin
4b65446de6
Refactor magic trailing comma (#5339)
## Summary

This is a small refactoring to reuse the code that detects the magic
trailing comma across functions. I make this change now to avoid copying
code in a later PR. @MichaReiser is planning on making a larger
refactoring later that integrates with the join nodes builder.

## Test Plan

No functional changes. The magic trailing comma behaviour is checked by
the fixtures.
2023-06-23 18:53:55 +02:00
Micha Reiser
c1cc6f3be1
Add basic Constant formatting (#4954) 2023-06-08 11:42:44 +00:00
konstin
23abad0bd5
A basic StmtAssign formatter and better dummies for expressions (#4938)
* A basic StmtAssign formatter and better dummies for expressions

The goal of this PR was formatting StmtAssign since many nodes in the black tests (and in python in general) are after an assignment. This caused unstable formatting: The spacing of the power operator depends on the type of the two involved expressions, but each expression was formatted as a dummy string and re-parsed as an ExprName, so in the second round the different rules of ExprName were applied, causing unstable formatting.

This PR does not necessarily bring us closer to black's style, but it unlocks a good porting of black's test suite and is a basis for implementing the Expr nodes.

* fmt

* Review
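
The instability described in the first item can be reproduced with a toy model. The sketch below assumes a simplified version of Black's rule (the power operator hugs its operands only when both are names or constants) and a hypothetical formatter that replaces every operand with the placeholder `dummy`; neither function exists in Ruff.

```python
import ast


def format_power(source: str) -> str:
    """Toy formatter: every operand becomes `dummy`, spacing depends on operand types."""
    node = ast.parse(source, mode="eval").body
    assert isinstance(node, ast.BinOp) and isinstance(node.op, ast.Pow)
    # Simplified rule: hug the operator only if both operands are "simple".
    simple = all(
        isinstance(operand, (ast.Name, ast.Constant))
        for operand in (node.left, node.right)
    )
    op = "**" if simple else " ** "
    return f"dummy{op}dummy"


def is_stable(source: str) -> bool:
    """Formatting is stable if a second pass does not change the output."""
    first = format_power(source)
    return format_power(first) == first


first_pass = format_power("f(x) ** 2")   # the call operand is not simple
second_pass = format_power(first_pass)   # both operands are now names
```

The first pass sees a call expression and keeps the spaces, the second pass sees two names and removes them, so the output changes between runs.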
2023-06-08 12:20:25 +02:00
Micha Reiser
bcf745c5ba
Replace verbatim text with NOT_YET_IMPLEMENTED (#4904)
## Summary

This PR replaces the `verbatim_text` builder with a `not_yet_implemented` builder that emits `NOT_YET_IMPLEMENTED_<NodeKind>` for not yet implemented nodes. 

The motivation for this change is that partially formatting compound statements can result in incorrectly indented code, which is a syntax error:

```python
def func_no_args():
  a; b; c
  if True: raise RuntimeError
  if False: ...
  for i in range(10):
    print(i)
    continue
```

Gets reformatted to:

```python
def func_no_args():
    a; b; c
    if True: raise RuntimeError
    if False: ...
    for i in range(10):
    print(i)
    continue
```

because our formatter does not yet support `for` statements and just inserts the text from the source. 
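
To illustrate why a placeholder identifier is safer than verbatim text, the sketch below re-emits a function body using Python's `ast` module and replaces unsupported statements with `NOT_YET_IMPLEMENTED_<NodeKind>`. The set of "supported" nodes and the `Stmt` prefix in the placeholder are assumptions made for the example, and only an argument-less function definition is handled.

```python
import ast

# Hypothetical: pretend only these statement kinds have a real formatter.
SUPPORTED = (ast.Pass, ast.Expr)


def format_function(source: str, indent: str = "    ") -> str:
    """Re-emit a function, replacing unsupported statements with a placeholder."""
    func = ast.parse(source).body[0]
    assert isinstance(func, ast.FunctionDef)
    lines = [f"def {func.name}():"]
    for stmt in func.body:
        if isinstance(stmt, SUPPORTED):
            text = ast.unparse(stmt)
        else:
            # A single identifier can always be indented correctly.
            text = f"NOT_YET_IMPLEMENTED_Stmt{type(stmt).__name__}"
        lines.append(indent + text)
    return "\n".join(lines) + "\n"


output = format_function("def f():\n  a\n  for i in range(10):\n    print(i)\n")
ast.parse(output)  # still valid Python, unlike the verbatim output shown above
```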

## Downsides

Using an identifier will not work in all situations. For example, an identifier is invalid in an `Arguments` position. That's why I kept `verbatim_text` around and e.g. use it in the `Arguments` formatting logic where incorrect indentations are impossible (to my knowledge). Meaning, we can opt in to `verbatim_text` when we want to iterate quickly on nodes for which we don't want to provide a full implementation yet and where using an identifier would be invalid.

## Upsides

Running this on main discovered stability issues with the newline handling that were previously "hidden" because of the verbatim formatting. I guess that's an upside :)

## Test Plan

None?
2023-06-07 14:57:25 +02:00
Micha Reiser
6ab3fc60f4
Correctly handle newlines after/before comments (#4895)
## Summary

This PR fixes the removal of empty lines between a leading comment and the previous statement:

```python
a  = 20

# leading comment
b = 10
```

Ruff removed the empty line between `a` and `b` because:
* The leading comments formatting does not preserve leading newlines (to avoid adding new lines at the top of a body)
* The `JoinNodesBuilder` counted the lines before `b`, which is 1 -> Doesn't insert a new line

This is fixed by changing the `JoinNodesBuilder` to instead count the lines *after* the last node. This correctly gives 1, and the `# leading comment` will insert the empty lines between any other leading comment or the node.
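
The difference between the two counting strategies can be sketched with two small helpers that count newline characters around an offset. The helper names are hypothetical and the counts are newline characters, so two newlines correspond to one empty line.

```python
def newlines_before(source: str, offset: int) -> int:
    """Count newlines directly before `offset`, stopping at the first non-whitespace."""
    count = 0
    for ch in reversed(source[:offset]):
        if ch == "\n":
            count += 1
        elif not ch.isspace():
            break
    return count


def newlines_after(source: str, offset: int) -> int:
    """Count newlines directly after `offset`, stopping at the first non-whitespace."""
    count = 0
    for ch in source[offset:]:
        if ch == "\n":
            count += 1
        elif not ch.isspace():
            break
    return count


source = "a = 20\n\n# leading comment\nb = 10\n"
before_b = newlines_before(source, source.index("b = 10"))  # stops at the comment
after_a = newlines_after(source, len("a = 20"))             # sees the empty line
```

Counting backwards from `b` stops at the comment and finds a single newline (no empty line), whereas counting forwards from the end of `a` finds two newlines, that is, the one empty line that should be preserved.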



## Test Plan

I added a new test for empty lines.
2023-06-07 14:49:43 +02:00
Micha Reiser
3f032cf09d
Format binary expressions (#4862)
* Format Binary Expressions

* Extract NeedsParentheses trait
2023-06-06 08:34:53 +00:00
Micha Reiser
c65f47d7c4
Format while Statement (#4810) 2023-06-05 08:24:00 +00:00
Micha Reiser
ebdc4afc33
Suite formatting and JoinNodesBuilder (#4805) 2023-06-02 14:14:38 +00:00
Charlie Marsh
51bca19c1d
Add builders for common comment rendering (#3232) 2023-02-26 04:16:24 +00:00
Jeong YunWon
84e96cdcd9
More enum work (#3212) 2023-02-25 11:40:16 -05:00
Charlie Marsh
180541a924
Unify comment terminology with that of rome_formatter (#2979) 2023-02-17 03:02:25 +00:00
Charlie Marsh
ca49b00e55
Add initial formatter implementation (#2883)
# Summary

This PR contains the code for the autoformatter proof-of-concept.

## Crate structure

The primary formatting hook is the `fmt` function in `crates/ruff_python_formatter/src/lib.rs`.

The current formatter approach is outlined in `crates/ruff_python_formatter/src/lib.rs`, and is structured as follows:

- Tokenize the code using the RustPython lexer.
- In `crates/ruff_python_formatter/src/trivia.rs`, extract a variety of trivia tokens from the token stream. These include comments, trailing commas, and empty lines.
- Generate the AST via the RustPython parser.
- In `crates/ruff_python_formatter/src/cst.rs`, convert the AST to a CST structure. As of now, the CST is nearly identical to the AST, except that every node gets a `trivia` vector. But we might want to modify it further.
- In `crates/ruff_python_formatter/src/attachment.rs`, attach each trivia token to the corresponding CST node. The logic for this is mostly in `decorate_trivia` and is ported almost directly from Prettier (given each token, find its preceding, following, and enclosing nodes, then attach the token to the appropriate node in a second pass).
- In `crates/ruff_python_formatter/src/newlines.rs`, normalize newlines to match Black’s preferences. This involves traversing the CST and inserting or removing `TriviaToken` values as we go.
- Call `format!` on the CST, which delegates to type-specific formatter implementations (e.g., `crates/ruff_python_formatter/src/format/stmt.rs` for `Stmt` nodes, and similar for `Expr` nodes; the others are trivial). Those type-specific implementations delegate to kind-specific functions (e.g., `format_func_def`).
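
The attachment step (find the preceding, following, and enclosing nodes for each trivia token) can be sketched as follows. The `Node` class and `locate` function are hypothetical and only model source ranges; they are not the types used in `attachment.rs`.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    start: int  # inclusive source offset
    end: int    # exclusive source offset
    children: list["Node"] = field(default_factory=list)


def locate(root: Node, offset: int):
    """Return (enclosing, preceding, following) nodes for a token at `offset`."""
    enclosing = root
    while True:
        preceding = following = None
        descended = False
        for child in enclosing.children:
            if child.start <= offset < child.end:
                # The token lies inside this child: search one level deeper.
                enclosing = child
                descended = True
                break
            if child.end <= offset:
                preceding = child  # last sibling that ends before the token
            elif child.start > offset and following is None:
                following = child  # first sibling that starts after the token
        if not descended:
            return enclosing, preceding, following


tree = Node("module", 0, 30, [
    Node("a", 0, 5),
    Node("b", 10, 20, [Node("c", 11, 13), Node("d", 16, 19)]),
])
inside = tuple(n.name if n else None for n in locate(tree, 14))
between = tuple(n.name if n else None for n in locate(tree, 7))
```

A second pass would then decide, based on these three candidates, whether the token becomes a leading, trailing, or dangling trivia item.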

## Testing and iteration

The formatter is being developed against the Black test suite, which was copied over in-full to `crates/ruff_python_formatter/resources/test/fixtures/black`.

The Black fixtures had to be modified to create [`insta`](https://github.com/mitsuhiko/insta)-compatible snapshots, which now exist in the repo.

My approach thus far has been to try and improve coverage by tackling fixtures one-by-one.

## What works, and what doesn’t

- *Most* nodes are supported at a basic level (though there are a few stragglers at time of writing, like `StmtKind::Try`).
- Newlines are properly preserved in most cases.
- Magic trailing commas are properly preserved in some (but not all) cases.
- Trivial leading and trailing standalone comments mostly work (although maybe not at the end of a file).
- Inline comments, and comments within expressions, often don’t work -- they work in a few cases, but it’s one-off right now. (We’re probably associating them with the “right” nodes more often than we are actually rendering them in the right place.)
- We don’t properly normalize string quotes. (At present, we just repeat any constants verbatim.)
- We’re mishandling a bunch of wrapping cases (if we treat Black as the reference implementation). Here are a few examples (demonstrating Black's stable behavior):

```py
# In some cases, if the end expression is "self-closing" (functions,
# lists, dictionaries, sets, subscript accesses, and any length-two
# boolean operations that end in these elements), Black
# will wrap like this...
if some_expression and f(
    b,
    c,
    d,
):
    pass

# ...whereas we do this:
if (
    some_expression
    and f(
        b,
        c,
        d,
    )
):
    pass

# If function arguments can fit on a single line, then Black will
# format them like this, rather than exploding them vertically.
if f(
    a, b, c, d, e, f, g, ...
):
    pass
```

- We don’t properly preserve parentheses in all cases. Black preserves parentheses in some but not all cases.
2023-02-15 04:06:35 +00:00