Commit graph

28 commits

Author SHA1 Message Date
Micha Reiser
9c3fb23ace
Simple lexer for formatter (#4922) 2023-06-08 17:37:39 +02:00
Micha Reiser
6ab3fc60f4
Correctly handle newlines after/before comments (#4895)
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This issue fixes the removal of empty lines between a leading comment and the previous statement:

```python
a  = 20

# leading comment
b = 10
```

Ruff removed the empty line between `a` and `b` because:
* The leading comments formatting does not preserve leading newlines (to avoid adding new lines at the top of a body)
* The `JoinNodesBuilder` counted the lines before `b`, which is 1 -> Doesn't insert a new line

This is fixed by changing the `JoinNodesBuilder` to count the lines instead *after* the last node. This correctly gives 1, and the `# leading comment` will insert the empty lines between any other leading comment or the node.



## Test Plan

I added a new test for empty lines.
2023-06-07 14:49:43 +02:00
Micha Reiser
3f032cf09d
Format binary expressions (#4862)
* Format Binary Expressions

* Extract NeedsParentheses trait
2023-06-06 08:34:53 +00:00
Micha Reiser
c65f47d7c4
Format while Statement (#4810) 2023-06-05 08:24:00 +00:00
Micha Reiser
5d939222db
Leading, Dangling, and Trailing comments formatting (#4785) 2023-06-02 09:26:36 +02:00
Micha Reiser
4ea4fd1984
Introduce lines_before helper (#4780) 2023-06-01 11:56:43 +02:00
Micha Reiser
59148344be
Place comments of left and right binary expression operands (#4751) 2023-06-01 07:01:32 +00:00
Micha Reiser
06bcb85f81
formatter: Remove CST and old formatting (#4730) 2023-05-31 08:27:23 +02:00
Micha Reiser
edc6c4058f
Move shared_traits to ruff_formatter (#4632) 2023-05-24 17:38:11 +02:00
Charlie Marsh
f0465bf106
Emit non-logical newlines for "empty" lines (#4444) 2023-05-16 14:58:56 +00:00
Micha Reiser
cab65b25da
Replace row/column based Location with byte-offsets. (#3931) 2023-04-26 18:11:02 +00:00
Charlie Marsh
0a9d259f9c
Remove copied core modules from ruff_python_formatter (#3371) 2023-03-08 19:03:40 +00:00
Charlie Marsh
f5f09b489b
Introduce dedicated CST tokens for other operator kinds (#3267) 2023-02-27 23:54:57 -05:00
Charlie Marsh
061495a9eb
Make BoolOp its own located token (#3265) 2023-02-28 03:43:28 +00:00
Charlie Marsh
470e1c1754
Preserve comments on non-defaulted arguments (#3264) 2023-02-27 23:41:40 +00:00
Charlie Marsh
2261e194a0
Create dedicated Body nodes in the formatter CST (#3223) 2023-02-27 22:55:05 +00:00
Charlie Marsh
1c75071136
Implement basic rendering of remaining AST nodes (#3233) 2023-02-26 05:05:56 +00:00
Jeong YunWon
84e96cdcd9
More enum work (#3212) 2023-02-25 11:40:16 -05:00
Charlie Marsh
6eaacf96be
Introduce a new CST element for slice segments (#3195) 2023-02-24 00:49:41 +00:00
Charlie Marsh
bda2a0007a
Parenthesize numbers during attribute accesses (#3189) 2023-02-23 14:57:23 -05:00
Charlie Marsh
2f9de335db
Upgrade RustPython to match new flattened exports (#3141) 2023-02-22 19:36:13 +00:00
Micha Reiser
ffd8e958fc
chore: Upgrade Rust to 1.67.0 (#3125) 2023-02-22 10:03:17 -05:00
Charlie Marsh
cdc4e86158
Add support for TryStar (#3089) 2023-02-21 13:42:20 -05:00
Charlie Marsh
ce8953442d
Add support for trailing colons in slice expressions (#3077) 2023-02-20 23:24:32 +00:00
Charlie Marsh
6e02405bd6
Add StmtKind::Try; fix trailing newlines (#3074) 2023-02-20 22:55:32 +00:00
Charlie Marsh
180541a924
Unify comment terminology with that of rome_formatter (#2979) 2023-02-17 03:02:25 +00:00
Charlie Marsh
cb971d3a48
Respect self as positional-only argument in annotation rules (#2927) 2023-02-15 15:25:17 +00:00
Charlie Marsh
ca49b00e55
Add initial formatter implementation (#2883)
# Summary

This PR contains the code for the autoformatter proof-of-concept.

## Crate structure

The primary formatting hook is the `fmt` function in `crates/ruff_python_formatter/src/lib.rs`.

The current formatter approach is outlined in `crates/ruff_python_formatter/src/lib.rs`, and is structured as follows:

- Tokenize the code using the RustPython lexer.
- In `crates/ruff_python_formatter/src/trivia.rs`, extract a variety of trivia tokens from the token stream. These include comments, trailing commas, and empty lines.
- Generate the AST via the RustPython parser.
- In `crates/ruff_python_formatter/src/cst.rs`, convert the AST to a CST structure. As of now, the CST is nearly identical to the AST, except that every node gets a `trivia` vector. But we might want to modify it further.
- In `crates/ruff_python_formatter/src/attachment.rs`, attach each trivia token to the corresponding CST node. The logic for this is mostly in `decorate_trivia` and is ported almost directly from Prettier (given each token, find its preceding, following, and enclosing nodes, then attach the token to the appropriate node in a second pass).
- In `crates/ruff_python_formatter/src/newlines.rs`, normalize newlines to match Black’s preferences. This involves traversing the CST and inserting or removing `TriviaToken` values as we go.
- Call `format!` on the CST, which delegates to type-specific formatter implementations (e.g., `crates/ruff_python_formatter/src/format/stmt.rs` for `Stmt` nodes, and similar for `Expr` nodes; the others are trivial). Those type-specific implementations delegate to kind-specific functions (e.g., `format_func_def`).

## Testing and iteration

The formatter is being developed against the Black test suite, which was copied over in-full to `crates/ruff_python_formatter/resources/test/fixtures/black`.

The Black fixtures had to be modified to create `[insta](https://github.com/mitsuhiko/insta)`-compatible snapshots, which now exist in the repo.

My approach thus far has been to try and improve coverage by tackling fixtures one-by-one.

## What works, and what doesn’t

- *Most* nodes are supported at a basic level (though there are a few stragglers at time of writing, like `StmtKind::Try`).
- Newlines are properly preserved in most cases.
- Magic trailing commas are properly preserved in some (but not all) cases.
- Trivial leading and trailing standalone comments mostly work (although maybe not at the end of a file).
- Inline comments, and comments within expressions, often don’t work -- they work in a few cases, but it’s one-off right now. (We’re probably associating them with the “right” nodes more often than we are actually rendering them in the right place.)
- We don’t properly normalize string quotes. (At present, we just repeat any constants verbatim.)
- We’re mishandling a bunch of wrapping cases (if we treat Black as the reference implementation). Here are a few examples (demonstrating Black's stable behavior):

```py
# In some cases, if the end expression is "self-closing" (functions,
# lists, dictionaries, sets, subscript accesses, and any length-two
# boolean operations that end in these elments), Black
# will wrap like this...
if some_expression and f(
    b,
    c,
    d,
):
    pass

# ...whereas we do this:
if (
    some_expression
    and f(
        b,
        c,
        d,
    )
):
    pass

# If function arguments can fit on a single line, then Black will
# format them like this, rather than exploding them vertically.
if f(
    a, b, c, d, e, f, g, ...
):
    pass
```

- We don’t properly preserve parentheses in all cases. Black preserves parentheses in some but not all cases.
2023-02-15 04:06:35 +00:00