Commit graph

56 commits

Author SHA1 Message Date
Micha Reiser
9ae698fe30
Switch to Rust 2024 edition (#18129) 2025-05-16 13:25:28 +02:00
Micha Reiser
196e4befba
Update MSRV to 1.85 and toolchain to 1.87 (#18126) 2025-05-16 09:19:55 +02:00
Douglas Creager
ef85c682bd
Remove customizable reference enum names (#15647)
The AST generator creates a reference enum for each syntax group — an
enum where each variant contains a reference to the relevant syntax
node. Previously you could customize the name of the reference enum for
a group — primarily because there was an existing `ExpressionRef` type
that wouldn't have lined up with the auto-derived name `ExprRef`. This
follow-up PR is a simple search/replace to switch over to the
auto-derived name, so that we can remove this customization point.
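For illustration, a reference enum pairs each syntax node in a group with a variant that borrows that node. A minimal sketch with stand-in node types (not Ruff's actual definitions):

```rust
// Stand-in AST nodes; the real ones live in ruff_python_ast.
struct ExprName {
    id: String,
}
struct ExprCall {
    func: String,
}

/// A "reference enum" for the expression group: one variant per node,
/// each holding a reference to that node.
enum ExprRef<'a> {
    Name(&'a ExprName),
    Call(&'a ExprCall),
}

fn main() {
    let name = ExprName { id: "x".to_string() };
    let expr = ExprRef::Name(&name);
    match expr {
        ExprRef::Name(name) => println!("name: {}", name.id),
        ExprRef::Call(call) => println!("call: {}", call.func),
    }
}
```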
2025-01-21 13:46:31 -05:00
Micha Reiser
9442cd8fae
Parenthesize match..case if guards (#13513) 2024-09-26 06:44:33 +00:00
Micha Reiser
531ebf6dff
Fix parentheses around return type annotations (#13381) 2024-09-20 09:23:53 +02:00
Dhruv Manilawala
549cc1e437
Build CommentRanges outside the parser (#11792)
## Summary

This PR updates the parser so that it no longer builds the `CommentRanges`;
instead, the linter and the formatter build them when required.

For the linter, the ranges are built and owned by the `Indexer`, while for
the formatter they're built from the `Tokens` struct and passed as an
argument.
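Conceptually, building the comment ranges outside the parser amounts to filtering an already-lexed token stream for comment tokens. A minimal sketch with simplified stand-in types (not Ruff's actual ones):

```rust
// Simplified stand-ins for the real token kind and range types.
#[derive(Clone, Copy, PartialEq)]
enum TokenKind {
    Comment,
    Name,
}

#[derive(Clone, Copy)]
struct TextRange {
    start: u32,
    end: u32,
}

/// Sorted ranges of all `# ...` comments in the source.
struct CommentRanges(Vec<TextRange>);

fn comment_ranges(tokens: &[(TokenKind, TextRange)]) -> CommentRanges {
    CommentRanges(
        tokens
            .iter()
            .filter(|(kind, _)| *kind == TokenKind::Comment)
            .map(|(_, range)| *range)
            .collect(),
    )
}

fn main() {
    let tokens = [
        (TokenKind::Name, TextRange { start: 0, end: 1 }),
        (TokenKind::Comment, TextRange { start: 2, end: 12 }),
    ];
    let CommentRanges(ranges) = comment_ranges(&tokens);
    assert_eq!(ranges.len(), 1);
}
```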

## Test Plan

`cargo insta test`
2024-06-09 09:55:17 +00:00
Dhruv Manilawala
bf5b62edac
Maintain synchronicity between the lexer and the parser (#11457)
## Summary

This PR updates the entire parser stack in multiple ways:

### Make the lexer lazy

* https://github.com/astral-sh/ruff/pull/11244
* https://github.com/astral-sh/ruff/pull/11473

Previously, Ruff's lexer would act as an iterator. The parser would
collect all the tokens in a vector first and then process the tokens to
create the syntax tree.

The first task in this project is to update the entire parsing flow to
make the lexer lazy. This includes the `Lexer`, `TokenSource`, and
`Parser`. For context, the `TokenSource` is a wrapper around the `Lexer`
to filter out the trivia tokens[^1]. Now, the parser asks the token source
for the next token, and only then does the lexer continue and emit it. This
means the lexer needs to be aware of the "current" token: when `next_token`
is called, the current token is updated with the newly lexed token.
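A minimal model of this design, where the lexer owns a "current" token that advances only when the parser asks for the next one (illustrative code, not Ruff's actual implementation):

```rust
// Sketch of a lazy lexer that owns the "current" token. The parser calls
// `next_token` to advance; nothing is lexed ahead of time.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TokenKind {
    Name,
    Newline,
    EndOfFile,
}

struct Lexer<'src> {
    source: &'src str,
    offset: usize,
    current: TokenKind,
}

impl<'src> Lexer<'src> {
    fn new(source: &'src str) -> Self {
        let mut lexer = Self { source, offset: 0, current: TokenKind::EndOfFile };
        lexer.next_token(); // Prime the first token.
        lexer
    }

    /// Lex exactly one more token and make it the current token.
    fn next_token(&mut self) -> TokenKind {
        let source = self.source;
        while source[self.offset..].starts_with(' ') {
            self.offset += 1; // Skip spaces (real trivia handling elided).
        }
        let rest = &source[self.offset..];
        self.current = match rest.chars().next() {
            None => TokenKind::EndOfFile,
            Some('\n') => {
                self.offset += 1;
                TokenKind::Newline
            }
            Some(_) => {
                // Consume one whitespace-delimited "word" as a Name token.
                let len = rest.find(|c: char| c == ' ' || c == '\n').unwrap_or(rest.len());
                self.offset += len;
                TokenKind::Name
            }
        };
        self.current
    }
}

fn main() {
    let mut lexer = Lexer::new("foo bar\n");
    assert_eq!(lexer.current, TokenKind::Name); // `foo`, lexed on construction
    assert_eq!(lexer.next_token(), TokenKind::Name); // `bar`
    assert_eq!(lexer.next_token(), TokenKind::Newline);
    assert_eq!(lexer.next_token(), TokenKind::EndOfFile);
}
```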

The main motivation for making the lexer lazy is to allow re-lexing a token
in a different context. This is going to be really useful for making the
parser error resilient. For example, currently the emitted tokens remain
the same even if the parser can recover from an unclosed parenthesis. This
is important because the lexer emits a `NonLogicalNewline` in a
parenthesized context but a normal `Newline` in a non-parenthesized
context. This distinction between newline kinds is also used to emit the
indentation tokens, which the parser relies on to determine the start and
end of a block.

Additionally, this allows us to implement the following functionalities:
1. Checkpoint-rewind infrastructure: the idea is to create a checkpoint and
continue lexing; at a later point, the lexer can be rewound back to that
checkpoint.
2. Remove the `SoftKeywordTransformer` and instead use lookahead or
speculative parsing to determine whether a soft keyword is a keyword or
an identifier.
3. Remove the `Tok` enum. The `Tok` enum represents the tokens emitted by
the lexer, but it contains owned data, which makes it expensive to clone.
The new `TokenKind` enum just represents the type of token, which is very
cheap to copy.

This raises the question of how the parser will get the owned value that
was stored on `Tok`. This is solved by introducing a new `TokenValue` enum
which covers only the subset of token kinds that carry an owned value. It
is stored on the lexer and requested by the parser when it wants to process
the data. For example:
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L1260-L1262)

[^1]: Trivia tokens are `NonLogicalNewline` and `Comment`
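A simplified sketch of the `Tok` → `TokenKind` + `TokenValue` split (illustrative shapes, not Ruff's exact definitions):

```rust
// Before: one enum carrying owned data, expensive to clone.
#[allow(dead_code)] // kept only for comparison
enum Tok {
    Name(String), // owned `String` makes every clone allocate
    Int(i64),
    Newline,
}

// After: a cheap `Copy` kind that the parser passes around freely...
#[derive(Clone, Copy, PartialEq, Debug)]
enum TokenKind {
    Name,
    Int,
    Newline,
}

// ...plus a value, stored on the lexer, that the parser requests only for
// the few token kinds that actually carry data.
enum TokenValue {
    None,
    Name(String),
    Int(i64),
}

fn main() {
    let kind = TokenKind::Name; // `Copy`: freely duplicated
    let value = TokenValue::Name("x".to_string()); // owned data, on demand
    assert_eq!(kind, TokenKind::Name);
    if let TokenValue::Name(name) = value {
        assert_eq!(name, "x");
    }
}
```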

### Remove `SoftKeywordTransformer`

* https://github.com/astral-sh/ruff/pull/11441
* https://github.com/astral-sh/ruff/pull/11459
* https://github.com/astral-sh/ruff/pull/11442
* https://github.com/astral-sh/ruff/pull/11443
* https://github.com/astral-sh/ruff/pull/11474

For context,
https://github.com/RustPython/RustPython/pull/4519/files#diff-5de40045e78e794aa5ab0b8aacf531aa477daf826d31ca129467703855408220
added support for soft keywords in the parser, using infinite lookahead to
classify a soft keyword as a keyword or an identifier. This is a brilliant
idea: it wraps the existing lexer and works on top of it, which keeps the
logic for lexing and re-lexing a soft keyword separate. The change here is
to remove `SoftKeywordTransformer` and let the parser determine this based
on context, lookahead, and speculative parsing.

* **Context:** The transformer needs to know whether the lexer is at a
statement position or a simple-statement position. This is because a
`match` token starts a compound statement while a `type` token starts a
simple statement. **The parser already knows this.**
* **Lookahead:** Now that the parser knows the context, it can perform a
lookahead of up to two tokens to classify the soft keyword. The logic for
this is described in the PRs implementing it for the `type` and `match`
soft keywords.
* **Speculative parsing:** This is where the checkpoint-rewind
infrastructure helps. For the `match` soft keyword, there are cases that
can't be classified based on lookahead alone. The idea is to create a
checkpoint and keep parsing; based on whether the parse was successful and
which tokens are ahead, we can classify the remaining cases (see the sketch
after this list). Refer to #11443 for more details.
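A conceptual sketch of that checkpoint-rewind flow, using a toy parser over string tokens (hypothetical names, not Ruff's actual parser):

```rust
// Speculative parsing: snapshot the parser, try one interpretation, and
// rewind if it fails.
struct Checkpoint {
    token_index: usize,
}

struct Parser {
    tokens: Vec<&'static str>,
    index: usize,
}

impl Parser {
    fn checkpoint(&self) -> Checkpoint {
        Checkpoint { token_index: self.index }
    }

    fn rewind(&mut self, checkpoint: Checkpoint) {
        self.index = checkpoint.token_index;
    }

    /// Classify `match` by speculatively parsing it as a statement.
    fn classify_match(&mut self) -> &'static str {
        let checkpoint = self.checkpoint();
        if self.try_parse_match_statement() {
            "keyword"
        } else {
            self.rewind(checkpoint); // Undo the speculative parse.
            "identifier"
        }
    }

    /// Stand-in for parsing `match <subject>:`; here we just check that a
    /// subject token is followed by a colon.
    fn try_parse_match_statement(&mut self) -> bool {
        self.index += 2; // consume `match` and the subject
        self.tokens.get(self.index).map_or(false, |token| *token == ":")
    }
}

fn main() {
    // `match` used as a statement keyword...
    let mut parser = Parser { tokens: vec!["match", "x", ":"], index: 0 };
    assert_eq!(parser.classify_match(), "keyword");

    // ...versus `match` used as a plain identifier, e.g. `match = 1`.
    let mut parser = Parser { tokens: vec!["match", "=", "1"], index: 0 };
    assert_eq!(parser.classify_match(), "identifier");
}
```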

If the soft keyword is being parsed in an identifier context, it'll be
converted to an identifier and the emitted token will be updated as well.
Refer to
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L487-L491).

The `case` soft keyword doesn't require any special handling because
it'll be a keyword only in the context of a match statement.

### Update the parser API

* https://github.com/astral-sh/ruff/pull/11494
* https://github.com/astral-sh/ruff/pull/11505

Now that the lexer is in sync with the parser, and the parser helps
determine whether a soft keyword is a keyword or an identifier, the lexer
cannot be used on its own. The reason is that it isn't sensitive to the
context (which is correct). This means that the parser API needs to be
updated to not allow any access to the lexer.

Previously, there were multiple ways to parse the source code:
1. Passing the source code itself
2. Or, passing the tokens

Now that the lexer and parser work together, the API corresponding to (2)
cannot exist. The final API is mentioned in this PR description:
https://github.com/astral-sh/ruff/pull/11494.
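For illustration, a hypothetical shape of such a source-only entry point (the names, fields, and signature here are stand-ins; see #11494 for the actual API):

```rust
// Stub types standing in for the real AST, token, and error types.
struct Module;
struct Tokens;
struct ParseError;

/// One call yields everything downstream tools need: the syntax tree plus
/// the tokens that the lexer and parser produced together.
struct Parsed {
    syntax: Module,
    tokens: Tokens,
    errors: Vec<ParseError>,
}

/// Source in, parsed output out; there is no way to hand in pre-lexed
/// tokens, because the lexer and parser must run in lockstep.
fn parse_module(source: &str) -> Parsed {
    let _ = source; // Real lexing/parsing elided in this sketch.
    Parsed { syntax: Module, tokens: Tokens, errors: Vec::new() }
}
```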

### Refactor the downstream tools (linter and formatter)

* https://github.com/astral-sh/ruff/pull/11511
* https://github.com/astral-sh/ruff/pull/11515
* https://github.com/astral-sh/ruff/pull/11529
* https://github.com/astral-sh/ruff/pull/11562
* https://github.com/astral-sh/ruff/pull/11592

And, the final set of changes involves updating all references to the lexer
and the `Tok` enum. This was done in two parts:
1. Update all the references in a way that doesn't require any changes
from this PR i.e., it can be done independently
	* https://github.com/astral-sh/ruff/pull/11402
	* https://github.com/astral-sh/ruff/pull/11406
	* https://github.com/astral-sh/ruff/pull/11418
	* https://github.com/astral-sh/ruff/pull/11419
	* https://github.com/astral-sh/ruff/pull/11420
	* https://github.com/astral-sh/ruff/pull/11424
2. Update all the remaining references to use the changes made in this
PR

For (2), various strategies were used:
1. Introduce a new `Tokens` struct which wraps the token vector and adds
methods to query certain subsets of tokens (see the sketch after this
list). These include:
	1. `up_to_first_unknown`, which replaces the `tokenize` function
	2. `in_range` and `after`, which replace the `lex_starts_at` function:
the former returns the tokens within the given range, while the latter
returns all the tokens after the given offset
2. Introduce a new `TokenFlags` which is a set of flags to query certain
information from a token. Currently, this information is limited to
string-type tokens but can be expanded to include other information in the
future as needed. https://github.com/astral-sh/ruff/pull/11578
3. Move the `CommentRanges` to the parsed output because this information
is common to both the linter and the formatter. This removes the need for
the `tokens_and_ranges` function.
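A simplified sketch of those queries, with a toy token type and a naive `in_range` for clarity (method names follow the description above, not necessarily Ruff's exact signatures):

```rust
// Toy token with a flat byte range; the real type is richer.
#[derive(Clone, Copy)]
struct Token {
    start: u32,
    end: u32,
    is_unknown: bool,
}

/// Wrapper over the token vector with range-based queries.
struct Tokens(Vec<Token>);

impl Tokens {
    /// Tokens up to (but not including) the first unknown token.
    fn up_to_first_unknown(&self) -> &[Token] {
        let end = self
            .0
            .iter()
            .position(|token| token.is_unknown)
            .unwrap_or(self.0.len());
        &self.0[..end]
    }

    /// Tokens fully contained in `start..end`. A real implementation can
    /// binary-search because the tokens are sorted by offset; we filter
    /// (and copy) for simplicity.
    fn in_range(&self, start: u32, end: u32) -> Vec<Token> {
        self.0
            .iter()
            .copied()
            .filter(|token| token.start >= start && token.end <= end)
            .collect()
    }

    /// All tokens starting at or after `offset`.
    fn after(&self, offset: u32) -> &[Token] {
        &self.0[self.0.partition_point(|token| token.start < offset)..]
    }
}
```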

## Test Plan

- [x] Update and verify the test snapshots
- [x] Make sure the entire test suite is passing
- [x] Make sure there are no changes in the ecosystem checks
- [x] Run the fuzzer on the parser
- [x] Run this change on dozens of open-source projects

### Running this change on dozens of open-source projects

Refer to the PR description to get the list of open source projects used
for testing.

Now, the following tests were done between `main` and this branch:
1. Compare the output of `--select=E999` (syntax errors)
2. Compare the output of default rule selection
3. Compare the output of `--select=ALL`

**Conclusion: all outputs were the same.**

## What's next?

The next step is to introduce the re-lexing logic and update the parser to
feed recovery information to the lexer so that it can emit the correct
token. This moves us one step closer to error resilience in the parser and
gives Ruff the ability to lint even if the source code contains syntax
errors.
2024-06-03 18:23:50 +05:30
Micha Reiser
91046e4c81
Preserve indent around multiline strings (#9637) 2024-01-26 08:18:30 +01:00
Micha Reiser
ac02d3aedd
Hug multiline-strings preview style (#9243) 2024-01-10 12:47:34 +01:00
Charlie Marsh
e80260a3c5
Remove source path from parser errors (#9322)
## Summary

I always found it odd that we had to pass this in, since it's really
higher-level context for the error. The awkwardness is further evidenced
by the fact that we pass in fake values everywhere (even outside of
tests). The source path isn't actually used to display the error; it's
only accessed elsewhere to _re-display_ the error in certain cases. This
PR instead passes the path directly in those cases.
2023-12-30 20:33:05 +00:00
Micha Reiser
5aaf99b856
Implement the fix_power_op_line_length preview style (#8947) 2023-12-02 09:35:34 +09:00
Charlie Marsh
019d9aebe9
Implement multiline dictionary and list hugging for preview style (#8293)
## Summary

This PR implements Black's new single-argument hugging for lists, sets,
and dictionaries under preview style.

For example, this:

```python
foo(
    [
        1,
        2,
        3,
    ]
)
```

Would instead now be formatted as:

```python
foo([
    1,
    2,
    3,
])
```

A couple notes:

- This doesn't apply when the argument has a magic trailing comma.
- This _does_ apply when the argument is starred or double-starred.
- We don't apply this when there are comments before or after the
argument, though Black does in some cases (and moves the comments
outside the call parentheses).

It doesn't say it in the originating PR
(https://github.com/psf/black/pull/3964), but I think this also applies
to parenthesized expressions? At least, it does in my testing of preview
vs. stable, though it's possible that behavior predated the linked PR.

See: #8279.

## Test Plan

Before:

| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75804 | 1799 | 1648 |
| django | 0.99984 | 2772 | 34 |
| home-assistant | 0.99963 | 10596 | 146 |
| poetry | 0.99925 | 317 | 12 |
| transformers | 0.99967 | 2657 | 322 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99980 | 3669 | 18 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 21 |

After:

| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75804 | 1799 | 1648 |
| django | 0.99984 | 2772 | 34 |
| home-assistant | 0.99963 | 10596 | 146 |
| poetry | 0.96215 | 317 | 34 |
| transformers | 0.99967 | 2657 | 322 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99980 | 3669 | 18 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 21 |
2023-11-30 21:11:14 -05:00
konsti
dca430f4d2
Fix instability with await fluent style (#8676)
Fix an instability where await was followed by a breaking fluent style
expression:

```python
test_data = await (
    Stream.from_async(async_data)
    .flat_map_async()
    .map()
    .filter_async(is_valid_data)
    .to_list()
)
```

Note that this is technically a minor style change (see the ecosystem check).
2023-11-17 12:24:19 -05:00
Dhruv Manilawala
3e00ddce38
Preserve trailing semicolon for Notebooks (#8590)
## Summary

This PR updates the formatter to preserve the trailing semicolon for
Jupyter Notebooks.

The motivation behind the change is that semicolons in notebooks are
typically used to hide the output, for example when plotting: a trailing
semicolon on a cell's last expression, as in `df.plot();`, suppresses the
echoed return value. This is highlighted in the linked issue.

The conditions under which the trailing semicolon is preserved are:
1. It must be a top-level statement that is last in the module.
2. For statements, it can be an assignment, annotated assignment, or
augmented assignment whose target is a single identifier; multiple
assignments and tuple unpacking aren't considered.
3. For expressions, any expression qualifies.

## Test Plan

Add a new integration test in `ruff_cli`. The test notebook essentially
documents which trailing semicolons are preserved.

fixes: #8254
2023-11-10 21:53:35 +05:30
Charlie Marsh
d685107638
Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030)
This is a do-over of https://github.com/astral-sh/ruff/pull/8011, which
I accidentally merged into a non-`main` branch. Sorry!
2023-10-18 00:01:18 +00:00
Micha Reiser
192463c2fb
Allow parenthesized content to exceed the configured line width (#7490) 2023-09-20 08:39:25 +02:00
Micha Reiser
6a4dbd622b
Add optimized best_fit_parenthesize IR (#7475) 2023-09-19 06:29:05 +00:00
konsti
2cbe1733c8
Use CommentRanges in backwards lexing (#7360)
## Summary

The tokenizer was split into a forward and a backwards tokenizer. The
backwards tokenizer uses the same names as the forward one (e.g.
`next_token`), and it receives the comment ranges that we already built so
it can skip comments.
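Conceptually, the backwards tokenizer can consult the precomputed ranges to jump over a whole comment instead of re-lexing it. A simplified stand-in (not the actual `ruff_python_trivia` types):

```rust
// Byte range of a `# ...` comment, as previously collected.
#[derive(Clone, Copy)]
struct TextRange {
    start: usize,
    end: usize,
}

/// Walks the source backwards, jumping over whole comments instead of
/// re-lexing them character by character.
struct BackwardsScanner<'a> {
    source: &'a str,
    offset: usize, // current position, moving toward 0
    comment_ranges: &'a [TextRange],
}

impl<'a> BackwardsScanner<'a> {
    fn next_char_back(&mut self) -> Option<char> {
        // If we're standing at the end of a known comment, skip it entirely.
        if let Some(comment) = self
            .comment_ranges
            .iter()
            .find(|range| range.end == self.offset)
        {
            self.offset = comment.start;
        }
        let ch = self.source[..self.offset].chars().next_back()?;
        self.offset -= ch.len_utf8();
        Some(ch)
    }
}

fn main() {
    let source = "x = 1  # trailing comment";
    let comment_start = source.find('#').unwrap();
    let ranges = [TextRange { start: comment_start, end: source.len() }];
    let mut scanner = BackwardsScanner {
        source,
        offset: source.len(),
        comment_ranges: &ranges,
    };
    // The first character seen is the one *before* the comment.
    assert_eq!(scanner.next_char_back(), Some(' '));
}
```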

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-09-16 03:21:45 +00:00
Micha Reiser
2d9b39871f
Introduce IndentWidth (#7301) 2023-09-13 14:52:24 +02:00
Micha Reiser
e376c3ff7e
Split implicit concatenated strings before binary expressions (#7145) 2023-09-08 06:51:26 +00:00
Micha Reiser
c05e4628b1
Introduce Token element (#7048) 2023-09-02 10:05:47 +02:00
Charlie Marsh
fc89976c24
Move Ranged into ruff_text_size (#6919)
## Summary

The motivation here is that this enables us to implement `Ranged` in
crates that don't depend on `ruff_python_ast`.

Largely a mechanical refactor with a lot of regex, Clippy help, and
manual fixups.

## Test Plan

`cargo test`
2023-08-27 14:12:51 -04:00
Micha Reiser
9d77552e18
Add tab width option (#6848) 2023-08-26 12:29:58 +02:00
Micha Reiser
04a9a8dd03
Maybe parenthesize long constants and names (#6816) 2023-08-24 09:47:57 +00:00
Micha Reiser
ccac9681e1
Preserve yield parentheses (#6766) 2023-08-22 10:27:20 +00:00
Micha Reiser
8b347cdaa9
Simplify IfRequired needs parentheses condition (#6678) 2023-08-21 07:11:31 +00:00
Charlie Marsh
3ceb6fbeb0
Remove some unnecessary ampersands in the formatter (#6667) 2023-08-18 04:18:26 +00:00
Charlie Marsh
1334232168
Introduce ExpressionRef (#6637)
## Summary

This PR revives the `ExpressionRef` concept introduced in
https://github.com/astral-sh/ruff/pull/5644, motivated by the change we
want to make in https://github.com/astral-sh/ruff/pull/6575 to narrow
the type of the expression that can be passed to `parenthesized_range`.

## Test Plan

`cargo test`
2023-08-17 10:07:16 -04:00
Micha Reiser
daac31d2b9
Make Buffer::write_element non-failable (#6613) 2023-08-16 15:13:07 +02:00
Charlie Marsh
53246b725e
Allow return type annotations to use their own parentheses (#6436)
## Summary

This PR modifies our logic for wrapping return type annotations.
Previously, we _always_ wrapped the annotation in parentheses if it
expanded; however, Black only exhibits this behavior when the function's
parameter list is empty (i.e., it doesn't and can't break). In other
cases, it uses the normal parenthesization rules, allowing nodes to bring
their own parentheses.

For example, given:

```python
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
    ...

def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(x) -> Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
    ...
```

Black will format as:

```python
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> (
    Set[
        "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ]
):
    ...


def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(
    x,
) -> Set[
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
    ...
```

Whereas, prior to this PR, Ruff would format as:

```python
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> (
    Set[
        "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ]
):
    ...


def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(
    x,
) -> (
    Set[
        "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ]
):
    ...
```

Closes https://github.com/astral-sh/ruff/issues/6431.

## Test Plan

Before:

- `zulip`: 0.99702
- `django`: 0.99784
- `warehouse`: 0.99585
- `build`: 0.75623
- `transformers`: 0.99470
- `cpython`: 0.75988
- `typeshed`: 0.74853

After:

- `zulip`: 0.99724
- `django`: 0.99791
- `warehouse`: 0.99586
- `build`: 0.75623
- `transformers`: 0.99474
- `cpython`: 0.75956
- `typeshed`: 0.74857
2023-08-11 18:19:21 +00:00
Charlie Marsh
c7703e205d
Move empty_parenthesized into parentheses.rs (#6403)
## Summary

This PR moves `empty_parenthesized` so that it's a peer of `parenthesized`,
and changes the API to better match that of `parenthesized` (it takes
`&str` rather than `StaticText`, has a `with_dangling_comments` method,
etc.).

It may be intentionally _not_ part of `parentheses.rs`, but to me
they're so similar that it makes more sense for them to be in the same
module, with the same API, etc.
2023-08-08 19:17:17 +00:00
Micha Reiser
2bd345358f
Simplify parenthesized formatting (#6419) 2023-08-08 08:50:57 +00:00
Charlie Marsh
8919b6ad9a
Add a with_dangling_comments to the parenthesized formatter (#6402)
See: https://github.com/astral-sh/ruff/pull/6376#discussion_r1285514328.
2023-08-07 19:12:12 +00:00
Charlie Marsh
b763973357
Avoid hard line break after dangling open-parenthesis comments (#6380)
## Summary

Given:

```python
[  # comment
    first,
    second,
    third
]  # another comment
```

We were adding a hard line break as part of the formatting of `#
comment`, which led to the following formatting:

```python
[first, second, third]  # comment
  # another comment
```

Closes https://github.com/astral-sh/ruff/issues/6367.
2023-08-07 14:15:32 +00:00
konsti
99baad12d8
Call chain formatting in fluent style (#6151)
Implement fluent style/call chains. See the `call_chains.py` formatting
for examples.

This isn't fully like Black because in `raise A from B`, Black allows a
break in `A` to influence the formatting of `B` even if `B` is already
multiline.

Similarity index:

| project      | main  | PR    |
|--------------|-------|-------|
| build        | ???   | 0.753 |
| django       | 0.991 | 0.998 |
| transformers | 0.993 | 0.994 |
| typeshed     | 0.723 | 0.723 |
| warehouse    | 0.978 | 0.994 |
| zulip        | 0.992 | 0.994 |

Call chain formatting is affected by
https://github.com/astral-sh/ruff/issues/627, but I'm cutting scope
here.

Closes #5343

**Test Plan**:
 * Added a dedicated call chains test file
 * The ecosystem checks found some bugs
 * I manually checked django and zulip formatting

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-08-04 13:58:01 +00:00
Charlie Marsh
1d8759d5df
Generalize comment-after-bracket handling to lists, sets, etc. (#6320)
## Summary

We already support preserving the end-of-line comment in calls and type
parameters, as in:

```python
foo(  # comment
    bar,
)
```

This PR adds the same behavior for lists, sets, comprehensions, etc.,
such that we preserve:

```python
[  # comment
    1,
    2,
    3,
]
```

And related cases.
2023-08-04 01:28:05 +00:00
Micha Reiser
38b5726948
formatter: WithNodeLevel helper (#6212) 2023-07-31 21:22:17 +00:00
Micha Reiser
40f54375cb
Pull in RustPython parser (#6099) 2023-07-27 09:29:11 +00:00
Micha Reiser
2cf00fee96
Remove parser dependency from ruff-python-ast (#6096) 2023-07-26 17:47:22 +02:00
Charlie Marsh
5f3da9955a
Rename ruff_python_whitespace to ruff_python_trivia (#5886)
## Summary

This crate now contains utilities for dealing with trivia more broadly:
whitespace, newlines, "simple" trivia lexing, etc. So renaming it to
reflect its increased responsibilities.

To avoid conflicts, I've also renamed `Token` and `TokenKind` to
`SimpleToken` and `SimpleTokenKind`.
2023-07-19 11:48:27 -04:00
Charlie Marsh
4204fc002d
Remove exception-handler lexing from unused-bound-exception fix (#5851)
## Summary

The motivation here is that it will make this rule easier to rewrite as
a deferred check. Right now, we can't run this rule in the deferred
phase, because it depends on the `except_handler` to power its autofix.
Instead of lexing the `except_handler`, we can use the `SimpleTokenizer`
from the formatter, and just lex forwards and backwards.

For context, this rule detects the unused `e` in:

```python
try:
  pass
except ValueError as e:
  pass
```
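To illustrate the forwards/backwards idea, here's a crude stand-in that scans the raw header text backwards from the handler body to recover the bound name; the real fix uses the formatter's `SimpleTokenizer`, which lexes proper tokens instead of splitting on whitespace:

```rust
/// Find the name bound by `except ... as <name>:` by scanning backwards
/// from the start of the handler body. Simplified: a real implementation
/// lexes tokens rather than splitting on whitespace.
fn bound_name(source: &str, body_start: usize) -> Option<&str> {
    // Text of the `except ...:` header that precedes the body.
    let header = source[..body_start].trim_end().strip_suffix(':')?.trim_end();
    // The bound name is the last word, and it must be preceded by `as`.
    let (rest, name) = header.rsplit_once(char::is_whitespace)?;
    let (_, keyword) = rest.trim_end().rsplit_once(char::is_whitespace)?;
    (keyword == "as").then_some(name)
}

fn main() {
    let source = "try:\n    pass\nexcept ValueError as e:\n    pass\n";
    let body_start = source.rfind("    pass").unwrap();
    assert_eq!(bound_name(source, body_start), Some("e"));
}
```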
2023-07-18 18:27:46 +00:00
Micha Reiser
3b32e3a8fe
perf(formatter): Improve is_expression_parenthesized performance (#5825) 2023-07-18 15:48:49 +02:00
Micha Reiser
3cda89ecaf
Parenthesize with statements (#5758)

## Summary

This PR improves the parentheses handling for with items to get closer
to Black's formatting.

### Case 1:

```python
# Black / Input
with (
    [
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "bbbbbbbbbb",
        "cccccccccccccccccccccccccccccccccccccccccc",
        dddddddddddddddddddddddddddddddd,
    ] as example1,
    aaaaaaaaaaaaaaaaaaaaaaaaaa
    + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
    + cccccccccccccccccccccccccccc
    + ddddddddddddddddd as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
):
    ...

# Before
with (
    [
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "bbbbbbbbbb",
        "cccccccccccccccccccccccccccccccccccccccccc",
        dddddddddddddddddddddddddddddddd,
    ] as example1,
    (
        aaaaaaaaaaaaaaaaaaaaaaaaaa
        + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
        + cccccccccccccccccccccccccccc
        + ddddddddddddddddd
    ) as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
):
    ...
```

Notice how Ruff wraps the binary expression in an extra set of
parentheses.


### Case 2:
Black does not expand the with-items if the with has no parentheses:

```python
# Black / Input
with aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c:
    ...

# Before
with (
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c
):
    ...
```

Or 

```python
# Black / Input
with [
    "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "bbbbbbbbbb",
    "cccccccccccccccccccccccccccccccccccccccccc",
    dddddddddddddddddddddddddddddddd,
] as example1, aaaaaaaaaaaaaaaaaaaaaaaaaa * bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb * cccccccccccccccccccccccccccc + ddddddddddddddddd as example2, CtxManager222222222222222() as example2:
    ...

# Before (Same as Case 1)
with (
    [
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "bbbbbbbbbb",
        "cccccccccccccccccccccccccccccccccccccccccc",
        dddddddddddddddddddddddddddddddd,
    ] as example1,
    (
        aaaaaaaaaaaaaaaaaaaaaaaaaa
        * bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
        * cccccccccccccccccccccccccccc
        + ddddddddddddddddd
    ) as example2,
    CtxManager222222222222222() as example2,
):
    ...

```
## Test Plan

I added new snapshot tests.

This improves the django similarity index from 0.973 to 0.977.
2023-07-15 16:03:09 +01:00
Micha Reiser
8187bf9f7e
Cover Black's is_aritmetic_like formatting (#5738) 2023-07-14 17:54:58 +02:00
Micha Reiser
067b2a6ce6
Pass parent to NeedsParentheses (#5708) 2023-07-13 08:57:29 +02:00
Micha Reiser
f1d367655b
Format target: annotation = value? expressions (#5661) 2023-07-11 16:40:28 +02:00
Micha Reiser
8665a1a19d
Pass FormatContext to NeedsParentheses

## Summary

I started working on this because I assumed that I would need access to
options inside of `NeedsParentheses`, but it then turned out that I won't.
Anyway, it kind of felt nice to pass fewer arguments, so I'm putting this
out here to get your feedback on whether you prefer this over passing
individual fields.

Oh, I sneaked in another change: I renamed `context.contents` to `source`.
`contents` is too generic and doesn't tell you anything.


## Test Plan

It compiles
2023-07-11 14:28:50 +02:00
Micha Reiser
715250a179
Prefer expanding parenthesized expressions before operands

## Summary

This PR implements Black's behavior of first splitting off parenthesized expressions before splitting before operands, to avoid unnecessary parentheses:

```python
# We want
if a + [
    b,
    c
]:
    pass

# Rather than
if (
    a
    + [b, c]
):
    pass
```

This is implemented by using the new IR elements introduced in #5596. 

* We give the group wrapping the optional parentheses an ID (`parentheses_id`).
* We use `conditional_group` for the lower-priority groups (all non-parenthesized expressions) with the condition that the `parentheses_id` group breaks (we want to split before operands only if the parentheses are necessary).
* We use `fits_expanded` to wrap all other parenthesized expressions (lists, dicts, sets), to prevent the expansion of e.g. a list from expanding the `parentheses_id` group. We gate `fits_expanded` to only apply if the `parentheses_id` group fits (because we prefer `a\n+[b, c]` over expanding `[b, c]` if the whole expression gets parenthesized).

We limit `fits_expanded` and `conditional_group` to expressions that are not themselves parenthesized (checking the conditions isn't free).

## Test Plan

It increases the Jaccard index for Django from 0.915 to 0.917

## Incompatibilities

There are two incompatibilities left that I'm aware of (there may be more, I didn't go through all snapshot differences). 

### Long string literals
I commented on the regression. The issue is that a very long string (or any content without a split point) may not fit when only breaking the right side. The formatter then inserts the optional parentheses, but this is kind of useless because the overlong string will still not fit: there are no new split points.

I think we should ignore this incompatibility for now.


### Expressions on statement level

I don't fully understand the logic behind this yet, but Black doesn't
break before the operators in the following example even though the
expression exceeds the configured line width:
```python
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa < bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb > ccccccccccccccccccccccccccccc == ddddddddddddddddddddd
```

But it would if the expression were used inside a condition.

What I understand so far is that Black doesn't insert optional parentheses at the expression-statement level (and a few other places) and, therefore, only breaks after opening parentheses. I propose to keep this deviation for now to avoid overlong lines, and to use the compatibility report to decide whether we should implement the same behavior.
2023-07-11 14:07:39 +02:00
Micha Reiser
40ddc1604c
Introduce parenthesized helper (#5565) 2023-07-07 11:28:25 +02:00
Micha Reiser
49cabca3e7
Format implicit string continuation (#5328) 2023-06-26 12:41:47 +00:00