Commit graph

21 commits

Author SHA1 Message Date
Dhruv Manilawala
f3dac27e9a
Fix f-string formatting in assignment statement (#14454)
## Summary

fixes: #13813

This PR fixes a bug in the formatting assignment statement when the
value is an f-string.

This is resolved by using custom best fit layouts if the f-string is (a)
not already a flat f-string (thus, cannot be multiline) and (b) is not a
multiline string (thus, cannot be flattened). So, it is used in cases
like the following:
```py
aaaaaaaaaaaaaaaaaa = f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
    expression}moreeeeeeeeeeeeeeeee"
```
Which is (a) `FStringLayout::Multiline` and (b) not a multiline.

There are various other examples in the PR diff along with additional
explanation and context as code comments.

## Test Plan

Add multiple test cases for various scenarios.
2024-11-26 15:07:18 +05:30
Micha Reiser
9f3a38d408
Extract LineIndex independent methods from Locator (#13938)
Some checks are pending
CI / Fuzz the parser (push) Blocked by required conditions
CI / test scripts (push) Blocked by required conditions
CI / ecosystem (push) Blocked by required conditions
CI / cargo shear (push) Blocked by required conditions
CI / Determine changes (push) Waiting to run
CI / cargo fmt (push) Waiting to run
CI / cargo clippy (push) Blocked by required conditions
CI / python package (push) Waiting to run
CI / cargo test (linux) (push) Blocked by required conditions
CI / cargo test (windows) (push) Blocked by required conditions
CI / cargo test (wasm) (push) Blocked by required conditions
CI / cargo build (release) (push) Blocked by required conditions
CI / cargo build (msrv) (push) Blocked by required conditions
CI / cargo fuzz (push) Blocked by required conditions
CI / pre-commit (push) Waiting to run
CI / mkdocs (push) Waiting to run
CI / formatter instabilities and black similarity (push) Blocked by required conditions
CI / test ruff-lsp (push) Blocked by required conditions
CI / benchmarks (push) Blocked by required conditions
2024-10-28 07:53:41 +00:00
Micha Reiser
73ee72b665
Join implicit concatenated strings when they fit on a line (#13663) 2024-10-24 11:52:22 +02:00
Micha Reiser
8f5b2aac9a
Refactor: Remove StringPart and AnyStringPart in favor of StringLikePart (#13772) 2024-10-16 12:52:06 +02:00
Micha Reiser
b9827a4122
Remove layout values from AnyStringPart (#13681) 2024-10-09 07:25:40 +01:00
Micha Reiser
253f5f269a
refactor: Rename FormatStringContinuation to FormatImplicitConcatenatedString (#13531) 2024-09-27 08:24:50 +00:00
Alex Waygood
246a3388ee
Implement a common trait for the string flags (#11564) 2024-05-27 16:02:01 +01:00
Alex Waygood
6963f75a14
Move string-prefix enumerations to a separate submodule (#11425)
## Summary

This moves the string-prefix enumerations in `ruff_python_ast` to a
separate submodule. I think this helps clarify that these prefixes are
purely abstract: they only depend on each other, and do not depend on
any of the other code in `nodes.rs` in any way. Moreover, while various
AST nodes _use_ them, they're not really nodes themselves, so they feel
slightly out of place in `nodes.rs`.

I considered moving all of them to `str.rs`, but it felt like enough
code that it could be a separate submodule.

## Test Plan

`cargo test`
2024-05-15 07:40:27 -04:00
Dhruv Manilawala
6ecb4776de
Rename AnyStringKind -> AnyStringFlags (#11405)
## Summary

This PR renames `AnyStringKind` to `AnyStringFlags` and `AnyStringFlags`
to `AnyStringFlagsInner`.

The main motivation is to have consistent usage of "kind" and "flags".
For each string kind, it's "flags" like `StringLiteralFlags`,
`BytesLiteralFlags`, and `FStringFlags` but it was `AnyStringKind` for
the "any" variant.
2024-05-13 13:18:07 +00:00
Alex Waygood
7caf0d064a
Simplify formatting of strings by using flags from the AST nodes (#10489) 2024-03-20 16:16:54 +00:00
Alex Waygood
c2e15f38ee
Unify enums used for internal representation of quoting style (#10383) 2024-03-13 17:19:17 +00:00
Dhruv Manilawala
72bf1c2880
Preview minimal f-string formatting (#9642)
## Summary

_This is preview only feature and is available using the `--preview`
command-line flag._

With the implementation of [PEP 701] in Python 3.12, f-strings can now
be broken into multiple lines, can contain comments, and can re-use the
same quote character. Currently, no other Python formatter formats the
f-strings so there's some discussion which needs to happen in defining
the style used for f-string formatting. Relevant discussion:
https://github.com/astral-sh/ruff/discussions/9785

The goal for this PR is to add minimal support for f-string formatting.
This would be to format expression within the replacement field without
introducing any major style changes.

### Newlines

The heuristics for adding newline is similar to that of
[Prettier](https://prettier.io/docs/en/next/rationale.html#template-literals)
where the formatter would only split an expression in the replacement
field across multiple lines if there was already a line break within the
replacement field.

In other words, the formatter would not add any newlines unless they
were already present i.e., they were added by the user. This makes
breaking any expression inside an f-string optional and in control of
the user. For example,

```python
# We wouldn't break this
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"

# But, we would break the following as there's already a newline
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
	aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
```


If there are comments in any of the replacement field of the f-string,
then it will always be a multi-line f-string in which case the formatter
would prefer to break expressions i.e., introduce newlines. For example,

```python
x = f"{ # comment
    a }"
```

### Quotes

The logic for formatting quotes remains unchanged. The existing logic is
used to determine the necessary quote char and is used accordingly.

Now, if the expression inside an f-string is itself a string like, then
we need to make sure to preserve the existing quote and not change it to
the preferred quote unless it's 3.12. For example,

```python
f"outer {'inner'} outer"

# For pre 3.12, preserve the single quote
f"outer {'inner'} outer"

# While for 3.12 and later, the quotes can be changed
f"outer {"inner"} outer"
```

But, for triple-quoted strings, we can re-use the same quote char unless
the inner string is itself a triple-quoted string.

```python
f"""outer {"inner"} outer"""  # valid
f"""outer {'''inner'''} outer"""  # preserve the single quote char for the inner string
```

### Debug expressions

If debug expressions are present in the replacement field of a f-string,
then the whitespace needs to be preserved as they will be rendered as it
is (for example, `f"{ x = }"`. If there are any nested f-strings, then
the whitespace in them needs to be preserved as well which means that
we'll stop formatting the f-string as soon as we encounter a debug
expression.

```python
f"outer {   x =  !s  :.3f}"
#                  ^^
#                  We can remove these whitespaces
```

Now, the whitespace doesn't need to be preserved around conversion spec
and format specifiers, so we'll format them as usual but we won't be
formatting any nested f-string within the format specifier.

### Miscellaneous

- The
[`hug_parens_with_braces_and_square_brackets`](https://github.com/astral-sh/ruff/issues/8279)
preview style isn't implemented w.r.t. the f-string curly braces.
- The
[indentation](https://github.com/astral-sh/ruff/discussions/9785#discussioncomment-8470590)
is always relative to the f-string containing statement

## Test Plan

* Add new test cases
* Review existing snapshot changes
* Review the ecosystem changes

[PEP 701]: https://peps.python.org/pep-0701/
2024-02-16 20:28:11 +05:30
Micha Reiser
fe79798c12
split string module (#9987) 2024-02-14 18:54:55 +01:00
Dhruv Manilawala
6f9c128d77
Separate StringNormalizer from StringPart (#9954)
## Summary

This PR is a small refactor to extract out the logic for normalizing
string in the formatter from the `StringPart` struct. It also separates
the quote selection into a separate method on the new
`StringNormalizer`. Both of these will help in the f-string formatting
to use `StringPart` and `choose_quotes` irrespective of normalization.

The reason for having separate quote selection and normalization step is
so that the f-string formatting can perform quote selection on its own.

Unlike string and byte literals, the f-string formatting would require
that the normalization happens only for the literal elements of it i.e.,
the "foo" and "bar" in `f"foo {x + y} bar"`. This will automatically be
handled by the already separate `normalize_string` function.

Another use-case in the f-string formatting is to extract out the
relevant information from the `StringPart` like quotes and prefix which
is to be passed as context while formatting each element of an f-string.

## Test Plan

Ensure that clippy is happy and all tests pass.
2024-02-13 18:14:56 +05:30
Micha Reiser
4946a1876f
Stabilize quote-style preserve (#9922) 2024-02-12 09:30:07 +00:00
Micha Reiser
4f7fb566f0
Range formatting: Fix invalid syntax after parenthesizing expression (#9751) 2024-02-02 17:56:25 +01:00
Micha Reiser
501bc1c270
Avoid allocating during implicit concatenated string formatting (#9469) 2024-01-11 15:58:49 +01:00
Micha Reiser
ac02d3aedd
Hug multiline-strings preview style (#9243) 2024-01-10 12:47:34 +01:00
Dimitri Papadopoulos Orfanos
d04d49cc0e
Fix typos found by codespell (#9346)
## Summary

Fix typos found by
[codespell](https://github.com/codespell-project/codespell).

## Test Plan

CI tests.
2024-01-02 02:08:15 +00:00
Micha Reiser
5d4825b60f
Normalise Hex and unicode escape sequences in string (#9280) 2023-12-28 09:06:58 +08:00
Dhruv Manilawala
189e947808
Split string formatting to individual nodes (#9058)
This PR splits the string formatting code in the formatter to be handled
by the respective nodes.

Previously, the string formatting was done through a single
`FormatString` interface. Now, the nodes themselves are responsible for
formatting.

The following changes were made:
1. Remove `StringLayout::ImplicitStringConcatenationInBinaryLike` and
inline the call to `FormatStringContinuation`. After the refactor, the
binary like formatting would delegate to `FormatString` which would then
delegate to `FormatStringContinuation`. This removes the intermediary
steps.
2. Add formatter implementation for `FStringPart` which delegates it to
the respective string literal or f-string node.
3. Add `ExprStringLiteralKind` which is either `String` or `Docstring`.
If it's a docstring variant, then the string expression would not be
implicitly concatenated. This is guaranteed by the
`DocstringStmt::try_from_expression` constructor.
4. Add `StringLiteralKind` which is either a `String`, `Docstring` or
`InImplicitlyConcatenatedFString`. The last variant is for when the
string literal is implicitly concatenated with an f-string (`"foo" f"bar
{x}"`).
5. Remove `FormatString`.
6. Extract the f-string quote detection as a standalone function which
is public to the crate. This is used to detect the quote to be used for
an f-string at the expression level (`ExprFString` or
`FormatStringContinuation`).


### Formatter ecosystem result

**This PR**

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75804 | 1799 | 1648 |
| django | 0.99984 | 2772 | 34 |
| home-assistant | 0.99955 | 10596 | 214 |
| poetry | 0.99905 | 321 | 15 |
| transformers | 0.99967 | 2657 | 324 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99980 | 3669 | 18 |
| warehouse | 0.99976 | 654 | 14 |
| zulip | 0.99958 | 1459 | 36 |

**main**

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75804 | 1799 | 1648 |
| django | 0.99984 | 2772 | 34 |
| home-assistant | 0.99955 | 10596 | 214 |
| poetry | 0.99905 | 321 | 15 |
| transformers | 0.99967 | 2657 | 324 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99980 | 3669 | 18 |
| warehouse | 0.99976 | 654 | 14 |
| zulip | 0.99958 | 1459 | 36 |
2023-12-14 12:55:10 -06:00
Renamed from crates/ruff_python_formatter/src/expression/string/mod.rs (Browse further)