Commit graph

807 commits

Author SHA1 Message Date
Micha Reiser
1711bca4a0
FString formatting: remove fstring handling in normalize_string (#10119) 2024-02-25 18:28:46 +01:00
Dhruv Manilawala
72bf1c2880
Preview minimal f-string formatting (#9642)
## Summary

_This is preview only feature and is available using the `--preview`
command-line flag._

With the implementation of [PEP 701] in Python 3.12, f-strings can now
be broken into multiple lines, can contain comments, and can re-use the
same quote character. Currently, no other Python formatter formats the
f-strings so there's some discussion which needs to happen in defining
the style used for f-string formatting. Relevant discussion:
https://github.com/astral-sh/ruff/discussions/9785

The goal for this PR is to add minimal support for f-string formatting.
This would be to format expression within the replacement field without
introducing any major style changes.

### Newlines

The heuristics for adding newline is similar to that of
[Prettier](https://prettier.io/docs/en/next/rationale.html#template-literals)
where the formatter would only split an expression in the replacement
field across multiple lines if there was already a line break within the
replacement field.

In other words, the formatter would not add any newlines unless they
were already present i.e., they were added by the user. This makes
breaking any expression inside an f-string optional and in control of
the user. For example,

```python
# We wouldn't break this
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"

# But, we would break the following as there's already a newline
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
	aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
```


If there are comments in any of the replacement field of the f-string,
then it will always be a multi-line f-string in which case the formatter
would prefer to break expressions i.e., introduce newlines. For example,

```python
x = f"{ # comment
    a }"
```

### Quotes

The logic for formatting quotes remains unchanged. The existing logic is
used to determine the necessary quote char and is used accordingly.

Now, if the expression inside an f-string is itself a string like, then
we need to make sure to preserve the existing quote and not change it to
the preferred quote unless it's 3.12. For example,

```python
f"outer {'inner'} outer"

# For pre 3.12, preserve the single quote
f"outer {'inner'} outer"

# While for 3.12 and later, the quotes can be changed
f"outer {"inner"} outer"
```

But, for triple-quoted strings, we can re-use the same quote char unless
the inner string is itself a triple-quoted string.

```python
f"""outer {"inner"} outer"""  # valid
f"""outer {'''inner'''} outer"""  # preserve the single quote char for the inner string
```

### Debug expressions

If debug expressions are present in the replacement field of a f-string,
then the whitespace needs to be preserved as they will be rendered as it
is (for example, `f"{ x = }"`. If there are any nested f-strings, then
the whitespace in them needs to be preserved as well which means that
we'll stop formatting the f-string as soon as we encounter a debug
expression.

```python
f"outer {   x =  !s  :.3f}"
#                  ^^
#                  We can remove these whitespaces
```

Now, the whitespace doesn't need to be preserved around conversion spec
and format specifiers, so we'll format them as usual but we won't be
formatting any nested f-string within the format specifier.

### Miscellaneous

- The
[`hug_parens_with_braces_and_square_brackets`](https://github.com/astral-sh/ruff/issues/8279)
preview style isn't implemented w.r.t. the f-string curly braces.
- The
[indentation](https://github.com/astral-sh/ruff/discussions/9785#discussioncomment-8470590)
is always relative to the f-string containing statement

## Test Plan

* Add new test cases
* Review existing snapshot changes
* Review the ecosystem changes

[PEP 701]: https://peps.python.org/pep-0701/
2024-02-16 20:28:11 +05:30
Micha Reiser
fe79798c12
split string module (#9987) 2024-02-14 18:54:55 +01:00
Dhruv Manilawala
6f9c128d77
Separate StringNormalizer from StringPart (#9954)
## Summary

This PR is a small refactor to extract out the logic for normalizing
string in the formatter from the `StringPart` struct. It also separates
the quote selection into a separate method on the new
`StringNormalizer`. Both of these will help in the f-string formatting
to use `StringPart` and `choose_quotes` irrespective of normalization.

The reason for having separate quote selection and normalization step is
so that the f-string formatting can perform quote selection on its own.

Unlike string and byte literals, the f-string formatting would require
that the normalization happens only for the literal elements of it i.e.,
the "foo" and "bar" in `f"foo {x + y} bar"`. This will automatically be
handled by the already separate `normalize_string` function.

Another use-case in the f-string formatting is to extract out the
relevant information from the `StringPart` like quotes and prefix which
is to be passed as context while formatting each element of an f-string.

## Test Plan

Ensure that clippy is happy and all tests pass.
2024-02-13 18:14:56 +05:30
Micha Reiser
edfe8421ec
Disable top-level docstring formatting for notebooks (#9957) 2024-02-12 18:14:02 +00:00
Micha Reiser
8657a392ff
Docstring formatting: Preserve tab indentation when using indent-style=tabs (#9915) 2024-02-12 16:09:13 +01:00
Micha Reiser
4946a1876f
Stabilize quote-style preserve (#9922) 2024-02-12 09:30:07 +00:00
Micha Reiser
341c2698a7
Run doctests as part of CI pipeline (#9939) 2024-02-12 10:18:58 +01:00
Micha Reiser
1ce07d65bd
Use usize instead of TextSize for indent_len (#9903) 2024-02-09 20:41:36 +00:00
Charlie Marsh
49fe1b85f2
Reduce size of Expr from 80 to 64 bytes (#9900)
## Summary

This PR reduces the size of `Expr` from 80 to 64 bytes, by reducing the
sizes of...

- `ExprCall` from 72 to 56 bytes, by using boxed slices for `Arguments`.
- `ExprCompare` from 64 to 48 bytes, by using boxed slices for its
various vectors.

In testing, the parser gets a bit faster, and the linter benchmarks
improve quite a bit.
2024-02-09 02:53:13 +00:00
Micha Reiser
fe7d965334
Reduce Result<Tok, LexicalError> size by using Box<str> instead of String (#9885) 2024-02-08 20:36:22 +00:00
Dhruv Manilawala
36b752876e
Implement AnyNode/AnyNodeRef for FStringFormatSpec (#9836)
## Summary

This PR adds the `AnyNode` and `AnyNodeRef` implementation for
`FStringFormatSpec` node which will be required in the f-string
formatting.

The main usage for this is so that we can pass in the node directly to
`suppressed_node` in case debug expression is used to format is as
verbatim text.
2024-02-05 19:23:43 +00:00
Shaygan Hooshyari
b47f85eb69
Preview Style: Format module level docstring (#9725)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-02-05 15:03:34 +00:00
Micha Reiser
80fc02e7d5
Don't trim last empty line in docstrings (#9813) 2024-02-05 13:29:24 +00:00
Micha Reiser
4f7fb566f0
Range formatting: Fix invalid syntax after parenthesizing expression (#9751) 2024-02-02 17:56:25 +01:00
Micha Reiser
ce14f4dea5
Range formatting API (#9635) 2024-01-31 11:13:37 +01:00
Dhruv Manilawala
541aef4e6c
Implement blank_line_after_nested_stub_class preview style (#9155)
## Summary

This PR implements the `blank_line_after_nested_stub_class` preview
style in the formatter.

The logic is divided into 3 parts:
1. In between preceding and following nodes at top level and nested
suite
2. When there's a trailing comment after the class
3. When there is no following node from (1) which is the case when it's
the last or the only node in a suite

We handle (3) with `FormatLeadingAlternateBranchComments`.

## Test Plan

- Add new test cases and update existing snapshots
- Checked the `typeshed` diff

fixes: #8891
2024-01-31 00:09:38 +05:30
Micha Reiser
3c7fea769c
Show source-type in formatter snapshot tests with options (#9699) 2024-01-30 10:08:50 +00:00
Micha Reiser
0045032905
Set source type: Stub for black tests with options (#9674) 2024-01-29 15:54:30 +01:00
Micha Reiser
91046e4c81
Preserve indent around multiline strings (#9637) 2024-01-26 08:18:30 +01:00
Charlie Marsh
b0d6fd7343
Generate custom JSON schema for dynamic setting (#9632)
## Summary

If you paste in the TOML for our default configuration (from the docs),
it's rejected by our JSON Schema:

![Screenshot 2024-01-23 at 10 08
09 PM](7b4ea6e8-07db-4590-bd1e-73a01a35d747)

It seems like the issue is with:

```toml
# Set the line length limit used when formatting code snippets in
# docstrings.
#
# This only has an effect when the `docstring-code-format` setting is
# enabled.
docstring-code-line-length = "dynamic"
```

Specifically, since that value uses a custom Serde implementation, I
guess Schemars bails out? This PR adds a custom representation to allow
`"dynamic"` (but no other strings):

![Screenshot 2024-01-23 at 10 27
21 PM](ab7809d4-b077-44e9-8f98-ed893aaefe5d)

This seems like it should work but I don't have a great way to test it.

Closes https://github.com/astral-sh/ruff/issues/9630.
2024-01-24 04:26:02 +00:00
Micha Reiser
395bf3dc98
Fix the input for black's line ranges test file (#9622) 2024-01-23 10:40:23 +00:00
Alex Waygood
a1e65a92bd
Move is_tuple_parenthesized from the formatter to ruff_python_ast (#9533)
This allows it to be used in the linter as well as the formatter. It
will be useful in #9474
2024-01-15 16:10:40 +00:00
Jane Lewis
7504bf347b
--show-settings displays active settings in a far more readable format (#9464)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Fixes #8334.

`Display` has been implemented for `ruff_workspace::Settings`, which
gives a much nicer and more readable output to `--show-settings`.

Internally, a `display_settings` utility macro has been implemented to
reduce the boilerplate of the display code.

### Work to be done

- [x] A lot of formatting for `Vec<_>` and `HashSet<_>` types have been
stubbed out, using `Debug` as a fallback. There should be a way to add
generic formatting support for these types as a modifier in
`display_settings`.
- [x] Several complex types were also stubbed out and need proper
`Display` implementations rather than falling back on `Debug`.
- [x] An open question needs to be answered: how important is it that
the output be valid TOML? Some types in settings, such as a hash-map
from a glob pattern to a multi-variant enum, will be hard to rework into
valid _and_ readable TOML.
- [x] Tests need to be implemented.

## Test Plan

Tests consist of a snapshot test for the default `--show-settings`
output and a doctest for `display_settings!`.
2024-01-12 14:30:29 -05:00
Micha Reiser
f192c72596
Remove type parameter from parse_* methods (#9466) 2024-01-11 19:41:19 +01:00
Micha Reiser
501bc1c270
Avoid allocating during implicit concatenated string formatting (#9469) 2024-01-11 15:58:49 +01:00
Micha Reiser
79f4abbb8d
Import Black's type aliases test (#9456) 2024-01-10 12:37:17 +00:00
Micha Reiser
58fcd96ac1
Update Black Tests (#9455) 2024-01-10 12:09:34 +00:00
Micha Reiser
ac02d3aedd
Hug multiline-strings preview style (#9243) 2024-01-10 12:47:34 +01:00
Charlie Marsh
ba71772d93
Parenthesize breaking named expressions in match guards (#9396)
## Summary

This is an attempt to solve
https://github.com/astral-sh/ruff/issues/9394 by avoiding breaks in
named expressions when invalid.
2024-01-08 14:47:01 +00:00
Charlie Marsh
60ba7a7c0d
Allow # fmt: skip with interspersed same-line comments (#9395)
## Summary

This is similar to https://github.com/astral-sh/ruff/pull/8876, but more
limited in scope:

1. It only applies to `# fmt: skip` (like Black). Like `# isort: on`, `#
fmt: on` needs to be on its own line (still).
2. It only delimits on `#`, so you can do `# fmt: skip # noqa`, but not
`# fmt: skip - some other content` or `# fmt: skip; noqa`.

If we want to support the `;`-delimited version, we should revisit
later, since we don't support that in the linter (so `# fmt: skip; noqa`
wouldn't register a `noqa`).

Closes https://github.com/astral-sh/ruff/issues/8892.
2024-01-04 19:39:37 -05:00
Charlie Marsh
9073220887
Make all dependencies workspace dependencies (#9333)
## Summary

This PR modifies our `Cargo.toml` files to use workspace dependencies
for _all_ dependencies, rather than the status quo of sporadically
trying to use workspace dependencies for those dependencies that are
used across multiple crates. I find the current situation more confusing
and harder to manage, since we have a mix of workspace and crate-local
dependencies, whereas this setup consistently uses the same approach for
all dependencies.
2024-01-02 13:41:59 +00:00
Dimitri Papadopoulos Orfanos
d04d49cc0e
Fix typos found by codespell (#9346)
## Summary

Fix typos found by
[codespell](https://github.com/codespell-project/codespell).

## Test Plan

CI tests.
2024-01-02 02:08:15 +00:00
Charlie Marsh
48e04cc2c8
Add row and column numbers to formatted parse errors (#9321)
## Summary

We now render parse errors in the formatter identically to those in the
linter, e.g.:

```
❯ cargo run -p ruff_cli -- format foo.py
error: Failed to parse foo.py:1:17: Unexpected token '='
```

Closes https://github.com/astral-sh/ruff/issues/8338.

Closes https://github.com/astral-sh/ruff/issues/9311.
2023-12-31 07:10:45 -05:00
Charlie Marsh
e80260a3c5
Remove source path from parser errors (#9322)
## Summary

I always found it odd that we had to pass this in, since it's really
higher-level context for the error. The awkwardness is further evidenced
by the fact that we pass in fake values everywhere (even outside of
tests). The source path isn't actually used to display the error; it's
only accessed elsewhere to _re-display_ the error in certain cases. This
PR modifies to instead pass the path directly in those cases.
2023-12-30 20:33:05 +00:00
Charlie Marsh
97e9d3c54f
Use Display for formatter parse errors (#9316)
## Summary

This helps a bit with (but does not close) the issues described in
https://github.com/astral-sh/ruff/issues/9311. E.g., now, we at least
see: `error: Failed to format main.py: source contains syntax errors:
invalid syntax. Got unexpected token '=' at byte offset 20`.
2023-12-29 22:26:57 +00:00
Micha Reiser
5d4825b60f
Normalise Hex and unicode escape sequences in string (#9280) 2023-12-28 09:06:58 +08:00
Micha Reiser
9cc257ee7d
Improve dummy_implementations preview style formatting (#9240) 2023-12-22 03:44:14 +00:00
Micha Reiser
a06723da2b
Parenthesize multi-context managers (#9222) 2023-12-22 03:41:03 +00:00
Micha Reiser
fa2c37b411
Parenthesize long type annotations in annotated assignments (#9210) 2023-12-22 03:33:47 +00:00
Micha Reiser
3cc719bd74
Use named preview test functions (#9239) 2023-12-22 00:23:04 +00:00
Micha Reiser
d835b28d01
Show preview changes for tests with options (#9223) 2023-12-21 23:36:19 +00:00
Micha Reiser
c6d8076034
Set target versions in Black tests (#9221) 2023-12-21 04:20:17 +00:00
Micha Reiser
8cb7950102
Add target_version to formatter options (#9220) 2023-12-21 04:05:58 +00:00
Micha Reiser
ef4bd8d5ff
Fix: Avoid parenthesizing subscript targets and values (#9209) 2023-12-20 23:42:25 +00:00
Dhruv Manilawala
09296e3e3c
Implement no_blank_line_before_class_docstring preview style (#9154)
## Summary

This PR implements the `no_blank_line_before_class_docstring` preview
style.

## Test Plan

Update existing snapshots.

### Formatter ecosystem

`main`

| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75804 | 1799 | 1648 |
| django | 0.99984 | 2772 | 34 |
| home-assistant | 0.99955 | 10596 | 213 |
| poetry | 0.99905 | 321 | 15 |
| transformers | 0.99967 | 2657 | 324 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99980 | 3669 | 18 |
| warehouse | 0.99976 | 654 | 14 |
| zulip | 0.99958 | 1459 | 36 |

`dhruv/no-blank-line-docstring`

| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75804 | 1799 | 1648 |
| django | 0.99984 | 2772 | 34 |
| home-assistant | 0.99955 | 10596 | 213 |
| poetry | 0.99905 | 321 | 15 |
| transformers | 0.99967 | 2657 | 324 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99980 | 3669 | 18 |
| warehouse | 0.99976 | 654 | 14 |
| zulip | 0.99958 | 1459 | 36 |

fixes: #8888
2023-12-19 00:43:20 -06:00
Micha Reiser
c8d6958d15
Add new with and match sequence test cases (#9128)
## Summary

Add new test cases for `with_item` and `match` sequence that demonstrate how long headers break. 

Removes one use of `optional_parentheses` in a position where it is know that the parentheses always need to be added.

## Test Plan

cargo test
2023-12-15 11:45:13 +09:00
Micha Reiser
25b2361411
Extend can_omit_optional_parentheses documentation (#9127)
## Summary

Add some more documentation to `can_omit_optional_parentheses` because it is realy hard to understand.
Restrict the `Attribute` and `None` `OperatorPrecedence` branches to ensure they only get applyied to the intended nodes.

## Test Plan

Ecosystem check reports no differences. The compatibility index remains unchanged.
2023-12-15 11:18:40 +09:00
Dhruv Manilawala
189e947808
Split string formatting to individual nodes (#9058)
This PR splits the string formatting code in the formatter to be handled
by the respective nodes.

Previously, the string formatting was done through a single
`FormatString` interface. Now, the nodes themselves are responsible for
formatting.

The following changes were made:
1. Remove `StringLayout::ImplicitStringConcatenationInBinaryLike` and
inline the call to `FormatStringContinuation`. After the refactor, the
binary like formatting would delegate to `FormatString` which would then
delegate to `FormatStringContinuation`. This removes the intermediary
steps.
2. Add formatter implementation for `FStringPart` which delegates it to
the respective string literal or f-string node.
3. Add `ExprStringLiteralKind` which is either `String` or `Docstring`.
If it's a docstring variant, then the string expression would not be
implicitly concatenated. This is guaranteed by the
`DocstringStmt::try_from_expression` constructor.
4. Add `StringLiteralKind` which is either a `String`, `Docstring` or
`InImplicitlyConcatenatedFString`. The last variant is for when the
string literal is implicitly concatenated with an f-string (`"foo" f"bar
{x}"`).
5. Remove `FormatString`.
6. Extract the f-string quote detection as a standalone function which
is public to the crate. This is used to detect the quote to be used for
an f-string at the expression level (`ExprFString` or
`FormatStringContinuation`).


### Formatter ecosystem result

**This PR**

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75804 | 1799 | 1648 |
| django | 0.99984 | 2772 | 34 |
| home-assistant | 0.99955 | 10596 | 214 |
| poetry | 0.99905 | 321 | 15 |
| transformers | 0.99967 | 2657 | 324 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99980 | 3669 | 18 |
| warehouse | 0.99976 | 654 | 14 |
| zulip | 0.99958 | 1459 | 36 |

**main**

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75804 | 1799 | 1648 |
| django | 0.99984 | 2772 | 34 |
| home-assistant | 0.99955 | 10596 | 214 |
| poetry | 0.99905 | 321 | 15 |
| transformers | 0.99967 | 2657 | 324 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99980 | 3669 | 18 |
| warehouse | 0.99976 | 654 | 14 |
| zulip | 0.99958 | 1459 | 36 |
2023-12-14 12:55:10 -06:00
Andrew Gallant
28b1aa201b
ruff_python_formatter: fix 'dynamic' mode with doctests (#9129)
This fixes a bug where the current indent level was not calculated
correctly for doctests. Namely, it didn't account for the extra indent
level (in terms of ASCII spaces) used by by the PS1 (`>>> `) and PS2
(`... `) prompts. As a result, lines could extend up to 4 spaces beyond
the configured line length limit.

We fix that by passing the `CodeExampleKind` to the `format` routine
instead of just the code itself. In this way, `format` can query whether
there will be any extra indent added _after_ formatting the code and
take that into account for its line length setting.

We add a few regression tests, taken directly from @stinodego's
examples.

Fixes #9126
2023-12-14 09:53:43 -05:00