(This is not possible to actually use until
https://github.com/astral-sh/ruff/pull/8854 is merged.)
ruff_python_formatter: add reStructuredText docstring formatting support
This commit makes use of the refactoring done in prior commits to slot
in reStructuredText support. Essentially, we add a new type of code
example and look for *both* literal blocks and code block directives.
Literal blocks are treated as Python by default because it seems to be a
[common
practice](https://github.com/adamchainz/blacken-docs/issues/195).
That is, literal blocks like this:
```
def example():
"""
Here's an example::
foo( 1 )
All done.
"""
pass
```
Will get reformatted. And code blocks (via reStructuredText directives)
will also get reformatted:
```
def example():
"""
Here's an example:
.. code-block:: python
foo( 1 )
All done.
"""
pass
```
When looking for a code block, it is possible for it to become invalid.
In which case, we back out of looking for a code example and print the
lines out as they are. As with doctest formatting, if reformatting the
code would result in invalid Python or if the code collected from the
block is invalid, then formatting is also skipped.
A number of tests have been added to check both the formatting and
resetting behavior. Mixed indentation is also tested a fair bit, since
one of my initial attempts at dealing with mixed indentation ended up
not working.
I recommend working through this PR commit-by-commit. There is in
particular a somewhat gnarly refactoring before reST support is added.
Closes#8859
## Summary
We should avoid inlining the ellipsis in:
```python
def h():
...
# bye
```
Just as we omit the ellipsis in:
```python
def h():
# bye
...
```
Closes https://github.com/astral-sh/ruff/issues/8905.
In the source of working on #8859, I made a number of smallish refactors
to how code snippet formatting works. Most or all of these were
motivated by writing in support for reStructuredText blocks. They have
some fundamentally different requirements than doctests, and there are a
lot more ways for reStructuredText blocks to become invalid.
(Commit-by-commit review is recommended as the commit messages provide
further context on each change. I split this off from ongoing work to
make review more manageable.)
## Summary
This PR adds opt-in support for formatting doctests in docstrings. This
reflects initial support and it is intended to add support for Markdown
and reStructuredText Python code blocks in the future. But I believe
this PR lays the groundwork, and future additions for Markdown and reST
should be less costly to add.
It's strongly recommended to review this PR commit-by-commit. The last
few commits in particular implement the bulk of the work here and
represent the denser portions.
Some things worth mentioning:
* The formatter is itself not perfect, and it is possible for it to
produce invalid Python code. Because of this, reformatted code snippets
are checked for Python validity. If they aren't valid, then we
(unfortunately silently) bail on formatting that code snippet.
* There are a couple places where it would be nice to at least warn the
user that doctest formatting failed, but it wasn't clear to me what the
best way to do that is.
* I haven't yet run this in anger on a real world code base. I think
that should happen before merging.
Closes#7146
## Test Plan
* [x] Pass the local test suite.
* [x] Scrutinize ecosystem changes.
* [x] Run this formatter on extant code and scrutinize the results.
(e.g., CPython, numpy.)
## Summary
This PR updates the string nodes (`ExprStringLiteral`,
`ExprBytesLiteral`, and `ExprFString`) to account for implicit string
concatenation.
### Motivation
In Python, implicit string concatenation are joined while parsing
because the interpreter doesn't require the information for each part.
While that's feasible for an interpreter, it falls short for a static
analysis tool where having such information is more useful. Currently,
various parts of the code uses the lexer to get the individual string
parts.
One of the main challenge this solves is that of string formatting.
Currently, the formatter relies on the lexer to get the individual
string parts, and formats them including the comments accordingly. But,
with PEP 701, f-string can also contain comments. Without this change,
it becomes very difficult to add support for f-string formatting.
### Implementation
The initial proposal was made in this discussion:
https://github.com/astral-sh/ruff/discussions/6183#discussioncomment-6591993.
There were various AST designs which were explored for this task which
are available in the linked internal document[^1].
The selected variant was the one where the nodes were kept as it is
except that the `implicit_concatenated` field was removed and instead a
new struct was added to the `Expr*` struct. This would be a private
struct would contain the actual implementation of how the AST is
designed for both single and implicitly concatenated strings.
This implementation is achieved through an enum with two variants:
`Single` and `Concatenated` to avoid allocating a vector even for single
strings. There are various public methods available on the value struct
to query certain information regarding the node.
The nodes are structured in the following way:
```
ExprStringLiteral - "foo" "bar"
|- StringLiteral - "foo"
|- StringLiteral - "bar"
ExprBytesLiteral - b"foo" b"bar"
|- BytesLiteral - b"foo"
|- BytesLiteral - b"bar"
ExprFString - "foo" f"bar {x}"
|- FStringPart::Literal - "foo"
|- FStringPart::FString - f"bar {x}"
|- StringLiteral - "bar "
|- FormattedValue - "x"
```
[^1]: Internal document:
https://www.notion.so/astral-sh/Implicit-String-Concatenation-e036345dc48943f89e416c087bf6f6d9?pvs=4
#### Visitor
The way the nodes are structured is that the entire string, including
all the parts that are implicitly concatenation, is a single node
containing individual nodes for the parts. The previous section has a
representation of that tree for all the string nodes. This means that
new visitor methods are added to visit the individual parts of string,
bytes, and f-strings for `Visitor`, `PreorderVisitor`, and
`Transformer`.
## Test Plan
- `cargo insta test --workspace --all-features --unreferenced reject`
- Verify that the ecosystem results are unchanged
Fix an instability where await was followed by a breaking fluent style
expression:
```python
test_data = await (
Stream.from_async(async_data)
.flat_map_async()
.map()
.filter_async(is_valid_data)
.to_list()
)
```
Note that this technically a minor style change (see ecosystem check)
## Summary
This PR implements validation in the formatter tests to ensure that we
don't modify the AST during formatting. Black has similar logic.
In implementing this, I learned that Black actually _does_ modify the
AST, and their test infrastructure normalizes the AST to wipe away those
differences. Specifically, Black changes the indentation of docstrings,
which _does_ modify the AST; and it also inserts parentheses in `del`
statements, which changes the AST too.
Ruff also does both these things, so we _also_ implement the same
normalization using a new visitor that allows for modifying the AST.
Closes https://github.com/astral-sh/ruff/issues/8184.
## Test Plan
`cargo test`
This ensures the python label is used for all python code blocks for
consistency.
## Test Plan
Visual inspection of all changes via git client ensuring no other
changes were made in error.
## Summary
This commit adds some additional error checking to the parser such that
assignments that are invalid syntax are rejected. This covers the
obvious cases like `5 = 3` and some not so obvious cases like `x + y =
42`.
This does add an additional recursive call to the parser for the cases
handling assignments. I had initially been concerned about doing this,
but `set_context` is already doing recursion during assignments, so I
didn't feel as though this was changing any fundamental performance
characteristics of the parser. (Also, in practice, I would expect any
such recursion here to be quite shallow since the recursion is done on
the target of an assignment. Such things are rarely nested much in
practice.)
Fixes#6895
## Test Plan
I've added unit tests covering every case that is detected as invalid on
an `Expr`.
## Summary
This PR fixes a bug in our formatter where a multiline lambda expression
statement was formatted over multiple lines without adding parentheses.
The PR "fixes" the problem by not splitting the lambda parameters if it
is not parenthesized
## Test Plan
Added test
**Summary** Previously, own line comment following after a docstring
followed by newline(s) before the first content statement were treated
as trailing on the docstring and we didn't insert a newline after the
docstring as black would.
Before:
```python
class ModuleBrowser:
"""Browse module classes and functions in IDLE."""
# This class is also the base class for pathbrowser.PathBrowser.
def __init__(self, master, path, *, _htest=False, _utest=False):
pass
```
After:
```python
class ModuleBrowser:
"""Browse module classes and functions in IDLE."""
# This class is also the base class for pathbrowser.PathBrowser.
def __init__(self, master, path, *, _htest=False, _utest=False):
pass
```
I'm not entirely happy about hijacking
`handle_own_line_comment_between_statements`, but i don't know a better
spot to put it.
Fixes#7948
**Test Plan** Fixtures
We previously incorrectly treated byte strings in docstring position as
docstrings because black does so
(https://github.com/astral-sh/ruff/pull/8283#discussion_r1375682931,
https://github.com/psf/black/issues/4002), even CPython doesn't
recognize them:
```console
$ python3.12
Python 3.12.0 (main, Oct 6 2023, 17:57:44) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> def f():
... b""" a"""
...
>>> print(str(f.__doc__))
None
```
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
Change
```python
"""Test docstring"""
a = 1
```
to
```python
"""Test docstring"""
a = 1
```
in preview style, but don't touch the docstring otherwise.
Do we want to ask black to also format the content of module level
docstrings? Seems inconsistent to me that we change function and class
docstring indentation/contents but not module docstrings.
Fixes https://github.com/astral-sh/ruff/issues/7995
**Summary** Prepare for the black preview style becoming the black
stable style at the end of the year.
This adds a new test file to compare stable and preview on some relevant
preview options in black, and makes `format_dev` understand the black
preview flag. I've added poetry as a project that uses preview.
I've implemented one specific deviation (collapsing of stub
implementation in non-stub files) which showed up in poetry for testing.
This also improves poetry compatibility from 0.99891 to 0.99919.
Fixes#7440
New compatibility stats:
| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 35 |
| home-assistant | 0.99953 | 10596 | 189 |
| poetry | 0.99919 | 317 | 12 |
| transformers | 0.99963 | 2657 | 332 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99969 | 654 | 15 |
| zulip | 0.99970 | 1459 | 22 |
## Summary
Given:
```python
# comment
class A:
def foo(self):
pass
```
We need to insert an additional newline between `# comment` and `class
A`. We were missing this handling for the case in which `# comment` is a
leading comment on `class A`, as opposed to a trailing comment of some
preceding statement.
In practice, I think this only applies to the specific case in which a
class or function is the first statement in a module, and there's a
single empty line between a leading comment and that class or function.
If there are no empty lines, then the comment "sticks" to the
definition; if there are two or more, then `leading_comments` will
truncate appropriately. If the class or function is nested, then we only
need one empty line anyway.
Closes https://github.com/astral-sh/ruff/issues/8215.
## Test Plan
No change in similarity.
Before:
| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |
After:
| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1648 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |
## Summary
This was just a bug in the parser ranges, probably since it was
initially implemented. Given `match n % 3, n % 5: ...`, the "subject"
(i.e., the tuple of two binary operators) was using the entire range of
the `match` statement.
Closes https://github.com/astral-sh/ruff/issues/8091.
## Test Plan
`cargo test`
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Fixes https://github.com/astral-sh/ruff/issues/7448
Fixes https://github.com/astral-sh/ruff/issues/7892
I've removed automatic dangling comment formatting, we're doing manual
dangling comment formatting everywhere anyway (the
assert-all-comments-formatted ensures this) and dangling comments would
break the formatting there.
## Test Plan
New test file.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Split out of #8044: In preview style, ellipsis are also collapsed in
non-stub files. This should only affect function/class contexts since
for other statements stub are generally not used. I've updated our tests
to use `pass` instead to reflect this, which makes tracking the preview
style changes much easier.
## Summary
Given an expression like `[x for (x) in y]`, we weren't skipping over
parentheses when searching for the `in` between `(x)` and `y`.
Closes https://github.com/astral-sh/ruff/issues/8053.
**Summary** Insert a newline after nested function and class
definitions, unless there is a trailing own line comment.
We need to e.g. format
```python
if platform.system() == "Linux":
if sys.version > (3, 10):
def f():
print("old")
else:
def f():
print("new")
f()
```
as
```python
if platform.system() == "Linux":
if sys.version > (3, 10):
def f():
print("old")
else:
def f():
print("new")
f()
```
even though `f()` is directly preceded by an if statement, not a
function or class definition. See the comments and fixtures for trailing
own line comment handling.
**Test Plan** I checked that the new content of `newlines.py` matches
black's formatting.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
**Summary** Handle comment before the default values of function
parameters correctly by inserting a line break instead of space after
the equals sign where required.
```python
def f(
a = # parameter trailing comment; needs line break
1,
b =
# default leading comment; needs line break
2,
c = ( # the default leading can only be end-of-line with parentheses; no line break
3
),
d = (
# own line leading comment with parentheses; no line break
4
)
)
```
Fixes#7603
**Test Plan** Added the different cases and one more complex case as
fixtures.
**Summary** Remove spaces from import statements such as
```python
import tqdm . tqdm
from tqdm . auto import tqdm
```
See also #7760 for a better solution.
**Test Plan** New fixtures
**Summary** Quoting of f-strings can change if they are triple quoted
and only contain single quotes inside.
Fixes#6841
**Test Plan** New fixtures
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
This PR fixes the bug where the formatter would panic if a class/function with
decorators had a suppression comment.
The fix is to use to correct start location to find the `async`/`def`/`class`
keyword when decorators are present which is the end of the last
decorator.
## Test Plan
Add test cases for the fix and update the snapshots.
## Summary
It turns out that _some_ identifiers can contain newlines --
specifically, dot-delimited import identifiers, like:
```python
import foo\
.bar
```
At present, we print all identifiers verbatim, which causes us to retain
the `\` in the formatted output. This also leads to violating some debug
assertions (see the linked issue, though that's a symptom of this
formatting failure).
This PR adds detection for import identifiers that contain newlines, and
formats them via `text` (slow) rather than `source_code_slice` (fast) in
those cases.
Closes https://github.com/astral-sh/ruff/issues/7734.
## Test Plan
`cargo test`
## Summary
At present, `quote-style` is used universally. However, [PEP
8](https://peps.python.org/pep-0008/) and [PEP
257](https://peps.python.org/pep-0257/) suggest that while either single
or double quotes are acceptable in general (as long as they're
consistent), docstrings and triple-quoted strings should always use
double quotes. In our research, the vast majority of Ruff users that
enable the `flake8-quotes` rules only enable them for inline strings
(i.e., non-triple-quoted strings).
Additionally, many Black forks (like Blue and Pyink) use double quotes
for docstrings and triple-quoted strings.
Our decision for now is to always prefer double quotes for triple-quoted
strings (which should include docstrings). Based on feedback, we may
consider adding additional options (e.g., a `"preserve"` mode, to avoid
changing quotes; or a `"multiline-quote-style"` to override this).
Closes https://github.com/astral-sh/ruff/issues/7615.
## Test Plan
`cargo test`