## Summary
This PR moves `Diagnostic`, `DiagnosticKind`, and `Fix` into their own crate, which will enable us to further split up Ruff, since sub-linter crates (which need to implement functions that return `Diagnostic`) can now depend on `ruff_diagnostics` rather than Ruff.
This PR introduces a new `CacheKey` trait for types that can be used as a cache key.
I'm not entirely sure if this is worth the "overhead", but I was surprised to find `HashableHashSet` and got scared when I looked at the time complexity of the `hash` function. These implementations must be extremely slow in hashed collections.
I then searched for usages and quickly realized that only the cache uses these `Hash` implementations, where performance is less sensitive.
This PR introduces a new `CacheKey` trait to communicate the difference between a hash and computing a key for the cache. The new trait can be implemented for types that don't implement `Hash` for performance reasons, and we can define additional constraints on the implementation: For example, we'll want to enforce portability when we add remote caching support. Using a different trait further allows us not to implement it for types without stable identities (e.g. pointers) or use other implementations than the standard hash function.
In ruff-lsp (https://github.com/charliermarsh/ruff-lsp/pull/76) we want to add a "Disable \<rule\> for this line" quickfix. However, finding the correct line into which the `noqa` comment should be inserted is non-trivial (multi-line strings for example).
Ruff already has this info, so expose it in the JSON output for use by ruff-lsp.
For example:
$ ruff check --select=EM<Tab>
EM -- flake8-errmsg
EM10 EM1 --
EM101 -- raw-string-in-exception
EM102 -- f-string-in-exception
EM103 -- dot-format-in-exception
(You will need to enable autocompletion as described
in the Autocompletion section in the README.)
Fixes#2808.
(The --help help change in the README is due to a clap bug,
for which I already submitted a fix:
https://github.com/clap-rs/clap/pull/4710.)
# Summary
This PR contains the code for the autoformatter proof-of-concept.
## Crate structure
The primary formatting hook is the `fmt` function in `crates/ruff_python_formatter/src/lib.rs`.
The current formatter approach is outlined in `crates/ruff_python_formatter/src/lib.rs`, and is structured as follows:
- Tokenize the code using the RustPython lexer.
- In `crates/ruff_python_formatter/src/trivia.rs`, extract a variety of trivia tokens from the token stream. These include comments, trailing commas, and empty lines.
- Generate the AST via the RustPython parser.
- In `crates/ruff_python_formatter/src/cst.rs`, convert the AST to a CST structure. As of now, the CST is nearly identical to the AST, except that every node gets a `trivia` vector. But we might want to modify it further.
- In `crates/ruff_python_formatter/src/attachment.rs`, attach each trivia token to the corresponding CST node. The logic for this is mostly in `decorate_trivia` and is ported almost directly from Prettier (given each token, find its preceding, following, and enclosing nodes, then attach the token to the appropriate node in a second pass).
- In `crates/ruff_python_formatter/src/newlines.rs`, normalize newlines to match Black’s preferences. This involves traversing the CST and inserting or removing `TriviaToken` values as we go.
- Call `format!` on the CST, which delegates to type-specific formatter implementations (e.g., `crates/ruff_python_formatter/src/format/stmt.rs` for `Stmt` nodes, and similar for `Expr` nodes; the others are trivial). Those type-specific implementations delegate to kind-specific functions (e.g., `format_func_def`).
## Testing and iteration
The formatter is being developed against the Black test suite, which was copied over in-full to `crates/ruff_python_formatter/resources/test/fixtures/black`.
The Black fixtures had to be modified to create `[insta](https://github.com/mitsuhiko/insta)`-compatible snapshots, which now exist in the repo.
My approach thus far has been to try and improve coverage by tackling fixtures one-by-one.
## What works, and what doesn’t
- *Most* nodes are supported at a basic level (though there are a few stragglers at time of writing, like `StmtKind::Try`).
- Newlines are properly preserved in most cases.
- Magic trailing commas are properly preserved in some (but not all) cases.
- Trivial leading and trailing standalone comments mostly work (although maybe not at the end of a file).
- Inline comments, and comments within expressions, often don’t work -- they work in a few cases, but it’s one-off right now. (We’re probably associating them with the “right” nodes more often than we are actually rendering them in the right place.)
- We don’t properly normalize string quotes. (At present, we just repeat any constants verbatim.)
- We’re mishandling a bunch of wrapping cases (if we treat Black as the reference implementation). Here are a few examples (demonstrating Black's stable behavior):
```py
# In some cases, if the end expression is "self-closing" (functions,
# lists, dictionaries, sets, subscript accesses, and any length-two
# boolean operations that end in these elments), Black
# will wrap like this...
if some_expression and f(
b,
c,
d,
):
pass
# ...whereas we do this:
if (
some_expression
and f(
b,
c,
d,
)
):
pass
# If function arguments can fit on a single line, then Black will
# format them like this, rather than exploding them vertically.
if f(
a, b, c, d, e, f, g, ...
):
pass
```
- We don’t properly preserve parentheses in all cases. Black preserves parentheses in some but not all cases.
Rule::noqa_code previously return a single &'static str,
which was possible because we had one enum listing all
rule code prefixes. This commit series will however split up
the RuleCodePrefix enum into several enums ... so we'll end up
with two &'static str ... this commit wraps the return type
of Rule::noqa_code into a newtype so that we can easily change
it to return two &'static str in the 6th commit of this series.
Post this commit series several codes can be mapped to a single rule,
this commit therefore renames Rule::code to Rule::noqa_code,
which is the code that --add-noqa will add to ignore a rule.
This is causing wheel creation to fail on some of our more exotic build targets: 4159524132.
Let's figure out how to gate appropriately, but for now, reverting to get the release out.