## Summary
This PR is a spin-off from https://github.com/astral-sh/ruff/pull/19415.
It enables replacing the severity and lint name in a ty-style
diagnostic:
```
error[unused-import]: `os` imported but unused
```
with the noqa code and optional fix availability icon for a Ruff
diagnostic:
```
F401 [*] `os` imported but unused
F821 Undefined name `a`
```
or nothing at all for a Ruff syntax error:
```
SyntaxError: Expected one or more symbol names after import
```
Ruff adds the `SyntaxError` prefix to these messages manually.
Initially (d912458), I just passed a `hide_severity` flag through a
bunch of calls to get it into `annotate-snippets`, but after looking at
it again today, I think reusing the `None` severity/level gave a nicer
result. As I note in a lengthy code comment, I think all of this code
should be temporary and reverted when Ruff gets real severities, so
hopefully it's okay if it feels a little hacky.
I think the main visible downside of this approach is that we can't
style the asterisk in the fix availabilty icon in cyan, as in Ruff's
current output. It's part of the message in this PR and any styling gets
overwritten in `annotate-snippets`.
<img width="400" height="342" alt="image"
src="https://github.com/user-attachments/assets/57542ec9-a81c-4a01-91c7-bd6d7ec99f99"
/>
Hmm, I guess reusing `Level::None` also means the `F401` isn't red
anymore. Maybe my initial approach was better after all. In any case,
the rest of the PR should be basically the same, it just depends how we
want to toggle the severity.
## Test Plan
New `ruff_db` tests. These snapshots should be compared to the two tests
just above them (`hide_severity_output` vs `output` and
`hide_severity_syntax_errors` against `syntax_errors`).
Summary
--
This PR tweaks Ruff's internal usage of the new diagnostic model to more
closely
match the intended use, as I understand it. Specifically, it moves the
fix/help
suggestion from the primary annotation's message to a subdiagnostic. In
turn, it
adds the secondary/noqa code as the new primary annotation message. As
shown in
the new `ruff_db` tests, this more closely mirrors Ruff's current
diagnostic
output.
I also added `Severity::Help` to render the fix suggestion with a
`help:` prefix
instead of `info:`.
These changes don't have any external impact now but should help a bit
with #19415.
Test Plan
--
New full output format tests in `ruff_db`
Rendered Diagnostics
--
Full diagnostic output from `annotate-snippets` in this PR:
```
error[unused-import]: `os` imported but unused
--> fib.py:1:8
|
1 | import os
| ^^
|
help: Remove unused import: `os`
```
Current Ruff output for the same code:
```
fib.py:1:8: F401 [*] `os` imported but unused
|
1 | import os
| ^^ F401
|
= help: Remove unused import: `os`
```
Proposed final output after #19415:
```
F401 [*] `os` imported but unused
--> fib.py:1:8
|
1 | import os
| ^^
|
help: Remove unused import: `os`
```
These are slightly updated from
https://github.com/astral-sh/ruff/pull/19464#issuecomment-3097377634
below to remove the extra noqa codes in the primary annotation messages
for the first and third cases.
## Summary
This PR is a collaboration with @AlexWaygood from our pairing session
last Friday.
The main goal here is removing `ruff_linter::message::OldDiagnostic` in
favor of
using `ruff_db::diagnostic::Diagnostic` directly. This involved a few
major steps:
- Transferring the fields
- Transferring the methods and trait implementations, where possible
- Converting some constructor methods to free functions
- Moving the `SecondaryCode` struct
- Updating the method names
I'm hoping that some of the methods, especially those in the
`expect_ruff_*`
family, won't be necessary long-term, but I avoided trying to replace
them
entirely for now to keep the already-large diff a bit smaller.
### Related refactors
Alex and I noticed a few refactoring opportunities while looking at the
code,
specifically the very similar implementations for
`create_parse_diagnostic`,
`create_unsupported_syntax_diagnostic`, and
`create_semantic_syntax_diagnostic`.
We combined these into a single generic function, which I then copied
into
`ruff_linter::message` with some small changes and a TODO to combine
them in the
future.
I also deleted the `DisplayParseErrorType` and `TruncateAtNewline` types
for
reporting parse errors. These were added in #4124, I believe to work
around the
error messages from LALRPOP. Removing these didn't affect any tests, so
I think
they were unnecessary now that we fully control the error messages from
the
parser.
On a more minor note, I factored out some calls to the
`OldDiagnostic::filename`
(now `Diagnostic::expect_ruff_filename`) function to avoid repeatedly
allocating
`String`s in some places.
### Snapshot changes
The `show_statistics_syntax_errors` integration test changed because the
`OldDiagnostic::name` method used `syntax-error` instead of
`invalid-syntax`
like in ty. I think this (`--statistics`) is one of the only places we
actually
use this name for syntax errors, so I hope this is okay. An alternative
is to
use `syntax-error` in ty too.
The other snapshot changes are from removing this code, as discussed on
[Discord](1388252408):
34052a1185/crates/ruff_linter/src/message/mod.rs (L128-L135)
I think both of these are technically breaking changes, but they only
affect
syntax errors and are very narrow in scope, while also pretty
substantially
simplifying the refactor, so I hope they're okay to include in a patch
release.
## Test plan
Existing tests, with the adjustments mentioned above
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
I think this should be the last step before combining `OldDiagnostic`
and `ruff_db::Diagnostic`. We can't store a `NoqaCode` on
`ruff_db::Diagnostic`, so I converted the `noqa_code` field to an
`Option<String>` and then propagated this change to all of the callers.
I tried to use `&str` everywhere it was possible, so I think the
remaining `to_string` calls are necessary. I spent some time trying to
convert _everything_ to `&str` but ran into lifetime issues, especially
in the `FixTable`. Maybe we can take another look at that if it causes a
performance regression, but hopefully these paths aren't too hot. We
also avoid some `to_string` calls, so it might even out a bit too.
## Test Plan
Existing tests
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Summary
--
This PR unifies the remaining differences between `OldDiagnostic` and
`Message` (`OldDiagnostic` was only missing an optional `noqa_offset`
field) and
replaces `Message` with `OldDiagnostic`.
The biggest functional difference is that the combined `OldDiagnostic`
kind no
longer implements `AsRule` for an infallible conversion to `Rule`. This
was
pretty easy to work around with `is_some_and` and `is_none_or` in the
few places
it was needed. In `LintContext::report_diagnostic_if_enabled` we can
just use
the new `Violation::rule` method, which takes care of most cases.
Most of the interesting changes are in [this
range](8156992540)
before I started renaming.
Test Plan
--
Existing tests
Future Work
--
I think it's time to start shifting some of these fields to the new
`Diagnostic`
kind. I believe we want `Fix` for sure, but I'm less sure about the
others. We
may want to keep a thin wrapper type here anyway to implement a `rule`
method,
so we could leave some of these fields on that too.
## Summary
As the title says, this PR removes the `Message::to_rule` method by
replacing related uses of `Rule` with `NoqaCode` (or the rule's name in
the case of the cache). Where it seemed a `Rule` was really needed, we
convert back to the `Rule` by parsing either the rule name (with
`str::parse`) or the `NoqaCode` (with `Rule::from_code`).
I thought this was kind of like cheating and that it might not resolve
this part of Micha's
[comment](https://github.com/astral-sh/ruff/pull/18391#issuecomment-2933764275):
> because we can't add Rule to Diagnostic or **have it anywhere in our
shared rendering logic**
but after looking again, the only remaining `Rule` conversion in
rendering code is for the SARIF output format. The other two non-test
`Rule` conversions are for caching and writing a fix summary, which I
don't think fall into the shared rendering logic. That leaves the SARIF
format as the only real problem, but maybe we can delay that for now.
The motivation here is that we won't be able to store a `Rule` on the
new `Diagnostic` type, but we should be able to store a `NoqaCode`,
likely as a string.
## Test Plan
Existing tests
##
[Benchmarks](https://codspeed.io/astral-sh/ruff/branches/brent%2Fremove-to-rule)
Almost no perf regression, only -1% on
`linter/default-rules[large/dataset.py]`.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This is a spin-off from
https://github.com/astral-sh/ruff/pull/18447#discussion_r2125844669 to
avoid using `Message::noqa_code` to differentiate between lints and
syntax errors. I went through all of the calls on `main` and on the
branch from #18447, and the instance in `ruff_server` noted in the
linked comment was actually the primary place where this was being done.
Other calls to `noqa_code` are typically some variation of
`message.noqa_code().map_or(String::new, format!(...))`, with the major
exception of the gitlab output format:
a120610b5b/crates/ruff_linter/src/message/gitlab.rs (L93-L105)
which obviously assumes that `None` means syntax error. A simple fix
here would be to use `message.name()` for `check_name` instead of the
noqa code, but I'm not sure how breaking that would be. This could just
be:
```rust
let description = message.body();
let description = description.strip_prefix("SyntaxError: ").unwrap_or(description).to_string();
let check_name = message.name();
```
In that case. This sounds reasonable based on the [Code Quality report
format](https://docs.gitlab.com/ci/testing/code_quality/#code-quality-report-format)
docs:
> | Name | Type | Description|
> |-----|-----|----|
> |`check_name` | String | A unique name representing the check, or
rule, associated with this violation. |
## Test Plan
Existing tests
## Summary
This PR unifies the ruff `Message` enum variants for syntax errors and
rule violations into a single `Message` struct consisting of a shared
`db::Diagnostic` and some additional, optional fields used for some rule
violations.
This version of `Message` is nearly a drop-in replacement for
`ruff_diagnostics::Diagnostic`, which is the next step I have in mind
for the refactor.
I think this is also a useful checkpoint because we could possibly add
some of these optional fields to the new `Diagnostic` type. I think
we've previously discussed wanting support for `Fix`es, but the other
fields seem less relevant, so we may just need to preserve the `Message`
wrapper for a bit longer.
## Test plan
Existing tests
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This PR deletes the `DiagnosticKind` type by inlining its three fields
(`name`, `body`, and `suggestion`) into three other diagnostic types:
`Diagnostic`, `DiagnosticMessage`, and `CacheMessage`.
Instead of deferring to an internal `DiagnosticKind`, both `Diagnostic`
and `DiagnosticMessage` now have their own macro-generated `AsRule`
implementations.
This should make both https://github.com/astral-sh/ruff/pull/18051 and
another follow-up PR changing the type of `name` on `CacheMessage`
easier since its type will be able to change separately from
`Diagnostic` and `DiagnosticMessage`.
## Test Plan
Existing tests
## Summary
This PR is a first step toward integration of the new `Diagnostic` type
into ruff. There are two main changes:
- A new `UnifiedFile` enum wrapping `File` for red-knot and a
`SourceFile` for ruff
- ruff's `Message::SyntaxError` variant is now a `Diagnostic` instead of
a `SyntaxErrorMessage`
The second of these changes was mostly just a proof of concept for the
first, and it went pretty smoothly. Converting `DiagnosticMessage`s will
be most of the work in replacing `Message` entirely.
## Test Plan
Existing tests, which show no changes.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This PR partially addresses #16418 via the following:
- `LinterSettings::unresolved_python_version` is now a `TargetVersion`,
which is a thin wrapper around an `Option<PythonVersion>`
- `Checker::target_version` now calls `TargetVersion::linter_version`
internally, which in turn uses `unwrap_or_default` to preserve the
current default behavior
- Calls to the parser now call `TargetVersion::parser_version`, which
calls `unwrap_or_else(PythonVersion::latest)`
- The `Checker`'s implementation of
`SemanticSyntaxContext::python_version` also uses
`TargetVersion::parser_version` to use `PythonVersion::latest` for
semantic errors
In short, all lint rule behavior should be unchanged, but we default to
the latest Python version for the new syntax errors, which should
minimize confusing version-related syntax errors for users without a
version configured.
## Test Plan
Existing tests, which showed no changes (except for printing default
settings).
Summary
--
This PR updates `check_path` in the `ruff_linter` crate to return a
`Vec<Message>` instead of a `Vec<Diagnostic>`. The main motivation for
this is to make it easier to convert semantic syntax errors directly
into `Message`s rather than `Diagnostic`s in #16106. However, this also
has the benefit of keeping the preview check on unsupported syntax
errors in `check_path`, as suggested in
https://github.com/astral-sh/ruff/pull/16429#discussion_r1974748024.
All of the interesting changes are in the first commit. The second
commit just renames variables like `diagnostics` to `messages`, and the
third commit is a tiny import fix.
I also updated the `ExpandedMessage::location` field name, which caused
a few extra commits tidying up the playground code. I thought it was
nicely symmetric with `end_location`, but I'm happy to revert that too.
Test Plan
--
Existing tests. I also tested the playground and server manually.
## Summary
This PR builds on the changes in #16220 to pass a target Python version
to the parser. It also adds the `Parser::unsupported_syntax_errors` field, which
collects version-related syntax errors while parsing. These syntax
errors are then turned into `Message`s in ruff (in preview mode).
This PR only detects one syntax error (`match` statement before Python
3.10), but it has been pretty quick to extend to several other simple
errors (see #16308 for example).
## Test Plan
The current tests are CLI tests in the linter crate, but these could be
supplemented with inline parser tests after #16357.
I also tested the display of these syntax errors in VS Code:


---------
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
## Summary
This PR is another step in preparing to detect syntax errors in the
parser. It introduces the new `per-file-target-version` top-level
configuration option, which holds a mapping of compiled glob patterns to
Python versions. I intend to use the
`LinterSettings::resolve_target_version` method here to pass to the
parser:
f50849aeef/crates/ruff_linter/src/linter.rs (L491-L493)
## Test Plan
I added two new CLI tests to show that the `per-file-target-version` is
respected in both the formatter and the linter.
## Summary
This is part of the preparation for detecting syntax errors in the
parser from https://github.com/astral-sh/ruff/pull/16090/. As suggested
in [this
comment](https://github.com/astral-sh/ruff/pull/16090/#discussion_r1953084509),
I started working on a `ParseOptions` struct that could be stored in the
parser. For this initial refactor, I only made it hold the existing
`Mode` option, but for syntax errors, we will also need it to have a
`PythonVersion`. For that use case, I'm picturing something like a
`ParseOptions::with_python_version` method, so you can extend the
current calls to something like
```rust
ParseOptions::from(mode).with_python_version(settings.target_version)
```
But I thought it was worth adding `ParseOptions` alone without changing
any other behavior first.
Most of the diff is just updating call sites taking `Mode` to take
`ParseOptions::from(Mode)` or those taking `PySourceType`s to take
`ParseOptions::from(PySourceType)`. The interesting changes are in the
new `parser/options.rs` file and smaller parts of `parser/mod.rs` and
`ruff_python_parser/src/lib.rs`.
## Test Plan
Existing tests, this should not change any behavior.
## Summary
This PR updates the formatter and linter to use the `PythonVersion`
struct from the `ruff_python_ast` crate internally. While this doesn't
remove the need for the `linter::PythonVersion` enum, it does remove the
`formatter::PythonVersion` enum and limits the use in the linter to
deserializing from CLI arguments and config files and moves most of the
remaining methods to the `ast::PythonVersion` struct.
## Test Plan
Existing tests, with some inputs and outputs updated to reflect the new
(de)serialization format. I think these are test-specific and shouldn't
affect any external (de)serialization.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This PR removes a trailing semicolon after an interface definition in
the custom TypeScript section of `ruff_wasm`. Currently, this semicolon
triggers the error "TS1036: Statements are not allowed in ambient
contexts" when including the file and compiling with e.g `tsc`.
## Test Plan
I made the change, ran `wasm-pack` and copied the generated directory
manually to my `node_modules` folder. I then compiled a file importing
`@astral-sh/ruff-wasm-web` again and confirmed that the compilation
error was gone.
## Summary
This PR updates the playground to show syntax errors.
(I forgot to update this and noticed it this morning.)
## Test Plan
Build the playground locally and preview it:
<img width="764" alt="Screenshot 2024-06-28 at 11 03 35"
src="1fd48d6c-ae41-4672-bf3c-32a61d9946ef">
## Summary
Follow-up to #11902
This PR simplifies the `LinterResult` struct by avoiding the generic and
not store the `ParseError`.
This is possible because the callers already have access to the
`ParseError` via the `Parsed` output. This also means that we can
simplify the return type of `check_path` and avoid the generic `T` on
`LinterResult`.
## Test Plan
`cargo insta test`
## Summary
I look at the token stream a lot, not specifically in the playground but
in the terminal output and it's annoying to scroll a lot to find
specific location. Most of the information is also redundant.
The final format we end up with is: `<kind> <range> (flags = ...)` e.g.,
`String 0..4 (flags = BYTE_STRING)` where the flags part is only
populated if there are any flags set.
## Summary
This PR updates the parser to remove building the `CommentRanges` and
instead it'll be built by the linter and the formatter when it's
required.
For the linter, it'll be built and owned by the `Indexer` while for the
formatter it'll be built from the `Tokens` struct and passed as an
argument.
## Test Plan
`cargo insta test`
## Summary
This PR updates the entire parser stack in multiple ways:
### Make the lexer lazy
* https://github.com/astral-sh/ruff/pull/11244
* https://github.com/astral-sh/ruff/pull/11473
Previously, Ruff's lexer would act as an iterator. The parser would
collect all the tokens in a vector first and then process the tokens to
create the syntax tree.
The first task in this project is to update the entire parsing flow to
make the lexer lazy. This includes the `Lexer`, `TokenSource`, and
`Parser`. For context, the `TokenSource` is a wrapper around the `Lexer`
to filter out the trivia tokens[^1]. Now, the parser will ask the token
source to get the next token and only then the lexer will continue and
emit the token. This means that the lexer needs to be aware of the
"current" token. When the `next_token` is called, the current token will
be updated with the newly lexed token.
The main motivation to make the lexer lazy is to allow re-lexing a token
in a different context. This is going to be really useful to make the
parser error resilience. For example, currently the emitted tokens
remains the same even if the parser can recover from an unclosed
parenthesis. This is important because the lexer emits a
`NonLogicalNewline` in parenthesized context while a normal `Newline` in
non-parenthesized context. This different kinds of newline is also used
to emit the indentation tokens which is important for the parser as it's
used to determine the start and end of a block.
Additionally, this allows us to implement the following functionalities:
1. Checkpoint - rewind infrastructure: The idea here is to create a
checkpoint and continue lexing. At a later point, this checkpoint can be
used to rewind the lexer back to the provided checkpoint.
2. Remove the `SoftKeywordTransformer` and instead use lookahead or
speculative parsing to determine whether a soft keyword is a keyword or
an identifier
3. Remove the `Tok` enum. The `Tok` enum represents the tokens emitted
by the lexer but it contains owned data which makes it expensive to
clone. The new `TokenKind` enum just represents the type of token which
is very cheap.
This brings up a question as to how will the parser get the owned value
which was stored on `Tok`. This will be solved by introducing a new
`TokenValue` enum which only contains a subset of token kinds which has
the owned value. This is stored on the lexer and is requested by the
parser when it wants to process the data. For example:
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L1260-L1262)
[^1]: Trivia tokens are `NonLogicalNewline` and `Comment`
### Remove `SoftKeywordTransformer`
* https://github.com/astral-sh/ruff/pull/11441
* https://github.com/astral-sh/ruff/pull/11459
* https://github.com/astral-sh/ruff/pull/11442
* https://github.com/astral-sh/ruff/pull/11443
* https://github.com/astral-sh/ruff/pull/11474
For context,
https://github.com/RustPython/RustPython/pull/4519/files#diff-5de40045e78e794aa5ab0b8aacf531aa477daf826d31ca129467703855408220
added support for soft keywords in the parser which uses infinite
lookahead to classify a soft keyword as a keyword or an identifier. This
is a brilliant idea as it basically wraps the existing Lexer and works
on top of it which means that the logic for lexing and re-lexing a soft
keyword remains separate. The change here is to remove
`SoftKeywordTransformer` and let the parser determine this based on
context, lookahead and speculative parsing.
* **Context:** The transformer needs to know the position of the lexer
between it being at a statement position or a simple statement position.
This is because a `match` token starts a compound statement while a
`type` token starts a simple statement. **The parser already knows
this.**
* **Lookahead:** Now that the parser knows the context it can perform
lookahead of up to two tokens to classify the soft keyword. The logic
for this is mentioned in the PR implementing it for `type` and `match
soft keyword.
* **Speculative parsing:** This is where the checkpoint - rewind
infrastructure helps. For `match` soft keyword, there are certain cases
for which we can't classify based on lookahead. The idea here is to
create a checkpoint and keep parsing. Based on whether the parsing was
successful and what tokens are ahead we can classify the remaining
cases. Refer to #11443 for more details.
If the soft keyword is being parsed in an identifier context, it'll be
converted to an identifier and the emitted token will be updated as
well. Refer
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L487-L491).
The `case` soft keyword doesn't require any special handling because
it'll be a keyword only in the context of a match statement.
### Update the parser API
* https://github.com/astral-sh/ruff/pull/11494
* https://github.com/astral-sh/ruff/pull/11505
Now that the lexer is in sync with the parser, and the parser helps to
determine whether a soft keyword is a keyword or an identifier, the
lexer cannot be used on its own. The reason being that it's not
sensitive to the context (which is correct). This means that the parser
API needs to be updated to not allow any access to the lexer.
Previously, there were multiple ways to parse the source code:
1. Passing the source code itself
2. Or, passing the tokens
Now that the lexer and parser are working together, the API
corresponding to (2) cannot exists. The final API is mentioned in this
PR description: https://github.com/astral-sh/ruff/pull/11494.
### Refactor the downstream tools (linter and formatter)
* https://github.com/astral-sh/ruff/pull/11511
* https://github.com/astral-sh/ruff/pull/11515
* https://github.com/astral-sh/ruff/pull/11529
* https://github.com/astral-sh/ruff/pull/11562
* https://github.com/astral-sh/ruff/pull/11592
And, the final set of changes involves updating all references of the
lexer and `Tok` enum. This was done in two-parts:
1. Update all the references in a way that doesn't require any changes
from this PR i.e., it can be done independently
* https://github.com/astral-sh/ruff/pull/11402
* https://github.com/astral-sh/ruff/pull/11406
* https://github.com/astral-sh/ruff/pull/11418
* https://github.com/astral-sh/ruff/pull/11419
* https://github.com/astral-sh/ruff/pull/11420
* https://github.com/astral-sh/ruff/pull/11424
2. Update all the remaining references to use the changes made in this
PR
For (2), there were various strategies used:
1. Introduce a new `Tokens` struct which wraps the token vector and add
methods to query a certain subset of tokens. These includes:
1. `up_to_first_unknown` which replaces the `tokenize` function
2. `in_range` and `after` which replaces the `lex_starts_at` function
where the former returns the tokens within the given range while the
latter returns all the tokens after the given offset
2. Introduce a new `TokenFlags` which is a set of flags to query certain
information from a token. Currently, this information is only limited to
any string type token but can be expanded to include other information
in the future as needed. https://github.com/astral-sh/ruff/pull/11578
3. Move the `CommentRanges` to the parsed output because this
information is common to both the linter and the formatter. This removes
the need for `tokens_and_ranges` function.
## Test Plan
- [x] Update and verify the test snapshots
- [x] Make sure the entire test suite is passing
- [x] Make sure there are no changes in the ecosystem checks
- [x] Run the fuzzer on the parser
- [x] Run this change on dozens of open-source projects
### Running this change on dozens of open-source projects
Refer to the PR description to get the list of open source projects used
for testing.
Now, the following tests were done between `main` and this branch:
1. Compare the output of `--select=E999` (syntax errors)
2. Compare the output of default rule selection
3. Compare the output of `--select=ALL`
**Conclusion: all output were same**
## What's next?
The next step is to introduce re-lexing logic and update the parser to
feed the recovery information to the lexer so that it can emit the
correct token. This moves us one step closer to having error resilience
in the parser and provides Ruff the possibility to lint even if the
source code contains syntax errors.
## Summary
Alternative to #11237
This PR adds a new `Tokens` struct which is a newtype wrapper around a
vector of lexer output. This allows us to add a `kinds` method which
returns an iterator over the corresponding `TokenKind`. This iterator is
implemented as a separate `TokenKindIter` struct to allow using the type
and provide additional methods like `peek` directly on the iterator.
This exposes the linter to access the stream of `TokenKind` instead of
`Tok`.
Edit: I've made the necessary downstream changes and plan to merge the
entire stack at once.
## Summary
This PR removes the `ImportMap` implementation and all its routing
through ruff.
The import map was added in https://github.com/astral-sh/ruff/pull/3243
but we then never ended up using it to do cross file analysis.
We are now working on adding multifile analysis to ruff, and revisit
import resolution as part of it.
```
hyperfine --warmup 10 --runs 20 --setup "./target/release/ruff clean" \
"./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I" \
"./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I"
Benchmark 1: ./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
Time (mean ± σ): 37.6 ms ± 0.9 ms [User: 52.2 ms, System: 63.7 ms]
Range (min … max): 35.8 ms … 39.8 ms 20 runs
Benchmark 2: ./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
Time (mean ± σ): 36.0 ms ± 0.7 ms [User: 50.3 ms, System: 58.4 ms]
Range (min … max): 34.5 ms … 37.6 ms 20 runs
Summary
./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I ran
1.04 ± 0.03 times faster than ./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
```
I suspect that the performance improvement should even be more
significant for users that otherwise don't have any diagnostics.
```
hyperfine --warmup 10 --runs 20 --setup "cd ../ecosystem/airflow && ../../ruff/target/release/ruff clean" \
"./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I" \
"./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I"
Benchmark 1: ./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I
Time (mean ± σ): 53.7 ms ± 1.8 ms [User: 68.4 ms, System: 63.0 ms]
Range (min … max): 51.1 ms … 58.7 ms 20 runs
Benchmark 2: ./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I
Time (mean ± σ): 50.8 ms ± 1.4 ms [User: 50.7 ms, System: 60.9 ms]
Range (min … max): 48.5 ms … 55.3 ms 20 runs
Summary
./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I ran
1.06 ± 0.05 times faster than ./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I
```
## Test Plan
`cargo test`
## Summary
This PR updates the playground to display the AST even if it contains a
syntax error. This could be useful for development and also to give a
quick preview of what error recovery looks like.
Note that not all recovery is correct but this allows us to iterate
quickly on what can be improved.
## Test Plan
Build the playground locally and test it.
<img width="1688" alt="Screenshot 2024-04-25 at 21 02 22"
src="2b94934c-4f2c-4a9a-9693-3d8460ed9d0b">
Fixes#8368
Fixes https://github.com/astral-sh/ruff/issues/9186
## Summary
Arbitrary TOML strings can be provided via the command-line to override
configuration options in `pyproject.toml` or `ruff.toml`. As an example:
to run over typeshed and respect typeshed's `pyproject.toml`, but
override a specific isort setting and enable an additional pep8-naming
setting:
```
cargo run -- check ../typeshed --no-cache --config ../typeshed/pyproject.toml --config "lint.isort.combine-as-imports=false" --config "lint.extend-select=['N801']"
```
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Zanie Blue <contact@zanie.dev>
## Summary
Adds an additional warning macro (we should consolidate these later)
that shows a warning once based on the content of the warning itself.
This is less efficient than `warn_user_once!` and `warn_user_by_id!`,
but this is so expensive that it doesn't matter at all.
Applies this macro to the various warnings for the v0.2.0 release, and
also includes the filename in said warnings, so the FastAPI case is now:
```text
warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in /Users/crmarsh/workspace/fastapi/pyproject.toml:
- 'ignore' -> 'lint.ignore'
- 'select' -> 'lint.select'
- 'isort' -> 'lint.isort'
- 'pyupgrade' -> 'lint.pyupgrade'
- 'per-file-ignores' -> 'lint.per-file-ignores'
```
---------
Co-authored-by: Zanie <contact@zanie.dev>
## Summary
I always found it odd that we had to pass this in, since it's really
higher-level context for the error. The awkwardness is further evidenced
by the fact that we pass in fake values everywhere (even outside of
tests). The source path isn't actually used to display the error; it's
only accessed elsewhere to _re-display_ the error in certain cases. This
PR modifies to instead pass the path directly in those cases.
## Summary
This PR adds some helper structs to the linter paths to enable passing
in the pre-computed tokens and parsed source code during benchmarking,
to remove lexing and parsing from the overall linter benchmark
measurement. We already remove parsing for the formatter, and we have
separate benchmarks for the lexer and the parser, so this should make it
much easier to measure linter performance changes.
## Summary
This PR renames the `tab-size` configuration option to `indent-width` to
express that the formatter uses the option to determine the indentation
width AND as tab width.
I first preferred naming the option `tab-width` but then decided to go
with `indent-width` because:
* It aligns with the `indent-style` option
* It would allow us to write a lint rule that asserts that each
indentation uses `indent-width` spaces.
Closes#7643
## Test Plan
Added integration test
See the provided breaking changes note for details.
Removes support for the deprecated `--format`option in the `ruff check`
CLI, `format` inference as `output-format` in the configuration file,
and the `RUFF_FORMAT` environment variable.
The error message for use of `format` in the configuration file could be
better, but would require some awkward serde wrappers and it seems hard
to present the correct schema to the user still.
Closes https://github.com/astral-sh/ruff/issues/7572
Drops formatting specific rules from the default rule set as they
conflict with formatters in general (and in particular, conflict with
our formatter). Most of these rules are in preview, but the removal of
`line-too-long` and `mixed-spaces-and-tabs` is a change to the stable
rule set.
## Example
The following no longer raises `E501`
```
echo "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx = 1" | ruff check -
```
## Summary
This PR adds a new `lint` section to the configuration that groups all linter-specific settings. The existing top-level configurations continue to work without any warning because the `lint.*` settings are experimental.
The configuration merges the top level and `lint.*` settings where the settings in `lint` have higher precedence (override the top-level settings). The reasoning behind this is that the settings in `lint.` are more specific and more specific settings should override less specific settings.
I decided against showing the new `lint.*` options on our website because it would make the page extremely long (it's technically easy to do, just attribute `lint` with `[option_group`]). We may want to explore adding an `alias` field to the `option` attribute and show the alias on the website along with its regular name.
## Test Plan
* I added new integration tests
* I verified that the generated `options.md` is identical
* Verified the default settings in the playground

I got confused and refactored a bit, now the naming should be more
consistent. This is the basis for the range formatting work.
Chages:
* `format_module` -> `format_module_source` (format a string)
* `format_node` -> `format_module_ast` (format a program parsed into an
AST)
* Added `parse_ok_tokens` that takes `Token` instead of `Result<Token>`
* Call the source code `source` consistently
* Added a `tokens_and_ranges` helper
* `python_ast` -> `module` (because that's the type)