mirror of
https://github.com/astral-sh/ruff.git
synced 2025-08-03 02:12:22 +00:00
## Summary

This PR introduces a structured `DiagnosticId` instead of using a plain `&'static str`. It is the first of three in a stack that implements a basic rules infrastructure for Red Knot.

`DiagnosticId` is an enum over all known diagnostic codes. A closed enum reduces the risk of accidentally introducing two identical diagnostic codes. It also opens the possibility of generating reference documentation from the enum in the future (not part of this PR).

The enum isn't *fully closed* because it uses a `&'static str` for lint names. This is because we want the flexibility to define lints in different crates, and all names are only known in `red_knot_linter` or above. Still, lower-level crates must already reference the lint names to emit diagnostics.

We could define all lint-names in `DiagnosticId` but I decided against it because:

* We probably want to share the `DiagnosticId` type between Ruff and Red Knot to avoid extra complexity in the diagnostic crate, and both tools use different lint names.
* Lints require a lot of extra metadata beyond just the name. That's why I think defining them close to their implementation is important.

In the long term, we may also want to support plugins, which would make it impossible to know all lint names at compile time. The next PR in the stack introduces extra syntax for defining lints.

A closed enum does have a few disadvantages:

* rustc can't help us detect unused diagnostic codes because the enum is public
* Adding a new diagnostic in the workspace crate now requires changes to at least two crates: It requires changing the workspace crate to add the diagnostic and the `ruff_db` crate to define the diagnostic ID. I consider this an acceptable trade. We may want to move `DiagnosticId` to its own crate or into a shared `red_knot_diagnostic` crate.

## Preventing duplicate diagnostic identifiers

One goal of this PR is to make it harder to introduce ambiguous diagnostic IDs, which is achieved by defining a closed enum.
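
For illustration only, a mostly closed enum of this shape might look like the sketch below. The variant names (`Io`, `InvalidSyntax`, `RevealedType`) and the helper methods are assumptions made for this example, not the definitions from the PR; only the idea of a `&'static str` payload for lint names comes from the description above.

```rust
/// Hypothetical sketch of a mostly closed diagnostic identifier.
/// Variant names are illustrative assumptions, not the PR's actual list.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum DiagnosticId {
    /// Closed part: every variant is listed explicitly, so the compiler
    /// rejects two variants with the same name.
    Io,
    InvalidSyntax,
    RevealedType,
    /// Open part: lint names are defined in higher-level crates and are
    /// only carried here as a static string.
    Lint(&'static str),
}

impl DiagnosticId {
    /// Returns `true` if this identifier refers to a lint.
    pub fn is_lint(self) -> bool {
        matches!(self, DiagnosticId::Lint(_))
    }

    /// Returns the lint name, if this identifier refers to a lint.
    pub fn lint_name(self) -> Option<&'static str> {
        match self {
            DiagnosticId::Lint(name) => Some(name),
            _ => None,
        }
    }
}
```

With this shape, a lower-level crate can emit `DiagnosticId::Lint("division-by-zero")` without the enum having to know every lint name up front.
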
However, the enum isn't fully "closed" because it doesn't explicitly list the IDs for all lint rules. That leaves the possibility that a lint rule and a diagnostic ID share the same name. I made the names unambiguous in this PR by separating them into different namespaces, using `lint/<rule>` for lint rule codes.

I don't mind the `lint` prefix in a *Ruff next* context, but it is a bit weird for a standalone type checker. I'd like to not overfocus on this for now because I see a few different options:

* We remove the `lint` prefix and add a unit test in a top-level crate that iterates over all known lint rules and diagnostic IDs to ensure the names are non-overlapping.
* We only render `[lint]` as the error code and add a note to the diagnostic mentioning the lint rule. This is similar to clippy and has the advantage that the header line remains short (`lint/some-long-rule-name` is very long ;))
* Any other form of adjusting the diagnostic rendering to make the distinction clear

I think we can defer this decision for now because the `DiagnosticId` contains all the relevant information to change the rendering accordingly.

## Why `Lint` and not `LintRule`

I see three kinds of diagnostics in Red Knot:

* Non-suppressible: Reveal type, IO errors, configuration errors, etc. (any `DiagnosticId`)
* Lints: code-related diagnostics that are suppressible.
* Lint rules: The same as lints, but they can be enabled or disabled in the configuration. This covers the majority of lints in Red Knot and the Ruff linter.

Our current implementation doesn't distinguish between lints and lint rules because we aren't aware of a suppressible code-related lint that can't be enabled or disabled in the configuration. The only lint that comes to my mind is maybe `division-by-zero` if we're 99.99% sure that it is always right. However, I want to keep the door open to making this distinction in the future if it proves useful.
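
As a rough sketch of the namespacing discussed under "Preventing duplicate diagnostic identifiers", the `lint/<rule>` rendering and the non-overlap check from the first option could look like the following. Everything here is hypothetical: the variants, the `BUILTIN` list, `name`, and `overlapping_names` are names invented for this example and are not part of the PR.

```rust
use std::collections::HashSet;
use std::fmt;

/// Hypothetical, reduced version of the identifier enum for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum DiagnosticId {
    Io,
    RevealedType,
    Lint(&'static str),
}

impl DiagnosticId {
    /// The closed (non-lint) identifiers, listed by hand for this sketch.
    pub const BUILTIN: &'static [DiagnosticId] =
        &[DiagnosticId::Io, DiagnosticId::RevealedType];

    /// The bare name, without any namespace prefix.
    pub fn name(self) -> &'static str {
        match self {
            DiagnosticId::Io => "io",
            DiagnosticId::RevealedType => "revealed-type",
            DiagnosticId::Lint(name) => name,
        }
    }
}

impl fmt::Display for DiagnosticId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Lints are rendered in their own `lint/` namespace.
            DiagnosticId::Lint(name) => write!(f, "lint/{name}"),
            other => f.write_str(other.name()),
        }
    }
}

/// Returns every lint name that collides with a built-in identifier.
/// A top-level unit test could assert that this list is empty.
pub fn overlapping_names(lint_names: &[&'static str]) -> Vec<&'static str> {
    let builtin: HashSet<&'static str> =
        DiagnosticId::BUILTIN.iter().map(|id| id.name()).collect();
    lint_names
        .iter()
        .copied()
        .filter(|name| builtin.contains(name))
        .collect()
}
```

Because the rendering lives entirely in the `Display` implementation, switching to a shorter `[lint]` header later would only touch that one `match` arm.
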

Another reason why I chose lint over lint rule (or just rule) is that I want to leave room for a future lint rule and lint phase concept:

* lint is the *what*: a specific code smell, pattern, or violation
* the lint rule is the *how*: I could see a future `LintRule` trait in `red_knot_python_linter` that provides the necessary hooks to run as part of the linter. A lint rule produces diagnostics for exactly one lint. This differs from the lints in `red_knot_python_semantic`, which don't run as "rules" in the Ruff sense. Instead, they're a side-product of type inference.
* the lint phase is a different form of *how*: A lint phase can produce many different lints in a single pass. This is a somewhat common pattern in Ruff where running one analysis collects the necessary information for finding many different lints
* diagnostic is the *presentation*: Unlike a lint, the diagnostic isn't the what, but how a specific lint gets presented. I expect that many lints can use one generic `LintDiagnostic`, but a few lints might need more flexibility and implement their own diagnostic rendering (or at least a custom `Diagnostic` implementation).

## Test Plan

`cargo test`