mirror of
https://github.com/astral-sh/ruff.git
synced 2025-09-29 21:35:58 +00:00

## Summary

This is the third and last PR in this stack that adds support for toggling lints at a per-rule level.

This PR introduces a new `LintRegistry`, a central index of known lints. The registry is required because we want to support lint rules from many different crates but need a way to look them up by name, e.g., when resolving a lint from a name in the configuration or analyzing a suppression comment.

Adding a lint now requires two steps:

1. Declare the lint with `declare_lint`
2. Register the lint in the registry inside the `register_lints` function.

I considered some more involved macros to avoid changes in two places. Still, I ultimately decided against it because a) it's just two places and b) I'd expect that registering a type checker lint will differ from registering a lint that runs as a rule in the linter. I worry that any more opinionated design could limit our options when working on the linter, so I kept it simple.

The second part of this PR is the `RuleSelection`. It stores which lints are enabled and what severity they should use for created diagnostics. For now, the `RuleSelection` always gets initialized with all known lints and it uses their default level.

## Linter crates

Each crate that defines lints should export a `register_lints` function that accepts a `&mut LintRegistryBuilder` to register all its known lints in the registry. This should make registering all known lints in a top-level crate easy: Just call `register_lints` of every crate that defines lint rules.

I considered defining a `LintCollection` trait and even some fancy macros to accomplish the same but decided to go for this very simplistic approach for now. We can add more abstraction once needed.

## Lint rules

This is a bit hand-wavy. I don't have a good sense of what our linter infrastructure will look like, but I expect we'll need a way to register the rules that should run as part of the red knot linter.

One way is to keep doing what Ruff does by having one massive `checker` and each lint rule adds a call to itself in the relevant AST visitor methods. An alternative is that we have a `LintRule` trait that provides common hooks and implementations will be called at the "right time". Such a design would need a way to register all known lint implementations, possibly with the lint. This is where we'd probably want a dedicated `register_rule` method. A third option is that lint rules are handled separately from the `LintRegistry` and are specific to the linter crate.

The current design should be flexible enough to support the three options.

## Documentation generation

The documentation for all known lints can be generated by creating a factory, registering all lints by calling the `register_lints` methods, and then querying the registry for the metadata.

## Deserialization and Schema generation

I haven't fully decided what the best approach is when it comes to deserializing lint rule names:

* Reject invalid names in the deserializer. This gives us error messages with line and column numbers (by serde)
* Don't validate lint rule names during deserialization; defer the validation until the configuration is resolved. This gives us more control over handling the error, e.g. emit a warning diagnostic instead of aborting when a rule isn't known.

One technical challenge for both deserialization and schema generation is that the `Deserialize` and `JSONSchema` traits do not allow passing the `LintRegistry`, which is required to look up the lints by name. I suggest that we either rely on the salsa db being set for the current thread (`salsa::Attach`) or build our own thread-local storage for the `LintRegistry`. It's the caller's responsibility to make the lint registry available before calling `Deserialize` or `JSONSchema`.

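The thread-local option could look roughly like the sketch below. It is not taken from the PR: the simplified `LintRegistry`, `LintId`, `with_registry`, `resolve_lint_name`, and `LookupError` are hypothetical stand-ins, and the lint names are only illustrative. The idea is that the caller installs a registry for the duration of a closure, and code that cannot receive the registry as an argument (such as a `Deserialize` implementation) looks names up through the thread-local slot.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical identifier for a registered lint.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LintId(pub usize);

/// Hypothetical, heavily simplified stand-in for the real `LintRegistry`.
#[derive(Debug, Default, Clone)]
pub struct LintRegistry {
    by_name: HashMap<&'static str, LintId>,
}

impl LintRegistry {
    /// Builds a registry in which each name maps to its position in `names`.
    pub fn with_lints(names: &[&'static str]) -> Self {
        let by_name = names
            .iter()
            .enumerate()
            .map(|(index, name)| (*name, LintId(index)))
            .collect();
        Self { by_name }
    }

    pub fn get(&self, name: &str) -> Option<LintId> {
        self.by_name.get(name).copied()
    }
}

thread_local! {
    /// The registry that name lookups on this thread should use, if any.
    static CURRENT_REGISTRY: RefCell<Option<LintRegistry>> = RefCell::new(None);
}

/// Makes `registry` available on the current thread while `f` runs and
/// restores the previously installed registry afterwards.
/// (This sketch does not restore the previous value if `f` panics.)
pub fn with_registry<R>(registry: LintRegistry, f: impl FnOnce() -> R) -> R {
    let previous = CURRENT_REGISTRY.with(|slot| slot.borrow_mut().replace(registry));
    let result = f();
    CURRENT_REGISTRY.with(|slot| *slot.borrow_mut() = previous);
    result
}

#[derive(Debug, PartialEq, Eq)]
pub enum LookupError {
    /// The caller did not install a registry before resolving names.
    NoRegistry,
    /// A registry is installed but does not know the lint.
    UnknownLint(String),
}

/// Resolves a lint name without taking the registry as an argument,
/// which is what a `Deserialize` implementation would have to do.
pub fn resolve_lint_name(name: &str) -> Result<LintId, LookupError> {
    CURRENT_REGISTRY.with(|slot| {
        let slot = slot.borrow();
        let registry = slot.as_ref().ok_or(LookupError::NoRegistry)?;
        registry
            .get(name)
            .ok_or_else(|| LookupError::UnknownLint(name.to_string()))
    })
}
```

Returning a distinct `NoRegistry` error keeps the "caller forgot to install the registry" case separate from "the rule name is unknown", which matters if unknown names should only produce a warning.
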
## CLI support

I prefer deferring adding support for enabling and disabling lints from the CLI for now because I think it will be easier to add once I've figured out how to handle configurations.

## Bitset optimization

Ruff tracks the enabled rules using a cheap copyable `Bitset` instead of a hash map. This helped improve performance by a few percent (see https://github.com/astral-sh/ruff/pull/3606). However, this approach is no longer possible because lints have no "cheap" way to compute their index inside the registry (other than using a hash map).

We could consider doing something similar to Salsa where each `LintMetadata` stores a `LazyLintIndex`.

```rust
use std::sync::OnceLock;

pub struct LazyLintIndex {
    cached: OnceLock<(Nonce, LintIndex)>,
}

impl LazyLintIndex {
    // Sketch only: assumes `Nonce` and `LintIndex` are `Copy` and `Nonce: PartialEq`.
    pub fn get(&self, registry: &LintRegistry, lint: &'static LintMetadata) -> LintIndex {
        // Cache the index together with the nonce of the registry that computed it.
        let (nonce, index) = self
            .cached
            .get_or_init(|| (registry.nonce(), registry.lint_index(lint)));

        if registry.nonce() == *nonce {
            // Fast path: the cached index belongs to this registry.
            *index
        } else {
            // Slow path: a different registry, so look the index up again.
            registry.lint_index(lint)
        }
    }
}
```

Each registry keeps a map from `LintId` to `LintIndex` where `LintIndex` is in the range of `0..registry.len()`.

The `LazyLintIndex` is based on the assumption that every program has exactly **one** registry. This assumption allows caching the `LintIndex` directly on the `LintMetadata`. The implementation falls back to the "slow" path if there is more than one registry at runtime.

I was very close to implementing this optimization because it's kind of fun to implement. I ultimately decided against it because it adds complexity and I don't think it's worth doing in Red Knot today:

* Red Knot only queries the rule selection when deciding whether or not to emit a diagnostic. It is rarely used to detect if a certain code block should run. This is different from Ruff where the rule selection is queried many times for every single AST node to determine which rules *should* run.
* I'm not sure if a 2-3% performance improvement is worth the complexity

I suggest revisiting this decision when working on the linter, where a fast path for deciding if a rule is enabled might be more important (but that depends on how lint rules are implemented).

## Test Plan

I removed a lint from the default rule registry, and the MD tests started failing because the diagnostics were no longer emitted.

42 lines
1.1 KiB
Rust
use std::hash::BuildHasherDefault;

use rustc_hash::FxHasher;

use crate::lint::{LintRegistry, LintRegistryBuilder};
pub use db::Db;
pub use module_name::ModuleName;
pub use module_resolver::{resolve_module, system_module_search_paths, Module};
pub use program::{Program, ProgramSettings, SearchPathSettings, SitePackages};
pub use python_version::PythonVersion;
pub use semantic_model::{HasTy, SemanticModel};

pub mod ast_node_ref;
mod db;
pub mod lint;
mod module_name;
mod module_resolver;
mod node_key;
mod program;
mod python_version;
pub mod semantic_index;
mod semantic_model;
pub(crate) mod site_packages;
mod stdlib;
pub(crate) mod symbol;
pub mod types;
mod unpack;
mod util;

type FxOrderSet<V> = ordermap::set::OrderSet<V, BuildHasherDefault<FxHasher>>;

/// Creates a new registry with all known semantic lints.
pub fn default_lint_registry() -> LintRegistry {
    let mut registry = LintRegistryBuilder::default();
    register_lints(&mut registry);
    registry.build()
}

/// Register all known semantic lints.
pub fn register_lints(registry: &mut LintRegistryBuilder) {
    types::register_lints(registry);
}
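
To make the declare-then-register flow behind `default_lint_registry` and `register_lints` more concrete, here is a minimal self-contained sketch. It does not reproduce the real `lint` module: the fields of `LintMetadata`, the `Level` enum, the `register_lint`, `get`, `lints`, `from_registry`, and `level` methods, and the two example lints are assumptions made for illustration, and plain `static` items stand in for whatever `declare_lint` generates.

```rust
use std::collections::HashMap;

/// Hypothetical severity levels for a lint.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Ignore,
    Warn,
    Error,
}

/// Hypothetical, simplified lint metadata.
#[derive(Debug)]
pub struct LintMetadata {
    pub name: &'static str,
    pub default_level: Level,
}

// Step 1: declare the lints (stand-in for `declare_lint`).
pub static DIVISION_BY_ZERO: LintMetadata = LintMetadata {
    name: "division-by-zero",
    default_level: Level::Warn,
};
pub static UNRESOLVED_IMPORT: LintMetadata = LintMetadata {
    name: "unresolved-import",
    default_level: Level::Error,
};

#[derive(Default)]
pub struct LintRegistryBuilder {
    lints: Vec<&'static LintMetadata>,
}

impl LintRegistryBuilder {
    pub fn register_lint(&mut self, lint: &'static LintMetadata) {
        self.lints.push(lint);
    }

    /// Freezes the builder and indexes the lints by name.
    pub fn build(self) -> LintRegistry {
        let by_name = self.lints.iter().map(|lint| (lint.name, *lint)).collect();
        LintRegistry { lints: self.lints, by_name }
    }
}

pub struct LintRegistry {
    lints: Vec<&'static LintMetadata>,
    by_name: HashMap<&'static str, &'static LintMetadata>,
}

impl LintRegistry {
    /// Looks a lint up by name, e.g. for a configuration entry.
    pub fn get(&self, name: &str) -> Option<&'static LintMetadata> {
        self.by_name.get(name).copied()
    }

    pub fn lints(&self) -> &[&'static LintMetadata] {
        &self.lints
    }
}

/// Stores the severity to use for each enabled lint.
pub struct RuleSelection {
    levels: HashMap<&'static str, Level>,
}

impl RuleSelection {
    /// Enables every registered lint at its default level.
    pub fn from_registry(registry: &LintRegistry) -> Self {
        let levels = registry
            .lints()
            .iter()
            .map(|lint| (lint.name, lint.default_level))
            .collect();
        Self { levels }
    }

    /// Returns the configured level, or `None` if the lint is not selected.
    pub fn level(&self, lint: &LintMetadata) -> Option<Level> {
        self.levels.get(lint.name).copied()
    }
}

// Step 2: register the lints; a top-level crate calls this for every lint crate.
pub fn register_lints(registry: &mut LintRegistryBuilder) {
    registry.register_lint(&DIVISION_BY_ZERO);
    registry.register_lint(&UNRESOLVED_IMPORT);
}
```

In this sketch a lint that is declared but never passed to `register_lint` is simply absent from both the registry and the selection, which mirrors the behavior described in the test plan.
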