mirror of
https://github.com/astral-sh/ruff.git
synced 2025-10-02 22:55:08 +00:00
[red-knot] Lint registry and rule selection (#14874)
## Summary This is the third and last PR in this stack that adds support for toggling lints at a per-rule level. This PR introduces a new `LintRegistry`, a central index of known lints. The registry is required because we want to support lint rules from many different crates but need a way to look them up by name, e.g., when resolving a lint from a name in the configuration or analyzing a suppression comment. Adding a lint now requires two steps: 1. Declare the lint with `declare_lint` 2. Register the lint in the registry inside the `register_lints` function. I considered some more involved macros to avoid changes in two places. Still, I ultimately decided against it because a) it's just two places and b) I'd expect that registering a type checker lint will differ from registering a lint that runs as a rule in the linter. I worry that any more opinionated design could limit our options when working on the linter, so I kept it simple. The second part of this PR is the `RuleSelection`. It stores which lints are enabled and what severity they should use for created diagnostics. For now, the `RuleSelection` always gets initialized with all known lints and it uses their default level. ## Linter crates Each crate that defines lints should export a `register_lints` function that accepts a `&mut LintRegistryBuilder` to register all its known lints in the registry. This should make registering all known lints in a top-level crate easy: Just call `register_lints` of every crate that defines lint rules. I considered defining a `LintCollection` trait and even some fancy macros to accomplish the same but decided to go for this very simplistic approach for now. We can add more abstraction once needed. ## Lint rules This is a bit hand-wavy. I don't have a good sense for how our linter infrastructure will look like, but I expect we'll need a way to register the rules that should run as part of the red knot linter. One way is to keep doing what Ruff does by having one massive `checker` and each lint rule adds a call to itself in the relevant AST visitor methods. An alternative is that we have a `LintRule` trait that provides common hooks and implementations will be called at the "right time". Such a design would need a way to register all known lint implementations, possibly with the lint. This is where we'd probably want a dedicated `register_rule` method. A third option is that lint rules are handled separately from the `LintRegistry` and are specific to the linter crate. The current design should be flexible enough to support the three options. ## Documentation generation The documentation for all known lints can be generated by creating a factory, registering all lints by calling the `register_lints` methods, and then querying the registry for the metadata. ## Deserialization and Schema generation I haven't fully decided what the best approach is when it comes to deserializing lint rule names: * Reject invalid names in the deserializer. This gives us error messages with line and column numbers (by serde) * Don't validate lint rule names during deserialization; defer the validation until the configuration is resolved. This gives us more control over handling the error, e.g. emit a warning diagnostic instead of aborting when a rule isn't known. One technical challenge for both deserialization and schema generation is that the `Deserialize` and `JSONSchema` traits do not allow passing the `LintRegistry`, which is required to look up the lints by name. I suggest that we either rely on the salsa db being set for the current thread (`salsa::Attach`) or build our own thread-local storage for the `LintRegistry`. It's the caller's responsibility to make the lint registry available before calling `Deserialize` or `JSONSchema`. ## CLI support I prefer deferring adding support for enabling and disabling lints from the CLI for now because I think it will be easier to add once I've figured out how to handle configurations. ## Bitset optimization Ruff tracks the enabled rules using a cheap copyable `Bitset` instead of a hash map. This helped improve performance by a few percent (see https://github.com/astral-sh/ruff/pull/3606). However, this approach is no longer possible because lints have no "cheap" way to compute their index inside the registry (other than using a hash map). We could consider doing something similar to Salsa where each `LintMetadata` stores a `LazyLintIndex`. ``` pub struct LazyLintIndex { cached: OnceLock<(Nonce, LintIndex)> } impl LazyLintIndex { pub fn get(registry: &LintRegistry, lint: &'static LintMetadata) { let (nonce, index) = self.cached.get_or_init(|| registry.lint_index(lint)); if registry.nonce() == nonce { index } else { registry.lint_index(lint) } } ``` Each registry keeps a map from `LintId` to `LintIndex` where `LintIndex` is in the range of `0...registry.len()`. The `LazyLintIndex` is based on the assumption that every program has exactly **one** registry. This assumption allows to cache the `LintIndex` directly on the `LintMetadata`. The implementation falls back to the "slow" path if there is more than one registry at runtime. I was very close to implementing this optimization because it's kind of fun to implement. I ultimately decided against it because it adds complexity and I don't think it's worth doing in Red Knot today: * Red Knot only queries the rule selection when deciding whether or not to emit a diagnostic. It is rarely used to detect if a certain code block should run. This is different from Ruff where the rule selection is queried many times for every single AST node to determine which rules *should* run. * I'm not sure if a 2-3% performance improvement is worth the complexity I suggest revisiting this decision when working on the linter where a fast path for deciding if a rule is enabled might be more important (but that depends on how lint rules are implemented) ## Test Plan I removed a lint from the default rule registry, and the MD tests started failing because the diagnostics were no longer emitted.
This commit is contained in:
parent
6f8d8fa36b
commit
881375a8d9
9 changed files with 371 additions and 13 deletions
|
@ -1,3 +1,4 @@
|
|||
use crate::lint::RuleSelection;
|
||||
use ruff_db::files::File;
|
||||
use ruff_db::{Db as SourceDb, Upcast};
|
||||
|
||||
|
@ -5,6 +6,8 @@ use ruff_db::{Db as SourceDb, Upcast};
|
|||
#[salsa::db]
|
||||
pub trait Db: SourceDb + Upcast<dyn SourceDb> {
|
||||
fn is_file_open(&self, file: File) -> bool;
|
||||
|
||||
fn rule_selection(&self) -> &RuleSelection;
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
@ -13,23 +16,24 @@ pub(crate) mod tests {
|
|||
|
||||
use crate::program::{Program, SearchPathSettings};
|
||||
use crate::python_version::PythonVersion;
|
||||
use crate::ProgramSettings;
|
||||
use crate::{default_lint_registry, ProgramSettings};
|
||||
|
||||
use super::Db;
|
||||
use crate::lint::RuleSelection;
|
||||
use anyhow::Context;
|
||||
use ruff_db::files::{File, Files};
|
||||
use ruff_db::system::{DbWithTestSystem, System, SystemPathBuf, TestSystem};
|
||||
use ruff_db::vendored::VendoredFileSystem;
|
||||
use ruff_db::{Db as SourceDb, Upcast};
|
||||
|
||||
use super::Db;
|
||||
|
||||
#[salsa::db]
|
||||
pub(crate) struct TestDb {
|
||||
storage: salsa::Storage<Self>,
|
||||
files: Files,
|
||||
system: TestSystem,
|
||||
vendored: VendoredFileSystem,
|
||||
events: std::sync::Arc<std::sync::Mutex<Vec<salsa::Event>>>,
|
||||
events: Arc<std::sync::Mutex<Vec<salsa::Event>>>,
|
||||
rule_selection: Arc<RuleSelection>,
|
||||
}
|
||||
|
||||
impl TestDb {
|
||||
|
@ -38,8 +42,9 @@ pub(crate) mod tests {
|
|||
storage: salsa::Storage::default(),
|
||||
system: TestSystem::default(),
|
||||
vendored: red_knot_vendored::file_system().clone(),
|
||||
events: std::sync::Arc::default(),
|
||||
events: Arc::default(),
|
||||
files: Files::default(),
|
||||
rule_selection: Arc::new(RuleSelection::from_registry(&default_lint_registry())),
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -102,6 +107,10 @@ pub(crate) mod tests {
|
|||
fn is_file_open(&self, file: File) -> bool {
|
||||
!file.path(self).is_vendored_path()
|
||||
}
|
||||
|
||||
fn rule_selection(&self) -> &RuleSelection {
|
||||
&self.rule_selection
|
||||
}
|
||||
}
|
||||
|
||||
#[salsa::db]
|
||||
|
|
|
@ -2,6 +2,7 @@ use std::hash::BuildHasherDefault;
|
|||
|
||||
use rustc_hash::FxHasher;
|
||||
|
||||
use crate::lint::{LintRegistry, LintRegistryBuilder};
|
||||
pub use db::Db;
|
||||
pub use module_name::ModuleName;
|
||||
pub use module_resolver::{resolve_module, system_module_search_paths, Module};
|
||||
|
@ -27,3 +28,15 @@ mod unpack;
|
|||
mod util;
|
||||
|
||||
type FxOrderSet<V> = ordermap::set::OrderSet<V, BuildHasherDefault<FxHasher>>;
|
||||
|
||||
/// Creates a new registry with all known semantic lints.
|
||||
pub fn default_lint_registry() -> LintRegistry {
|
||||
let mut registry = LintRegistryBuilder::default();
|
||||
register_lints(&mut registry);
|
||||
registry.build()
|
||||
}
|
||||
|
||||
/// Register all known semantic lints.
|
||||
pub fn register_lints(registry: &mut LintRegistryBuilder) {
|
||||
types::register_lints(registry);
|
||||
}
|
||||
|
|
|
@ -1,5 +1,8 @@
|
|||
use itertools::Itertools;
|
||||
use ruff_db::diagnostic::{LintName, Severity};
|
||||
use rustc_hash::FxHashMap;
|
||||
use std::hash::Hasher;
|
||||
use thiserror::Error;
|
||||
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct LintMetadata {
|
||||
|
@ -224,3 +227,239 @@ macro_rules! declare_lint {
|
|||
};
|
||||
};
|
||||
}
|
||||
|
||||
/// A unique identifier for a lint rule.
|
||||
///
|
||||
/// Implements `PartialEq`, `Eq`, and `Hash` based on the `LintMetadata` pointer
|
||||
/// for fast comparison and lookup.
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
pub struct LintId {
|
||||
definition: &'static LintMetadata,
|
||||
}
|
||||
|
||||
impl LintId {
|
||||
pub const fn of(definition: &'static LintMetadata) -> Self {
|
||||
LintId { definition }
|
||||
}
|
||||
}
|
||||
|
||||
impl PartialEq for LintId {
|
||||
fn eq(&self, other: &Self) -> bool {
|
||||
std::ptr::eq(self.definition, other.definition)
|
||||
}
|
||||
}
|
||||
|
||||
impl Eq for LintId {}
|
||||
|
||||
impl std::hash::Hash for LintId {
|
||||
fn hash<H: Hasher>(&self, state: &mut H) {
|
||||
std::ptr::hash(self.definition, state);
|
||||
}
|
||||
}
|
||||
|
||||
impl std::ops::Deref for LintId {
|
||||
type Target = LintMetadata;
|
||||
|
||||
fn deref(&self) -> &Self::Target {
|
||||
self.definition
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Default, Debug)]
|
||||
pub struct LintRegistryBuilder {
|
||||
/// Registered lints that haven't been removed.
|
||||
lints: Vec<LintId>,
|
||||
|
||||
/// Lints indexed by name, including aliases and removed rules.
|
||||
by_name: FxHashMap<&'static str, LintEntry>,
|
||||
}
|
||||
|
||||
impl LintRegistryBuilder {
|
||||
#[track_caller]
|
||||
pub fn register_lint(&mut self, lint: &'static LintMetadata) {
|
||||
assert_eq!(
|
||||
self.by_name.insert(&*lint.name, lint.into()),
|
||||
None,
|
||||
"duplicate lint registration for '{name}'",
|
||||
name = lint.name
|
||||
);
|
||||
|
||||
if !lint.status.is_removed() {
|
||||
self.lints.push(LintId::of(lint));
|
||||
}
|
||||
}
|
||||
|
||||
#[track_caller]
|
||||
pub fn register_alias(&mut self, from: LintName, to: &'static LintMetadata) {
|
||||
let target = match self.by_name.get(to.name.as_str()) {
|
||||
Some(LintEntry::Lint(target) | LintEntry::Removed(target)) => target,
|
||||
Some(LintEntry::Alias(target)) => {
|
||||
panic!(
|
||||
"lint alias {from} -> {to:?} points to another alias {target:?}",
|
||||
target = target.name()
|
||||
)
|
||||
}
|
||||
None => panic!(
|
||||
"lint alias {from} -> {to} points to non-registered lint",
|
||||
to = to.name
|
||||
),
|
||||
};
|
||||
|
||||
assert_eq!(
|
||||
self.by_name
|
||||
.insert(from.as_str(), LintEntry::Alias(*target)),
|
||||
None,
|
||||
"duplicate lint registration for '{from}'",
|
||||
);
|
||||
}
|
||||
|
||||
pub fn build(self) -> LintRegistry {
|
||||
LintRegistry {
|
||||
lints: self.lints,
|
||||
by_name: self.by_name,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Default, Debug)]
|
||||
pub struct LintRegistry {
|
||||
lints: Vec<LintId>,
|
||||
by_name: FxHashMap<&'static str, LintEntry>,
|
||||
}
|
||||
|
||||
impl LintRegistry {
|
||||
/// Looks up a lint by its name.
|
||||
pub fn get(&self, code: &str) -> Result<LintId, GetLintError> {
|
||||
match self.by_name.get(code) {
|
||||
Some(LintEntry::Lint(metadata)) => Ok(*metadata),
|
||||
Some(LintEntry::Alias(lint)) => {
|
||||
if lint.status.is_removed() {
|
||||
Err(GetLintError::Removed(lint.name()))
|
||||
} else {
|
||||
Ok(*lint)
|
||||
}
|
||||
}
|
||||
Some(LintEntry::Removed(lint)) => Err(GetLintError::Removed(lint.name())),
|
||||
None => Err(GetLintError::Unknown(code.to_string())),
|
||||
}
|
||||
}
|
||||
|
||||
/// Returns all registered, non-removed lints.
|
||||
pub fn lints(&self) -> &[LintId] {
|
||||
&self.lints
|
||||
}
|
||||
|
||||
/// Returns an iterator over all known aliases and to their target lints.
|
||||
///
|
||||
/// This iterator includes aliases that point to removed lints.
|
||||
pub fn aliases(&self) -> impl Iterator<Item = (LintName, LintId)> + '_ {
|
||||
self.by_name.iter().filter_map(|(key, value)| {
|
||||
if let LintEntry::Alias(alias) = value {
|
||||
Some((LintName::of(key), *alias))
|
||||
} else {
|
||||
None
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
/// Iterates over all removed lints.
|
||||
pub fn removed(&self) -> impl Iterator<Item = LintId> + '_ {
|
||||
self.by_name.iter().filter_map(|(_, value)| {
|
||||
if let LintEntry::Removed(metadata) = value {
|
||||
Some(*metadata)
|
||||
} else {
|
||||
None
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Error, Debug, Clone)]
|
||||
pub enum GetLintError {
|
||||
/// The name maps to this removed lint.
|
||||
#[error("lint {0} has been removed")]
|
||||
Removed(LintName),
|
||||
|
||||
/// No lint with the given name is known.
|
||||
#[error("unknown lint {0}")]
|
||||
Unknown(String),
|
||||
}
|
||||
|
||||
#[derive(Debug, PartialEq, Eq)]
|
||||
pub enum LintEntry {
|
||||
/// An existing lint rule. Can be in preview, stable or deprecated.
|
||||
Lint(LintId),
|
||||
/// A lint rule that has been removed.
|
||||
Removed(LintId),
|
||||
Alias(LintId),
|
||||
}
|
||||
|
||||
impl From<&'static LintMetadata> for LintEntry {
|
||||
fn from(metadata: &'static LintMetadata) -> Self {
|
||||
if metadata.status.is_removed() {
|
||||
LintEntry::Removed(LintId::of(metadata))
|
||||
} else {
|
||||
LintEntry::Lint(LintId::of(metadata))
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Default)]
|
||||
pub struct RuleSelection {
|
||||
/// Map with the severity for each enabled lint rule.
|
||||
///
|
||||
/// If a rule isn't present in this map, then it should be considered disabled.
|
||||
lints: FxHashMap<LintId, Severity>,
|
||||
}
|
||||
|
||||
impl RuleSelection {
|
||||
/// Creates a new rule selection from all known lints in the registry that are enabled
|
||||
/// according to their default severity.
|
||||
pub fn from_registry(registry: &LintRegistry) -> Self {
|
||||
let lints = registry
|
||||
.lints()
|
||||
.iter()
|
||||
.filter_map(|lint| {
|
||||
Severity::try_from(lint.default_level())
|
||||
.ok()
|
||||
.map(|severity| (*lint, severity))
|
||||
})
|
||||
.collect();
|
||||
|
||||
RuleSelection { lints }
|
||||
}
|
||||
|
||||
/// Returns an iterator over all enabled lints.
|
||||
pub fn enabled(&self) -> impl Iterator<Item = LintId> + '_ {
|
||||
self.lints.keys().copied()
|
||||
}
|
||||
|
||||
/// Returns an iterator over all enabled lints and their severity.
|
||||
pub fn iter(&self) -> impl ExactSizeIterator<Item = (LintId, Severity)> + '_ {
|
||||
self.lints.iter().map(|(&lint, &severity)| (lint, severity))
|
||||
}
|
||||
|
||||
/// Returns the configured severity for the lint with the given id or `None` if the lint is disabled.
|
||||
pub fn severity(&self, lint: LintId) -> Option<Severity> {
|
||||
self.lints.get(&lint).copied()
|
||||
}
|
||||
|
||||
/// Enables `lint` and configures with the given `severity`.
|
||||
///
|
||||
/// Overrides any previous configuration for the lint.
|
||||
pub fn enable(&mut self, lint: LintId, severity: Severity) {
|
||||
self.lints.insert(lint, severity);
|
||||
}
|
||||
|
||||
/// Disables `lint` if it was previously enabled.
|
||||
pub fn disable(&mut self, lint: LintId) {
|
||||
self.lints.remove(&lint);
|
||||
}
|
||||
|
||||
/// Merges the enabled lints from `other` into this selection.
|
||||
///
|
||||
/// Lints from `other` will override any existing configuration.
|
||||
pub fn merge(&mut self, other: &RuleSelection) {
|
||||
self.lints.extend(other.iter());
|
||||
}
|
||||
}
|
||||
|
|
|
@ -7,6 +7,7 @@ use ruff_db::files::File;
|
|||
use ruff_python_ast as ast;
|
||||
|
||||
pub(crate) use self::builder::{IntersectionBuilder, UnionBuilder};
|
||||
pub(crate) use self::diagnostic::register_lints;
|
||||
pub use self::diagnostic::{TypeCheckDiagnostic, TypeCheckDiagnostics};
|
||||
pub(crate) use self::display::TypeArrayDisplay;
|
||||
pub(crate) use self::infer::{
|
||||
|
|
|
@ -1,4 +1,9 @@
|
|||
use crate::lint::{Level, LintMetadata, LintStatus};
|
||||
use crate::lint::{Level, LintId, LintMetadata, LintRegistryBuilder, LintStatus};
|
||||
use crate::types::string_annotation::{
|
||||
BYTE_STRING_TYPE_ANNOTATION, ESCAPE_CHARACTER_IN_FORWARD_ANNOTATION, FSTRING_TYPE_ANNOTATION,
|
||||
IMPLICIT_CONCATENATED_STRING_TYPE_ANNOTATION, INVALID_SYNTAX_IN_FORWARD_ANNOTATION,
|
||||
RAW_STRING_TYPE_ANNOTATION,
|
||||
};
|
||||
use crate::types::{ClassLiteralType, Type};
|
||||
use crate::{declare_lint, Db};
|
||||
use ruff_db::diagnostic::{Diagnostic, DiagnosticId, Severity};
|
||||
|
@ -10,6 +15,48 @@ use std::fmt::Formatter;
|
|||
use std::ops::Deref;
|
||||
use std::sync::Arc;
|
||||
|
||||
/// Registers all known type check lints.
|
||||
pub(crate) fn register_lints(registry: &mut LintRegistryBuilder) {
|
||||
registry.register_lint(&UNRESOLVED_REFERENCE);
|
||||
registry.register_lint(&POSSIBLY_UNRESOLVED_REFERENCE);
|
||||
registry.register_lint(&NOT_ITERABLE);
|
||||
registry.register_lint(&INDEX_OUT_OF_BOUNDS);
|
||||
registry.register_lint(&NON_SUBSCRIPTABLE);
|
||||
registry.register_lint(&UNRESOLVED_IMPORT);
|
||||
registry.register_lint(&POSSIBLY_UNBOUND_IMPORT);
|
||||
registry.register_lint(&ZERO_STEPSIZE_IN_SLICE);
|
||||
registry.register_lint(&INVALID_ASSIGNMENT);
|
||||
registry.register_lint(&INVALID_DECLARATION);
|
||||
registry.register_lint(&CONFLICTING_DECLARATIONS);
|
||||
registry.register_lint(&DIVISION_BY_ZERO);
|
||||
registry.register_lint(&CALL_NON_CALLABLE);
|
||||
registry.register_lint(&INVALID_TYPE_PARAMETER);
|
||||
registry.register_lint(&INVALID_TYPE_VARIABLE_CONSTRAINTS);
|
||||
registry.register_lint(&CYCLIC_CLASS_DEFINITION);
|
||||
registry.register_lint(&DUPLICATE_BASE);
|
||||
registry.register_lint(&INVALID_BASE);
|
||||
registry.register_lint(&INCONSISTENT_MRO);
|
||||
registry.register_lint(&INVALID_LITERAL_PARAMETER);
|
||||
registry.register_lint(&CALL_POSSIBLY_UNBOUND_METHOD);
|
||||
registry.register_lint(&POSSIBLY_UNBOUND_ATTRIBUTE);
|
||||
registry.register_lint(&UNRESOLVED_ATTRIBUTE);
|
||||
registry.register_lint(&CONFLICTING_METACLASS);
|
||||
registry.register_lint(&UNSUPPORTED_OPERATOR);
|
||||
registry.register_lint(&INVALID_CONTEXT_MANAGER);
|
||||
registry.register_lint(&UNDEFINED_REVEAL);
|
||||
registry.register_lint(&INVALID_PARAMETER_DEFAULT);
|
||||
registry.register_lint(&INVALID_TYPE_FORM);
|
||||
registry.register_lint(&INVALID_EXCEPTION_CAUGHT);
|
||||
|
||||
// String annotations
|
||||
registry.register_lint(&FSTRING_TYPE_ANNOTATION);
|
||||
registry.register_lint(&BYTE_STRING_TYPE_ANNOTATION);
|
||||
registry.register_lint(&RAW_STRING_TYPE_ANNOTATION);
|
||||
registry.register_lint(&IMPLICIT_CONCATENATED_STRING_TYPE_ANNOTATION);
|
||||
registry.register_lint(&INVALID_SYNTAX_IN_FORWARD_ANNOTATION);
|
||||
registry.register_lint(&ESCAPE_CHARACTER_IN_FORWARD_ANNOTATION);
|
||||
}
|
||||
|
||||
declare_lint! {
|
||||
/// ## What it does
|
||||
/// Checks for references to names that are not defined.
|
||||
|
@ -710,11 +757,12 @@ impl<'db> TypeCheckDiagnosticsBuilder<'db> {
|
|||
|
||||
pub(super) fn add_lint(
|
||||
&mut self,
|
||||
lint: &LintMetadata,
|
||||
lint: &'static LintMetadata,
|
||||
node: AnyNodeRef,
|
||||
message: std::fmt::Arguments,
|
||||
) {
|
||||
let Ok(severity) = Severity::try_from(lint.default_level()) else {
|
||||
// Skip over diagnostics if the rule is disabled.
|
||||
let Some(severity) = self.db.rule_selection().severity(LintId::of(lint)) else {
|
||||
return;
|
||||
};
|
||||
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue