language-servers/ruff - Forgejo: Beyond coding. We Forge.

mirror of https://github.com/astral-sh/ruff.git synced 2025-09-30 05:45:24 +00:00

Author	SHA1	Message	Date
Micha Reiser	881375a8d9	[red-knot] Lint registry and rule selection (#14874 ) ## Summary This is the third and last PR in this stack that adds support for toggling lints at a per-rule level. This PR introduces a new `LintRegistry`, a central index of known lints. The registry is required because we want to support lint rules from many different crates but need a way to look them up by name, e.g., when resolving a lint from a name in the configuration or analyzing a suppression comment. Adding a lint now requires two steps: 1. Declare the lint with `declare_lint` 2. Register the lint in the registry inside the `register_lints` function. I considered some more involved macros to avoid changes in two places. Still, I ultimately decided against it because a) it's just two places and b) I'd expect that registering a type checker lint will differ from registering a lint that runs as a rule in the linter. I worry that any more opinionated design could limit our options when working on the linter, so I kept it simple. The second part of this PR is the `RuleSelection`. It stores which lints are enabled and what severity they should use for created diagnostics. For now, the `RuleSelection` always gets initialized with all known lints and it uses their default level. ## Linter crates Each crate that defines lints should export a `register_lints` function that accepts a `&mut LintRegistryBuilder` to register all its known lints in the registry. This should make registering all known lints in a top-level crate easy: Just call `register_lints` of every crate that defines lint rules. I considered defining a `LintCollection` trait and even some fancy macros to accomplish the same but decided to go for this very simplistic approach for now. We can add more abstraction once needed. ## Lint rules This is a bit hand-wavy. 
I don't have a good sense of what our linter infrastructure will look like, but I expect we'll need a way to register the rules that should run as part of the red knot linter. One way is to keep doing what Ruff does by having one massive `checker` and each lint rule adds a call to itself in the relevant AST visitor methods. An alternative is that we have a `LintRule` trait that provides common hooks and implementations will be called at the "right time". Such a design would need a way to register all known lint implementations, possibly with the lint. This is where we'd probably want a dedicated `register_rule` method. A third option is that lint rules are handled separately from the `LintRegistry` and are specific to the linter crate. The current design should be flexible enough to support the three options. ## Documentation generation The documentation for all known lints can be generated by creating a factory, registering all lints by calling the `register_lints` methods, and then querying the registry for the metadata. ## Deserialization and Schema generation I haven't fully decided what the best approach is when it comes to deserializing lint rule names: * Reject invalid names in the deserializer. This gives us error messages with line and column numbers (by serde) * Don't validate lint rule names during deserialization; defer the validation until the configuration is resolved. This gives us more control over handling the error, e.g. emit a warning diagnostic instead of aborting when a rule isn't known. One technical challenge for both deserialization and schema generation is that the `Deserialize` and `JSONSchema` traits do not allow passing the `LintRegistry`, which is required to look up the lints by name. I suggest that we either rely on the salsa db being set for the current thread (`salsa::Attach`) or build our own thread-local storage for the `LintRegistry`. 
It's the caller's responsibility to make the lint registry available before calling `Deserialize` or `JSONSchema`. ## CLI support I prefer to defer adding support for enabling and disabling lints from the CLI for now because I think it will be easier to add once I've figured out how to handle configurations. ## Bitset optimization Ruff tracks the enabled rules using a cheap copyable `Bitset` instead of a hash map. This helped improve performance by a few percent (see https://github.com/astral-sh/ruff/pull/3606). However, this approach is no longer possible because lints have no "cheap" way to compute their index inside the registry (other than using a hash map). We could consider doing something similar to Salsa where each `LintMetadata` stores a `LazyLintIndex`. ``` pub struct LazyLintIndex { cached: OnceLock<(Nonce, LintIndex)> } impl LazyLintIndex { pub fn get(&self, registry: &LintRegistry, lint: &'static LintMetadata) -> LintIndex { let (nonce, index) = self.cached.get_or_init(|| (registry.nonce(), registry.lint_index(lint))); if registry.nonce() == *nonce { *index } else { registry.lint_index(lint) } } } ``` Each registry keeps a map from `LintId` to `LintIndex` where `LintIndex` is in the range of `0..registry.len()`. The `LazyLintIndex` is based on the assumption that every program has exactly one registry. This assumption makes it possible to cache the `LintIndex` directly on the `LintMetadata`. The implementation falls back to the "slow" path if there is more than one registry at runtime. I was very close to implementing this optimization because it's kind of fun to implement. I ultimately decided against it because it adds complexity and I don't think it's worth doing in Red Knot today: * Red Knot only queries the rule selection when deciding whether or not to emit a diagnostic. It is rarely used to detect if a certain code block should run. This is different from Ruff where the rule selection is queried many times for every single AST node to determine which rules should run. 
* I'm not sure if a 2-3% performance improvement is worth the complexity. I suggest revisiting this decision when working on the linter, where a fast path for deciding if a rule is enabled might be more important (but that depends on how lint rules are implemented). ## Test Plan I removed a lint from the default rule registry, and the MD tests started failing because the diagnostics were no longer emitted.	2024-12-11 13:25:19 +01:00
Micha Reiser	5fc8e5d80e	[red-knot] Add infrastructure to declare lints (#14873 ) ## Summary This is the second PR out of three that adds support for enabling/disabling lint rules in Red Knot. You may want to take a look at the [first PR](https://github.com/astral-sh/ruff/pull/14869) in this stack to familiarize yourself with the terminology used. This PR adds a new syntax to define a lint: ```rust declare_lint! { /// ## What it does /// Checks for references to names that are not defined. /// /// ## Why is this bad? /// Using an undefined variable will raise a `NameError` at runtime. /// /// ## Example /// /// ```python /// print(x) # NameError: name 'x' is not defined /// ``` pub(crate) static UNRESOLVED_REFERENCE = { summary: "detects references to names that are not defined", status: LintStatus::preview("1.0.0"), default_level: Level::Warn, } } ``` A lint has a name and metadata about its status (preview, stable, removed, deprecated), the default diagnostic level (unless the configuration changes it), and documentation. I use a macro here to derive the kebab-case name and extract the documentation automatically. This PR doesn't yet add any mechanism to discover all known lints. This will be added in the next and last PR in this stack. ## Documentation I documented some rules but then decided that it's probably not my best use of time if I document all of them now (it also means that I play catch-up with all of you forever). That's why I left some rules undocumented (marked with TODO). ## Where is the best place to define all lints? I'm not sure. I think what I have in this PR is fine but I also don't love it because most lints are in a single place but not all of them. If you have ideas, let me know. ## Why is the message not part of the lint, unlike Ruff's `Violation`? I understand that the main motivation for defining `message` on `Violation` in Ruff is to remove the need to repeat the same message over and over again. I'm not sure if this is an actual problem. 
Most rules only emit a diagnostic in a single place and they commonly use different messages if they emit diagnostics in different code paths, requiring extra fields on the `Violation` struct. That's why I'm not convinced that there's an actual need for it and there are alternatives that can reduce the repetition when creating a diagnostic: * Create a helper function. We already do this in red knot with the `add_xy` methods * Create a custom `Diagnostic` implementation that tailors the entire diagnostic and pre-codes e.g. the message. Avoiding an extra field on the `Violation` also removes the need to allocate intermediate strings, as is commonly the case in Ruff. Instead, Red Knot can use a borrowed string with `format_args`. ## Test Plan `cargo test`	2024-12-10 16:14:44 +00:00
Dhruv Manilawala	e302c2de7c	Cached inference of all definitions in an unpacking (#13979 ) ## Summary This PR adds a new salsa query and an ingredient to resolve all the variables involved in an unpacking assignment like `(a, b) = (1, 2)` at once. Previously, we'd recursively try to match the correct type for each definition individually, which resulted in duplicate diagnostics. This PR still doesn't solve the duplicate diagnostics issue because that requires a different solution, like using a salsa accumulator or de-duplicating the diagnostics manually. Related: #13773 ## Test Plan Make sure that all unpack assignment test cases pass and that there are no panics in the corpus tests. ## Todo - [x] Look at the performance regression	2024-11-04 17:11:57 +05:30
David Peter	53fa32a389	[red-knot] Remove `Type::Unbound` (#13980 ) ## Summary - Remove `Type::Unbound` - Handle (potential) unboundness as a concept orthogonal to the type system (see new `Symbol` type) - Improve existing and add new diagnostics related to (potential) unboundness closes #13671 ## Test Plan - Update existing markdown-based tests - Add new tests for added/modified functionality	2024-10-31 20:05:53 +01:00
David Peter	77ae0ccf0f	[red-knot] Infer subscript expression types for bytes literals (#13901 ) ## Summary Infer subscript expression types for bytes literals: ```py b = b"\x00abc\xff" reveal_type(b[0]) # revealed: Literal[b"\x00"] reveal_type(b[1]) # revealed: Literal[b"a"] reveal_type(b[-1]) # revealed: Literal[b"\xff"] reveal_type(b[-2]) # revealed: Literal[b"c"] reveal_type(b[False]) # revealed: Literal[b"\x00"] reveal_type(b[True]) # revealed: Literal[b"a"] ``` part of #13689 (https://github.com/astral-sh/ruff/issues/13689#issuecomment-2404285064) ## Test Plan - New Markdown-based tests (see `mdtest/subscript/bytes.md`) - Added missing test for `string_literal[bool_literal]`	2024-10-24 12:07:41 +02:00
Micha Reiser	653c09001a	Use an empty vendored file system in Ruff (#13436 ) ## Summary This PR removes the typeshed stubs from the vendored file system shipped with ruff and instead ships an empty "typeshed". Making the typeshed files optional required extracting the typeshed files into a new `ruff_vendored` crate. I do like this even if all our builds always include typeshed because it means `red_knot_python_semantic` contains less code that needs compiling. This also allows us to use deflate because the compression algorithm doesn't matter for an archive containing a single, empty file. ## Test Plan `cargo test` I verified with `cargo tree -f "{p} {f}" -p <package>` that: * red_knot_wasm: enables `deflate` compression * red_knot: enables `zstd` compression * `ruff`: uses stored I'm not quite sure how to build the binary that maturin builds but comparing the release artifact size with `strip = true` shows a `1.5MB` size reduction --------- Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>	2024-09-21 16:31:42 +00:00
Charlie Marsh	4e935f7d7d	Add a subcommand to generate dependency graphs (#13402 ) ## Summary This PR adds an experimental Ruff subcommand to generate dependency graphs based on module resolution. A few highlights: - You can generate either dependency or dependent graphs via the `--direction` command-line argument. - Like Pants, we also provide an option to identify imports from string literals (`--detect-string-imports`). - Users can also provide additional dependency data via the `include-dependencies` key under `[tool.ruff.import-map]`. This map uses file paths as keys, and lists of strings as values. Those strings can be file paths or globs. The dependency resolution uses the red-knot module resolver which is intended to be fully spec compliant, so it's also a chance to expose the module resolver in a real-world setting. The CLI is, e.g., `ruff graph build ../autobot`, which will output a JSON map from file to files it depends on for the `autobot` project.	2024-09-19 21:06:32 -04:00
Carl Meyer	dcfebaa4a8	[red-knot] use declared types in inference/checking (#13335 ) Use declared types in inference and checking. This means several things: * Imports prefer declarations over inference, when declarations are available. * When we encounter a binding, we check that the bound value's inferred type is assignable to the live declarations of the bound symbol, if any. * When we encounter a declaration, we check that the declared type is assignable from the inferred type of the symbol from previous bindings, if any. * When we encounter a binding+declaration, we check that the inferred type of the bound value is assignable to the declared type.	2024-09-17 08:11:06 -07:00
Alex Waygood	46a457318d	[red-knot] Add type inference for basic `for` loops (#13195 )	2024-09-04 10:19:50 +00:00
Simon	46e687e8d1	[red-knot] Condense literals display by types (#13185 ) Co-authored-by: Micha Reiser <micha@reiser.io>	2024-09-03 07:23:28 +00:00
Micha Reiser	dce87c21fd	Eagerly validate typeshed versions (#12786 )	2024-08-21 15:49:53 +00:00
Alex Waygood	cf1a57df5a	Remove `red_knot_python_semantic::python_version::TargetVersion` (#12790 )	2024-08-10 14:28:31 +01:00
Alex Waygood	c4e651921b	[red-knot] Move, rename and make public the `PyVersion` type (#12782 )	2024-08-09 16:49:17 +01:00
Micha Reiser	2abfab0f9b	Move Program and related structs to `red_knot_python_semantic` (#12777 )	2024-08-09 11:50:45 +02:00
Alex Waygood	f1de08c2a0	[red-knot] Merge the semantic and module-resolver crates (#12751 )	2024-08-08 15:34:11 +01:00
Micha Reiser	e18b4e42d3	[red-knot] Upgrade to the new new salsa (#12406 )	2024-07-29 07:21:24 +00:00
Alex Waygood	d8cf8ac2ef	[red-knot] Resolve symbols from `builtins.pyi` in the stdlib if they cannot be found in other scopes (#12390 ) Co-authored-by: Carl Meyer <carl@astral.sh>	2024-07-19 17:44:56 +01:00
Carl Meyer	0e44235981	[red-knot] intern types using Salsa (#12061 ) Intern types using Salsa interning instead of in the `TypeInference` result. This eliminates the need for `TypingContext`, and also paves the way for finer-grained type inference queries.	2024-07-05 12:16:37 -07:00
Micha Reiser	37f260b5af	Introduce `HasTy` trait and `SemanticModel` facade (#11963 )	2024-07-01 14:48:27 +02:00
Micha Reiser	5109b50bb3	Use `CompactString` for `Identifier` (#12101 )	2024-07-01 10:06:02 +02:00
Alex Waygood	736a4ead14	[red-knot] Move module-resolution logic to its own crate (#11964 )	2024-06-21 13:25:44 +00:00
Micha Reiser	2dfbf118d7	[red-knot] Extract `red_knot_python_semantic` crate (#11926 )	2024-06-20 13:24:24 +02:00

22 commits
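
The two-step flow described in the `LintRegistry` commit (881375a8d9), declaring a lint and then registering it through a crate-level `register_lints` function, can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the code from the PR: the field layout of `LintMetadata`, the `register_lint`, `build`, `get`, `lints`, `from_registry`, and `level` methods, the `default_registry` helper, and the `EXAMPLE_LINT` static are all hypothetical, and plain `static` items stand in for the `declare_lint!` macro.

```rust
use std::collections::HashMap;

/// Severity used for diagnostics created by a lint.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Warn,
    Error,
}

/// Static description of a lint (hypothetical, simplified shape).
#[derive(Debug)]
pub struct LintMetadata {
    pub name: &'static str,
    pub summary: &'static str,
    pub default_level: Level,
}

// Step 1: declare the lint. The commit uses `declare_lint!` for this;
// a plain static is used here to keep the sketch free of macros.
pub static UNRESOLVED_REFERENCE: LintMetadata = LintMetadata {
    name: "unresolved-reference",
    summary: "detects references to names that are not defined",
    default_level: Level::Warn,
};

// Hypothetical second lint, present only to make the example less trivial.
pub static EXAMPLE_LINT: LintMetadata = LintMetadata {
    name: "example-lint",
    summary: "placeholder lint used by this sketch",
    default_level: Level::Error,
};

/// Collects lints from any number of crates before the registry is frozen.
#[derive(Default)]
pub struct LintRegistryBuilder {
    lints: Vec<&'static LintMetadata>,
}

impl LintRegistryBuilder {
    pub fn register_lint(&mut self, lint: &'static LintMetadata) {
        self.lints.push(lint);
    }

    pub fn build(self) -> LintRegistry {
        // Index the lints by name so configuration values and
        // suppression comments can be resolved to a lint.
        let by_name = self.lints.iter().map(|lint| (lint.name, *lint)).collect();
        LintRegistry { lints: self.lints, by_name }
    }
}

/// Central index of known lints, supporting lookup by name.
pub struct LintRegistry {
    lints: Vec<&'static LintMetadata>,
    by_name: HashMap<&'static str, &'static LintMetadata>,
}

impl LintRegistry {
    pub fn get(&self, name: &str) -> Option<&'static LintMetadata> {
        self.by_name.get(name).copied()
    }

    pub fn lints(&self) -> &[&'static LintMetadata] {
        &self.lints
    }
}

// Step 2: each crate that defines lints exports a `register_lints` function.
pub fn register_lints(builder: &mut LintRegistryBuilder) {
    builder.register_lint(&UNRESOLVED_REFERENCE);
    builder.register_lint(&EXAMPLE_LINT);
}

/// Stores which lints are enabled and the severity to use for each.
pub struct RuleSelection {
    levels: HashMap<&'static str, Level>,
}

impl RuleSelection {
    /// Enables every known lint at its default level, as the commit describes.
    pub fn from_registry(registry: &LintRegistry) -> Self {
        let levels = registry
            .lints()
            .iter()
            .map(|lint| (lint.name, lint.default_level))
            .collect();
        Self { levels }
    }

    /// Returns the severity for an enabled lint, or `None` if it is disabled.
    pub fn level(&self, lint: &LintMetadata) -> Option<Level> {
        self.levels.get(lint.name).copied()
    }
}

/// A top-level crate calls `register_lints` of every crate that defines lints.
pub fn default_registry() -> LintRegistry {
    let mut builder = LintRegistryBuilder::default();
    register_lints(&mut builder);
    builder.build()
}
```

Keying both maps by the lint name mirrors the hash-map lookup the commit message mentions when explaining why the `Bitset` approach from Ruff no longer applies; a cached index such as `LazyLintIndex` would replace that lookup on the hot path.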