language-servers/ruff

mirror of https://github.com/astral-sh/ruff.git synced 2025-09-29 05:15:12 +00:00

Author	SHA1	Message	Date
Dhruv Manilawala	68ada05b00	[red-knot] Infer value expr for empty list / tuple target (#15121 ) ## Summary This PR resolves https://github.com/astral-sh/ruff/pull/15058#discussion_r1893868406 by inferring the value expression even if there are no targets in the list / tuple expression. ## Test Plan Remove TODO from corpus tests, making sure it doesn't panic.	2024-12-23 16:00:35 +05:30
Micha Reiser	2f85749fa0	`type: ignore[codes]` and `knot: ignore` (#15078 )	2024-12-23 10:52:43 +01:00
Dhruv Manilawala	113c804a62	[red-knot] Add support for unpacking `for` target (#15058 ) ## Summary Related to #13773 This PR adds support for unpacking `for` statement targets. This involves updating the `value` field in the `Unpack` target to use an enum which specifies "where did the value expression come from?". This is because for an iterable expression, we need to unpack the iterator type while for an assignment statement we need to unpack the value type itself. And, this needs to be done in the unpack query. ### Question One of the ways unpacking works in a `for` statement is by looking at the union of the types because if the iterable expression is a tuple then the iterator type will be the union of all the types in the tuple. This means that the test cases that will test the unpacking in a `for` statement will also implicitly test the unpacking union logic. I was wondering if it makes sense to merge these cases and only add the ones that are specific to the union unpacking or `for` statement unpacking logic. ## Test Plan Add test cases involving iterating over a tuple type. I've intentionally left out certain cases for now and I'm curious to know any thoughts on the above query.	2024-12-23 06:13:49 +00:00
David Peter	000948ad3b	[red-knot] Statically known branches (#15019 ) ## Summary This changeset adds support for precise type-inference and boundness-handling of definitions inside control-flow branches with statically-known conditions, i.e. test-expressions whose truthiness we can unambiguously infer as always false or always true. This branch also includes: - `sys.platform` support - statically-known branches handling for Boolean expressions and while loops - new `target-version` requirements in some Markdown tests which were now required due to the understanding of `sys.version_info` branches. closes #12700 closes #15034 ## Performance ### `tomllib`, -7%, needs to resolve one additional module (sys) \| Command \| Mean [ms] \| Min [ms] \| Max [ms] \| Relative \| \|:---\|---:\|---:\|---:\|---:\| \| `./red_knot_main --project /home/shark/tomllib` \| 22.2 ± 1.3 \| 19.1 \| 25.6 \| 1.00 \| \| `./red_knot_feature --project /home/shark/tomllib` \| 23.8 ± 1.6 \| 20.8 \| 28.6 \| 1.07 ± 0.09 \| ### `black`, -6% \| Command \| Mean [ms] \| Min [ms] \| Max [ms] \| Relative \| \|:---\|---:\|---:\|---:\|---:\| \| `./red_knot_main --project /home/shark/black` \| 129.3 ± 5.1 \| 119.0 \| 137.8 \| 1.00 \| \| `./red_knot_feature --project /home/shark/black` \| 136.5 ± 6.8 \| 123.8 \| 147.5 \| 1.06 ± 0.07 \| ## Test Plan - New Markdown tests for the main feature in `statically-known-branches.md` - New Markdown tests for `sys.platform` - Adapted tests for `EllipsisType`, `Never`, etc	2024-12-21 11:33:10 +01:00
Micha Reiser	c3b6139f39	Upgrade salsa (#15039 ) The only code change is that Salsa now requires the `Db` to implement `Clone` to create "lightweight" snapshots.	2024-12-17 15:50:33 +00:00
Micha Reiser	f52b1f4a4d	Add tracing support to mdtest (#14935 ) ## Summary This PR extends the mdtest configuration with a `log` setting that can be any of: * `true`: Enables tracing * `false`: Disables tracing (default) * String: An ENV_FILTER similar to `RED_KNOT_LOG` ```toml log = true ``` Closes https://github.com/astral-sh/ruff/issues/13865 ## Test Plan I changed a test and tried `log=true`, `log=false`, and `log=INFO`	2024-12-13 09:10:01 +00:00
Micha Reiser	c1837e4189	Rename `custom-typeshed-dir`, `target-version` and `current-directory` CLI options (#14930 ) ## Summary This PR renames the `--custom-typeshed-dir`, `target-version`, and `--current-directory` cli options to `--typeshed`, `--python-version`, and `--project` as discussed in the CLI proposal document. I added aliases for `--target-version` (for Ruff compat) and `--custom-typeshed-dir` (for Alex) ## Test Plan Long help ``` An extremely fast Python type checker. Usage: red_knot [OPTIONS] [COMMAND] Commands: server Start the language server help Print this message or the help of the given subcommand(s) Options: --project <PROJECT> Run the command within the given project directory. All `pyproject.toml` files will be discovered by walking up the directory tree from the project root, as will the project's virtual environment (`.venv`). Other command-line arguments (such as relative paths) will be resolved relative to the current working directory."#, --venv-path <PATH> Path to the virtual environment the project uses. If provided, red-knot will use the `site-packages` directory of this virtual environment to resolve type information for the project's third-party dependencies. --typeshed-path <PATH> Custom directory to use for stdlib typeshed stubs --extra-search-path <PATH> Additional path to use as a module-resolution source (can be passed multiple times) --python-version <VERSION> Python version to assume when resolving types [possible values: 3.7, 3.8, 3.9, 3.10, 3.11, 3.12, 3.13] -v, --verbose... Use verbose output (or `-vv` and `-vvv` for more verbose output) -W, --watch Run in watch mode by re-running whenever files change -h, --help Print help (see a summary with '-h') -V, --version Print version ``` Short help ``` An extremely fast Python type checker. 
Usage: red_knot [OPTIONS] [COMMAND] Commands: server Start the language server help Print this message or the help of the given subcommand(s) Options: --project <PROJECT> Run the command within the given project directory --venv-path <PATH> Path to the virtual environment the project uses --typeshed-path <PATH> Custom directory to use for stdlib typeshed stubs --extra-search-path <PATH> Additional path to use as a module-resolution source (can be passed multiple times) --python-version <VERSION> Python version to assume when resolving types [possible values: 3.7, 3.8, 3.9, 3.10, 3.11, 3.12, 3.13] -v, --verbose... Use verbose output (or `-vv` and `-vvv` for more verbose output) -W, --watch Run in watch mode by re-running whenever files change -h, --help Print help (see more with '--help') -V, --version Print version ``` --------- Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-12-13 08:21:52 +00:00
Micha Reiser	881375a8d9	[red-knot] Lint registry and rule selection (#14874 ) ## Summary This is the third and last PR in this stack that adds support for toggling lints at a per-rule level. This PR introduces a new `LintRegistry`, a central index of known lints. The registry is required because we want to support lint rules from many different crates but need a way to look them up by name, e.g., when resolving a lint from a name in the configuration or analyzing a suppression comment. Adding a lint now requires two steps: 1. Declare the lint with `declare_lint` 2. Register the lint in the registry inside the `register_lints` function. I considered some more involved macros to avoid changes in two places. Still, I ultimately decided against it because a) it's just two places and b) I'd expect that registering a type checker lint will differ from registering a lint that runs as a rule in the linter. I worry that any more opinionated design could limit our options when working on the linter, so I kept it simple. The second part of this PR is the `RuleSelection`. It stores which lints are enabled and what severity they should use for created diagnostics. For now, the `RuleSelection` always gets initialized with all known lints and it uses their default level. ## Linter crates Each crate that defines lints should export a `register_lints` function that accepts a `&mut LintRegistryBuilder` to register all its known lints in the registry. This should make registering all known lints in a top-level crate easy: Just call `register_lints` of every crate that defines lint rules. I considered defining a `LintCollection` trait and even some fancy macros to accomplish the same but decided to go for this very simplistic approach for now. We can add more abstraction once needed. ## Lint rules This is a bit hand-wavy. 
I don't have a good sense of what our linter infrastructure will look like, but I expect we'll need a way to register the rules that should run as part of the red knot linter. One way is to keep doing what Ruff does by having one massive `checker` and each lint rule adds a call to itself in the relevant AST visitor methods. An alternative is that we have a `LintRule` trait that provides common hooks and implementations will be called at the "right time". Such a design would need a way to register all known lint implementations, possibly with the lint. This is where we'd probably want a dedicated `register_rule` method. A third option is that lint rules are handled separately from the `LintRegistry` and are specific to the linter crate. The current design should be flexible enough to support the three options. ## Documentation generation The documentation for all known lints can be generated by creating a factory, registering all lints by calling the `register_lints` methods, and then querying the registry for the metadata. ## Deserialization and Schema generation I haven't fully decided what the best approach is when it comes to deserializing lint rule names: * Reject invalid names in the deserializer. This gives us error messages with line and column numbers (by serde) * Don't validate lint rule names during deserialization; defer the validation until the configuration is resolved. This gives us more control over handling the error, e.g. emit a warning diagnostic instead of aborting when a rule isn't known. One technical challenge for both deserialization and schema generation is that the `Deserialize` and `JSONSchema` traits do not allow passing the `LintRegistry`, which is required to look up the lints by name. I suggest that we either rely on the salsa db being set for the current thread (`salsa::Attach`) or build our own thread-local storage for the `LintRegistry`. 
It's the caller's responsibility to make the lint registry available before calling `Deserialize` or `JSONSchema`. ## CLI support I prefer deferring adding support for enabling and disabling lints from the CLI for now because I think it will be easier to add once I've figured out how to handle configurations. ## Bitset optimization Ruff tracks the enabled rules using a cheap copyable `Bitset` instead of a hash map. This helped improve performance by a few percent (see https://github.com/astral-sh/ruff/pull/3606). However, this approach is no longer possible because lints have no "cheap" way to compute their index inside the registry (other than using a hash map). We could consider doing something similar to Salsa where each `LintMetadata` stores a `LazyLintIndex`. ``` pub struct LazyLintIndex { cached: OnceLock<(Nonce, LintIndex)> } impl LazyLintIndex { pub fn get(registry: &LintRegistry, lint: &'static LintMetadata) { let (nonce, index) = self.cached.get_or_init(\|\| registry.lint_index(lint)); if registry.nonce() == nonce { index } else { registry.lint_index(lint) } } ``` Each registry keeps a map from `LintId` to `LintIndex` where `LintIndex` is in the range of `0...registry.len()`. The `LazyLintIndex` is based on the assumption that every program has exactly one registry. This assumption allows to cache the `LintIndex` directly on the `LintMetadata`. The implementation falls back to the "slow" path if there is more than one registry at runtime. I was very close to implementing this optimization because it's kind of fun to implement. I ultimately decided against it because it adds complexity and I don't think it's worth doing in Red Knot today: * Red Knot only queries the rule selection when deciding whether or not to emit a diagnostic. It is rarely used to detect if a certain code block should run. This is different from Ruff where the rule selection is queried many times for every single AST node to determine which rules should run. 
* I'm not sure if a 2-3% performance improvement is worth the complexity I suggest revisiting this decision when working on the linter where a fast path for deciding if a rule is enabled might be more important (but that depends on how lint rules are implemented) ## Test Plan I removed a lint from the default rule registry, and the MD tests started failing because the diagnostics were no longer emitted.	2024-12-11 13:25:19 +01:00
Micha Reiser	5f548072d9	[red-knot] Typed diagnostic id (#14869 ) ## Summary This PR introduces a structured `DiagnosticId` instead of using a plain `&'static str`. It is the first of three in a stack that implements a basic rules infrastructure for Red Knot. `DiagnosticId` is an enum over all known diagnostic codes. A closed enum reduces the risk of accidentally introducing two identical diagnostic codes. It also opens the possibility of generating reference documentation from the enum in the future (not part of this PR). The enum isn't fully closed because it uses a `&'static str` for lint names. This is because we want the flexibility to define lints in different crates, and all names are only known in `red_knot_linter` or above. Still, lower-level crates must already reference the lint names to emit diagnostics. We could define all lint-names in `DiagnosticId` but I decided against it because: * We probably want to share the `DiagnosticId` type between Ruff and Red Knot to avoid extra complexity in the diagnostic crate, and both tools use different lint names. * Lints require a lot of extra metadata beyond just the name. That's why I think defining them close to their implementation is important. In the long term, we may also want to support plugins, which would make it impossible to know all lint names at compile time. The next PR in the stack introduces extra syntax for defining lints. A closed enum does have a few disadvantages: * rustc can't help us detect unused diagnostic codes because the enum is public * Adding a new diagnostic in the workspace crate now requires changes to at least two crates: It requires changing the workspace crate to add the diagnostic and the `ruff_db` crate to define the diagnostic ID. I consider this an acceptable trade. We may want to move `DiagnosticId` to its own crate or into a shared `red_knot_diagnostic` crate. 
## Preventing duplicate diagnostic identifiers One goal of this PR is to make it harder to introduce ambiguous diagnostic IDs, which is achieved by defining a closed enum. However, the enum isn't fully "closed" because it doesn't explicitly list the IDs for all lint rules. That leaves the possibility that a lint rule and a diagnostic ID share the same name. I made the names unambiguous in this PR by separating them into different namespaces by using `lint/<rule>` for lint rule codes. I don't mind the `lint` prefix in a Ruff next context, but it is a bit weird for a standalone type checker. I'd like to not overfocus on this for now because I see a few different options: * We remove the `lint` prefix and add a unit test in a top-level crate that iterates over all known lint rules and diagnostic IDs to ensure the names are non-overlapping. * We only render `[lint]` as the error code and add a note to the diagnostic mentioning the lint rule. This is similar to clippy and has the advantage that the header line remains short (`lint/some-long-rule-name` is very long ;)) * Any other form of adjusting the diagnostic rendering to make the distinction clear I think we can defer this decision for now because the `DiagnosticId` contains all the relevant information to change the rendering accordingly. ## Why `Lint` and not `LintRule` I see three kinds of diagnostics in Red Knot: * Non-suppressible: Reveal type, IO errors, configuration errors, etc. (any `DiagnosticId`) * Lints: code-related diagnostics that are suppressible. * Lint rules: The same as lints, but they can be enabled or disabled in the configuration. This covers the majority of lints in Red Knot and the Ruff linter. Our current implementation doesn't distinguish between lints and lint rules because we aren't aware of a suppressible code-related lint that can't be toggled in the configuration. The only lint that comes to my mind is maybe `division-by-zero` if we're 99.99% sure that it is always right. 
However, I want to keep the door open to making this distinction in the future if it proves useful. Another reason why I chose lint over lint rule (or just rule) is that I want to leave room for a future lint rule and lint phase concept: * lint is the what: a specific code smell, pattern, or violation * the lint rule is the how: I could see a future `LintRule` trait in `red_knot_python_linter` that provides the necessary hooks to run as part of the linter. A lint rule produces diagnostics for exactly one lint. A lint rule differs from all lints in `red_knot_python_semantic` because they don't run as "rules" in the Ruff sense. Instead, they're a side-product of type inference. * the lint phase is a different form of how: A lint phase can produce many different lints in a single pass. This is a somewhat common pattern in Ruff where running one analysis collects the necessary information for finding many different lints * diagnostic is the presentation: Unlike a lint, the diagnostic isn't the what, but how a specific lint gets presented. I expect that many lints can use one generic `LintDiagnostic`, but a few lints might need more flexibility and implement their custom diagnostic rendering (at least custom `Diagnostic` implementation). ## Test Plan `cargo test`	2024-12-10 15:58:07 +00:00
Dimitri Papadopoulos Orfanos	59145098d6	Fix typos found by codespell (#14863 ) ## Summary Just fix typos. ## Test Plan CI tests. --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2024-12-09 09:32:12 +00:00
Micha Reiser	c2e17d0399	Possible fix for flaky file watching test (#14543 )	2024-12-03 08:22:42 +01:00
David Peter	5137fcc9c8	[red-knot] Re-enable linter corpus tests (#14736 ) ## Summary Seeing the fuzzing results from @dhruvmanila in #13778, I think we can re-enable these tests. We also had one regression that would have been caught by these tests, so there is some value in having them enabled.	2024-12-02 20:11:30 +01:00
Micha Reiser	30d80d9746	Sort discovered workspace packages for consistent cross-platform package discovery (#14725 )	2024-12-02 07:36:08 +00:00
Micha Reiser	b63c2e126b	Upgrade Rust toolchain to 1.83 (#14677 )	2024-11-29 12:05:05 +00:00
David Peter	b94d6cf567	[red-knot] Fix panic related to f-strings in annotations (#14613 ) ## Summary Fix panics related to expressions without inferred types in invalid syntax examples like: ```py x: f"Literal[{1 + 2}]" = 3 ``` where the `1 + 2` expression (and its sub-expressions) inside the annotation did not have an inferred type. ## Test Plan Added new corpus test.	2024-11-26 16:35:44 +01:00
David Peter	cd0c97211c	[red-knot] Update KNOWN_FAILURES (#14612 ) ## Summary Remove entry that was previously fixed in `5a30ec0df6`. ## Test Plan ```sh cargo test -p red_knot_workspace -- --ignored linter_af linter_gz ```	2024-11-26 15:56:42 +01:00
David Peter	f6b2cd5588	[red-knot] Semantic index: handle invalid `break`s (#14522 ) ## Summary This fix addresses panics related to invalid syntax like the following where a `break` statement is used in a nested definition inside a loop: ```py while True: def b(): x: int break ``` closes #14342 ## Test Plan * New corpus regression tests. * New unit test to make sure we handle nested while loops correctly. This test is passing on `main`, but can easily fail if the `is_inside_loop` state isn't properly saved/restored.	2024-11-22 13:13:55 +01:00
David Peter	a90e404c3f	[red-knot] PEP 695 type aliases (#14357 ) ## Summary Add support for (non-generic) type aliases. The main motivation behind this was to get rid of panics involving expressions in (generic) type aliases. But it turned out the best way to fix it was to implement (partial) support for type aliases. ```py type IntOrStr = int \| str reveal_type(IntOrStr) # revealed: typing.TypeAliasType reveal_type(IntOrStr.__name__) # revealed: Literal["IntOrStr"] x: IntOrStr = 1 reveal_type(x) # revealed: Literal[1] def f() -> None: reveal_type(x) # revealed: int \| str ``` ## Test Plan - Updated corpus test allow list to reflect that we don't panic anymore. - Added Markdown-based test for type aliases (`type_alias.md`)	2024-11-22 08:47:14 +01:00
David Peter	f684b6fff4	[red-knot] Fix: Infer type for typing.Union[..] tuple expression (#14510 ) ## Summary Fixes a panic related to sub-expressions of `typing.Union` where we fail to store a type for the `int, str` tuple-expression in code like this: ``` x: Union[int, str] = 1 ``` relates to [my comment](https://github.com/astral-sh/ruff/pull/14499#discussion_r1851794467) on #14499. ## Test Plan New corpus test	2024-11-21 11:49:20 +01:00
David Peter	f8c20258ae	[red-knot] Do not panic on f-string format spec expressions (#14436 ) ## Summary Previously, we panicked on expressions like `f"{v:{f'0.2f'}}"` because we did not infer types for expressions nested inside format spec elements. ## Test Plan ``` cargo nextest run -p red_knot_workspace -- --ignored linter_af linter_gz ```	2024-11-19 10:04:51 +01:00
David Peter	d81b6cd334	[red-knot] Types for subexpressions of annotations (#14426 ) ## Summary This patches up various missing paths where sub-expressions of type annotations previously had no type attached. Examples include: ```py tuple[int, str] # ~~~~~~~~ type[MyClass] # ~~~~~~~ Literal["foo"] # ~~~~~ Literal["foo", Literal[1, 2]] # ~~~~~~~~~~~~~ Literal[1, "a", random.illegal(sub[expr + ession])] # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` ## Test Plan ``` cargo nextest run -p red_knot_workspace -- --ignored linter_af linter_gz ```	2024-11-18 13:03:27 +01:00
Micha Reiser	d99210c049	[red-knot] Default to python 3.9 (#14429 )	2024-11-18 11:27:40 +00:00
David Peter	d470f29093	[red-knot] Disable linter-corpus tests (#14391 ) ## Summary Disable the no-panic tests for the linter corpus, as there are too many problems right now, requiring linter-contributors to add their test files to the allow-list. We can still run the tests using `cargo test -p red_knot_workspace -- --ignored linter_af linter_gz`. This is also why I left the `crates/ruff_linter/` entries in the allow list for now, even if they will get out of sync. But let me know if I should rather remove them.	2024-11-16 23:33:19 +01:00
Simon Brugman	1fbed6c325	[`ruff`] Implement `redundant-bool-literal` (`RUF038`) (#14319 ) ## Summary Implements `redundant-bool-literal` ## Test Plan <!-- How was it tested? --> `cargo test` The ecosystem results are all correct, but for `Airflow` the rule is not relevant due to the use of overloading (and is marked as unsafe correctly). --------- Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>	2024-11-16 21:52:51 +00:00
David Peter	4dcb7ddafe	[red-knot] Remove duplicates from KNOWN_FAILURES (#14386 ) ## Summary - Sort the list of `KNOWN_FAILURES` - Remove accidental duplicates	2024-11-16 20:54:21 +01:00
Micha Reiser	5be90c3a67	Split the corpus tests into smaller tests (#14367 ) ## Summary This PR splits the corpus tests into smaller chunks because running all of them takes 8s on my Windows machine and it's by far the longest test in `red_knot_workspace`. Splitting the tests has the advantage that they run in parallel. This PR brings down the wall time from 8s to 4s. This PR also limits the glob for the linter tests because it's common to clone cpython into the `ruff_linter/resources/test` folder for benchmarks (because that's what's written in the contributing guides) ## Test Plan `cargo test`	2024-11-16 20:29:21 +01:00
Simon Brugman	78210b198b	[`flake8-pyi`] Implement `redundant-none-literal` (`PYI061`) (#14316 ) ## Summary `Literal[None]` can be simplified into `None` in type annotations. Surprising to see that this is not that rare: - https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/chat_models/base.py#L54 - https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/sql/annotation.py#L69 - https://github.com/jax-ml/jax/blob/main/jax/numpy/__init__.pyi#L961 - https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/inference/_common.py#L179 ## Test Plan `cargo test` Reviewed all ecosystem results, and they are true positives. --------- Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com> Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>	2024-11-16 18:22:51 +00:00
Micha Reiser	a6a3d3f656	Fix file watcher panic when event has no paths (#14364 )	2024-11-16 08:36:57 +01:00
Micha Reiser	81e5830585	Workspace discovery (#14308 )	2024-11-15 19:20:15 +01:00
David Peter	9f3235a37f	[red-knot] Expand test corpus (#14360 ) ## Summary - Add 383 files from `crates/ruff_python_parser/resources` to the test corpus - Add 1296 files from `crates/ruff_linter/resources` to the test corpus - Use in-memory file system for tests - Improve test isolation by cleaning the test environment between checks - Add a mechanism for "known failures". Mark ~80 files as known failures. - The corpus test is now a lot slower (6 seconds). Note: While `red_knot` as a command line tool can run over all of these files without panicking, we still have a lot of test failures caused by explicitly "pulling" all types. ## Test Plan Run `cargo test -p red_knot_workspace` while making sure that - Introducing code that is known to lead to a panic fails the test - Removing code that is known to lead to a panic from `KNOWN_FAILURES`-files also fails the test	2024-11-15 17:09:15 +01:00
David Peter	77e8da7497	[red-knot] Avoid panics for ipython magic commands (#14326 ) ## Summary Avoids panics when encountering Jupyter notebooks with [IPython magic commands](https://ipython.readthedocs.io/en/stable/interactive/magics.html). ## Test Plan Added Jupyter notebook to corpus.	2024-11-13 20:58:08 +01:00
David Peter	5e64863895	[red-knot] Handle invalid assignment targets (#14325 ) ## Summary This fixes several panics related to invalid assignment targets. All of these previously led to a crash: ```py (x.y := 1) # only name-expressions are valid targets of named expressions ([x, y] := [1, 2]) # same (x, y): tuple[int, int] = (2, 3) # tuples are not valid targets for annotated assignments (x, y) += 2 # tuples are not valid targets for augmented assignments ``` closes #14321 closes #14322 ## Test Plan I symlinked four files from `crates/ruff_python_parser/resources` into the red knot corpus, as they seemed like ideal test files for this exact scenario. I think eventually, it might be a good idea to simply include all invalid-syntax examples from the parser tests in red knot's corpus (I believe we're actually not too far from that goal). Or expand the scope of the corpus test to this directory. Then we can get rid of these symlinks again.	2024-11-13 20:50:39 +01:00
David Peter	0eb36e4345	[red-knot] Avoid panic for generic type aliases (#14312 ) ## Summary This avoids a panic inside `TypeInferenceBuilder::infer_type_parameters` when encountering generic type aliases: ```py type ListOrSet[T] = list[T] \| set[T] ``` To fix this properly, we would have to treat type aliases as being their own annotation scope [1]. The left hand side is a definition for the type parameter `T` which is being used in the special annotation scope on the right hand side. Similar to how it works for generic functions and classes. [1] https://docs.python.org/3/reference/compound_stmts.html#generic-type-aliases closes #14307 ## Test Plan Added new example to the corpus.	2024-11-13 16:01:15 +01:00
David Peter	b946cfd1f7	[red-knot] Use memory address as AST node key (#14317 ) ## Summary Use the memory address to uniquely identify AST nodes, instead of relying on source range and kind. The latter fails for ASTs resulting from invalid syntax examples. See #14313 for details. Also results in a 1-2% speedup (`67349cf55f`) closes #14313 ## Review Here are the places where we use `NodeKey` directly or indirectly (via `ExpressionNodeKey` or `DefinitionNodeKey`): ```rs // semantic_index.rs pub(crate) struct SemanticIndex<'db> { // [...] /// Map expressions to their corresponding scope. scopes_by_expression: FxHashMap<ExpressionNodeKey, FileScopeId>, /// Map from a node creating a definition to its definition. definitions_by_node: FxHashMap<DefinitionNodeKey, Definition<'db>>, /// Map from a standalone expression to its [`Expression`] ingredient. expressions_by_node: FxHashMap<ExpressionNodeKey, Expression<'db>>, // [...] } // semantic_index/builder.rs pub(super) struct SemanticIndexBuilder<'db> { // [...] scopes_by_expression: FxHashMap<ExpressionNodeKey, FileScopeId>, definitions_by_node: FxHashMap<ExpressionNodeKey, Definition<'db>>, expressions_by_node: FxHashMap<ExpressionNodeKey, Expression<'db>>, } // semantic_index/ast_ids.rs pub(crate) struct AstIds { /// Maps expressions to their expression id. expressions_map: FxHashMap<ExpressionNodeKey, ScopedExpressionId>, /// Maps expressions which "use" a symbol (that is, [`ast::ExprName`]) to a use id. uses_map: FxHashMap<ExpressionNodeKey, ScopedUseId>, } pub(super) struct AstIdsBuilder { expressions_map: FxHashMap<ExpressionNodeKey, ScopedExpressionId>, uses_map: FxHashMap<ExpressionNodeKey, ScopedUseId>, } ``` ## Test Plan Added two failing examples to the corpus.	2024-11-13 14:35:54 +01:00
David Peter	3e36a7ab81	[red-knot] Fix assertion for invalid match pattern (#14306 ) ## Summary Fixes a failing debug assertion that triggers for the following code: ```py match some_int: case x:=2: pass ``` closes #14305 ## Test Plan Added problematic code example to corpus.	2024-11-13 10:07:29 +01:00
Carl Meyer	645ce7e5ec	[red-knot] infer types for PEP695 typevars (#14182 ) ## Summary Create definitions and infer types for PEP 695 type variables. This just gives us the type of the type variable itself (the type of `T` as a runtime object in the body of `def f[T](): ...`), with special handling for its attributes `__name__`, `__bound__`, `__constraints__`, and `__default__`. Mostly the support for these attributes exists because it is easy to implement and allows testing that we are internally representing the typevar correctly. This PR doesn't yet have support for interpreting a typevar as a type annotation, which is of course the primary use of a typevar. But the information we store in the typevar's type in this PR gives us everything we need to handle it correctly in a future PR when the typevar appears in an annotation. ## Test Plan Added mdtest.	2024-11-08 21:23:05 +00:00
Micha Reiser	59c0dacea0	Introduce `Diagnostic` trait (#14130 )	2024-11-07 13:26:21 +01:00
David Peter	239cbc6f33	[red-knot] Store starred-expression annotation types (#14106 ) ## Summary - Store the expression type for annotations that are starred expressions (see [discussion here](https://github.com/astral-sh/ruff/pull/14091#discussion_r1828332857)) - Use `self.store_expression_type(…)` consistently throughout, as it makes sure that no double-insertion errors occur. closes #14115 ## Test Plan Added an invalid-syntax example to the corpus which leads to a panic on `main`. Also added a Markdown test with a valid-syntax example that would lead to a panic once we implement function parameter inference. --------- Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-11-05 20:25:45 +01:00
Micha Reiser	6aaf1d9446	[red-knot] Remove lint-phase (#13922 ) Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-10-25 18:40:52 +00:00
David Peter	f335fe4d4a	[red-knot] rename {Class,Module,Function} => {Class,Module,Function}Literal (#13873 ) ## Summary * Rename `Type::Class` => `Type::ClassLiteral` * Rename `Type::Function` => `Type::FunctionLiteral` * Do not rename `Type::Module` * Remove `Literal` suffixes in `display::LiteralTypeKind` variants, as per clippy suggestion * Get rid of `Type::is_class()` in favor of `is_subtype_of(…, 'type')`; modify `is_subtype_of` to support this. * Add new `Type::is_xyz()` methods and use them instead of matching on `Type` variants. closes #13863 ## Test Plan New `is_subtype_of_class_literals` unit test. --------- Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-10-22 22:10:53 +02:00
Dhruv Manilawala	b16f665a81	[red-knot] Infer target types for unpacked tuple assignment (#13316 ) ## Summary This PR adds support for unpacking a tuple expression in an assignment statement where the target expression can be a tuple or a list (the allowed sequence targets). The implementation introduces a new `infer_assignment_target` which can then be used for other targets like the ones in for loops as well. This delegates it to the `infer_definition`. The final implementation uses a recursive function that visits the target expression in source order and compares the variable node that corresponds to the definition. At the same time, it keeps track of where it is on the assignment value type. The logic also accounts for the number of elements on both sides such that it matches even if there's a gap in between. For example, if there's a starred expression like `(a, b, c) = (1, 2, 3)`, then the type of `a` will be `Literal[1]` and the type of `b` will be `Literal[2]`. There are a couple of follow-ups that can be done: * Use this logic for other target positions like `for` loop * Add diagnostics for mismatched lengths between LHS and RHS ## Test Plan Add various test cases using the new markdown test framework. Validate that existing test cases pass. --------- Co-authored-by: Carl Meyer <carl@astral.sh>	2024-10-15 19:07:11 +00:00
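The commit message above mentions matching targets to values "even if there's a gap in between" when a starred target is present. As a point of reference, here is a minimal sketch of Python's own runtime unpacking semantics, which the inference has to model. The helper name `unpack_with_star` is hypothetical and is not part of red-knot; the sketch shows runtime values only, not the types red-knot infers.

```python
def unpack_with_star(values):
    """Unpack a sequence into (first, middle, last) using a starred target."""
    # The starred target absorbs whatever lies between the first and last
    # elements, so the number of values may exceed the number of targets.
    first, *middle, last = values
    return first, middle, last


# The starred target takes one element, several, or none at all.
print(unpack_with_star((1, 2, 3)))
print(unpack_with_star((1, 2, 3, 4)))
print(unpack_with_star((1, 2)))
```

At runtime the starred target is always bound to a `list`, even when it captures nothing.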
Micha Reiser	5f65e842e8	Upgrade salsa (#13757 )	2024-10-15 11:06:32 +00:00
Charlie Marsh	c3b40da0d2	Use backticks for code in red-knot messages (#13599 ) ## Summary ...and remove periods from messages that don't span more than a single sentence. This is more consistent with how we present user-facing messages in uv (which has a defined style guide).	2024-10-02 03:14:28 +00:00
Alex Waygood	82324678cf	Rename the `ruff_vendored` crate to `red_knot_vendored` (#13586 )	2024-10-01 16:16:59 +01:00
Micha Reiser	653c09001a	Use an empty vendored file system in Ruff (#13436 ) ## Summary This PR removes the typeshed stubs from the vendored file system shipped with ruff and instead ships an empty "typeshed". Making the typeshed files optional required extracting the typeshed files into a new `ruff_vendored` crate. I do like this even if all our builds always include typeshed because it means `red_knot_python_semantic` contains less code that needs compiling. This also allows us to use deflate because the compression algorithm doesn't matter for an archive containing a single, empty file. ## Test Plan `cargo test` I verified with `cargo tree -f "{p} {f}" -p <package>` that: * red_knot_wasm: enables `deflate` compression * red_knot: enables `zstd` compression * `ruff`: uses stored I'm not quite sure how to build the binary that maturin builds but comparing the release artifact size with `strip = true` shows a `1.5MB` size reduction --------- Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>	2024-09-21 16:31:42 +00:00
Carl Meyer	149fb2090e	[red-knot] more efficient UnionBuilder::add (#13411 ) Avoid quadratic time in subsumed elements when adding a super-type of existing union elements. Reserve space in advance when adding multiple elements (from another union) to a union. Make union elements a `Box<[Type]>` instead of an `FxOrderSet`; the set doesn't buy much since the rules of union uniqueness are defined in terms of supertype/subtype, not in terms of simple type identity. Move sealed-boolean handling out of a separate `UnionBuilder::simplify` method and into `UnionBuilder::add`; now that `add` is iterating existing elements anyway, this is more efficient. Remove `UnionType::contains`, since it's now `O(n)` and we shouldn't really need it, generally we care about subtype/supertype, not type identity. (Right now it's used for `Type::Unbound`, which shouldn't even be a type.) Add support for `is_subtype_of` for the `object` type. Addresses comments on https://github.com/astral-sh/ruff/pull/13401	2024-09-20 10:49:45 -07:00
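The idea in the commit above, handling subsumed elements in a single pass when a type is added to a union, can be sketched roughly as follows. This is an illustrative Python model and not the Rust `UnionBuilder` itself: the function name `add_to_union` is hypothetical, Python classes with `issubclass` stand in for the subtype relation, and the sealed-boolean handling the commit mentions is left out.

```python
def add_to_union(elements, new_type):
    """Return the union's elements after adding new_type.

    Assumes `elements` is already simplified, i.e. no element is a
    subtype of another element.
    """
    kept = []
    for existing in elements:
        if issubclass(new_type, existing):
            # An existing element already covers new_type; nothing changes.
            return list(elements)
        if not issubclass(existing, new_type):
            # existing is not subsumed by new_type, so it stays in the union.
            kept.append(existing)
    kept.append(new_type)
    return kept


# Adding a supertype drops every element it subsumes in one pass.
print(add_to_union([int, str], object))
# Adding a subtype of an existing element leaves the union unchanged.
print(add_to_union([int, str], bool))
```

Each call walks the existing elements once, which is the property the commit relies on when it moves simplification into `add`.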
Carl Meyer	260c2ecd15	[red-knot] visit with-item vars even if not a Name (#13409 ) This fixes the last panic on checking pandas. (Match statement became an `if let` because clippy decided it wanted that once I added the additional line in the else case?) --------- Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-09-19 10:37:49 -07:00
Carl Meyer	29c36a56b2	[red-knot] fix scope inference with deferred types (#13204 ) Test coverage for #13131 wasn't as good as I thought it was, because although we infer a lot of types in stubs in typeshed, we don't check typeshed, and therefore we don't do scope-level inference and pull all types for a scope. So we didn't really have good test coverage for scope-level inference in a stub. And because of this, I got the code for supporting that wrong, meaning that if we did scope-level inference with deferred types, we'd end up never populating the deferred types in the scope's `TypeInference`, which causes panics like #13160. Here I both add test coverage by running the corpus tests both as `.py` and as `.pyi` (which reveals the panic), and I fix the code to support deferred types in scope inference. This also revealed a problem with deferred types in generic functions, which effectively span two scopes. That problem will require a bit more thought, and I don't want to block this PR on it, so for now I just don't defer annotations on generic functions. Fixes #13160.	2024-09-03 11:20:43 -07:00
Micha Reiser	599103c933	Add a few missing `#[return_ref]` attributes (#13223 )	2024-09-03 09:15:43 +02:00
Micha Reiser	ecab04e338	Basic concurrent checking (#13049 )	2024-08-24 09:53:27 +01:00

86 commits