## Summary
Fixes https://github.com/astral-sh/ruff/issues/15027
The `MemoryFileSystem::write_file` API automatically creates
non-existing ancestor directoryes
but we failed to update the status of the now created ancestor
directories in the `Files` data structure.
## Test Plan
Tested that the case in https://github.com/astral-sh/ruff/issues/15027
now passes regardless of whether the *Simple* case is commented out or
not
## Summary
This PR extends the mdtest configuration with a `log` setting that can
be any of:
* `true`: Enables tracing
* `false`: Disables tracing (default)
* String: An ENV_FILTER similar to `RED_KNOT_LOG`
```toml
log = true
```
Closes https://github.com/astral-sh/ruff/issues/13865
## Test Plan
I changed a test and tried `log=true`, `log=false`, and `log=INFO`
## Summary
Fixes a small scoping issue in `DiagnosticId::matches`
Note: I don't think we should use `lint:id` in mdtests just yet. I worry
that it could lead to many unnecessary churns if we decide **not** to
use `lint:<id>` as the format (e.g., `lint/id`).
The reason why users even see `lint:<rule>` is because the mdtest
framework uses the diagnostic infrastructure
Closes#14910
## Test Plan
Added tests
## Summary
This PR introduces a structured `DiagnosticId` instead of using a plain
`&'static str`. It is the first of three in a stack that implements a
basic rules infrastructure for Red Knot.
`DiagnosticId` is an enum over all known diagnostic codes. A closed enum
reduces the risk of accidentally introducing two identical diagnostic
codes. It also opens the possibility of generating reference
documentation from the enum in the future (not part of this PR).
The enum isn't *fully closed* because it uses a `&'static str` for lint
names. This is because we want the flexibility to define lints in
different crates, and all names are only known in `red_knot_linter` or
above. Still, lower-level crates must already reference the lint names
to emit diagnostics. We could define all lint-names in `DiagnosticId`
but I decided against it because:
* We probably want to share the `DiagnosticId` type between Ruff and Red
Knot to avoid extra complexity in the diagnostic crate, and both tools
use different lint names.
* Lints require a lot of extra metadata beyond just the name. That's why
I think defining them close to their implementation is important.
In the long term, we may also want to support plugins, which would make
it impossible to know all lint names at compile time. The next PR in the
stack introduces extra syntax for defining lints.
A closed enum does have a few disadvantages:
* rustc can't help us detect unused diagnostic codes because the enum is
public
* Adding a new diagnostic in the workspace crate now requires changes to
at least two crates: It requires changing the workspace crate to add the
diagnostic and the `ruff_db` crate to define the diagnostic ID. I
consider this an acceptable trade. We may want to move `DiagnosticId` to
its own crate or into a shared `red_knot_diagnostic` crate.
## Preventing duplicate diagnostic identifiers
One goal of this PR is to make it harder to introduce ambiguous
diagnostic IDs, which is achieved by defining a closed enum. However,
the enum isn't fully "closed" because it doesn't explicitly list the IDs
for all lint rules. That leaves the possibility that a lint rule and a
diagnostic ID share the same name.
I made the names unambiguous in this PR by separating them into
different namespaces by using `lint/<rule>` for lint rule codes. I don't
mind the `lint` prefix in a *Ruff next* context, but it is a bit weird
for a standalone type checker. I'd like to not overfocus on this for now
because I see a few different options:
* We remove the `lint` prefix and add a unit test in a top-level crate
that iterates over all known lint rules and diagnostic IDs to ensure the
names are non-overlapping.
* We only render `[lint]` as the error code and add a note to the
diagnostic mentioning the lint rule. This is similar to clippy and has
the advantage that the header line remains short
(`lint/some-long-rule-name` is very long ;))
* Any other form of adjusting the diagnostic rendering to make the
distinction clear
I think we can defer this decision for now because the `DiagnosticId`
contains all the relevant information to change the rendering
accordingly.
## Why `Lint` and not `LintRule`
I see three kinds of diagnostics in Red Knot:
* Non-suppressable: Reveal type, IO errors, configuration errors, etc.
(any `DiagnosticId`)
* Lints: code-related diagnostics that are suppressable.
* Lint rules: The same as lints, but they can be enabled or disabled in
the configuration. The majority of lints in Red Knot and the Ruff
linter.
Our current implementation doesn't distinguish between lints and Lint
rules because we aren't aware of a suppressible code-related lint that
can't be configured in the configuration. The only lint that comes to my
mind is maybe `division-by-zero` if we're 99.99% sure that it is always
right. However, I want to keep the door open to making this distinction
in the future if it proves useful.
Another reason why I chose lint over lint rule (or just rule) is that I
want to leave room for a future lint rule and lint phase concept:
* lint is the *what*: a specific code smell, pattern, or violation
* the lint rule is the *how*: I could see a future `LintRule` trait in
`red_knot_python_linter` that provides the necessary hooks to run as
part of the linter. A lint rule produces diagnostics for exactly one
lint. A lint rule differs from all lints in `red_knot_python_semantic`
because they don't run as "rules" in the Ruff sense. Instead, they're a
side-product of type inference.
* the lint phase is a different form of *how*: A lint phase can produce
many different lints in a single pass. This is a somewhat common pattern
in Ruff where running one analysis collects the necessary information
for finding many different lints
* diagnostic is the *presentation*: Unlike a lint, the diagnostic isn't
the what, but how a specific lint gets presented. I expect that many
lints can use one generic `LintDiagnostic`, but a few lints might need
more flexibility and implement their custom diagnostic rendering (at
least custom `Diagnostic` implementation).
## Test Plan
`cargo test`
## Summary
- Add 383 files from `crates/ruff_python_parser/resources` to the test
corpus
- Add 1296 files from `crates/ruff_linter/resources` to the test corpus
- Use in-memory file system for tests
- Improve test isolation by cleaning the test environment between checks
- Add a mechanism for "known failures". Mark ~80 files as known
failures.
- The corpus test is now a lot slower (6 seconds).
Note:
While `red_knot` as a command line tool can run over all of these
files without panicking, we still have a lot of test failures caused by
explicitly "pulling" all types.
## Test Plan
Run `cargo test -p red_knot_workspace` while making sure that
- Introducing code that is known to lead to a panic fails the test
- Removing code that is known to lead to a panic from
`KNOWN_FAILURES`-files also fails the test
## Summary
...and remove periods from messages that don't span more than a single
sentence.
This is more consistent with how we present user-facing messages in uv
(which has a defined style guide).
## Summary
This PR changes removes the typeshed stubs from the vendored file system
shipped with ruff
and instead ships an empty "typeshed".
Making the typeshed files optional required extracting the typshed files
into a new `ruff_vendored` crate. I do like this even if all our builds
always include typeshed because it means `red_knot_python_semantic`
contains less code that needs compiling.
This also allows us to use deflate because the compression algorithm
doesn't matter for an archive containing a single, empty file.
## Test Plan
`cargo test`
I verified with ` cargo tree -f "{p} {f}" -p <package> ` that:
* red_knot_wasm: enables `deflate` compression
* red_knot: enables `zstd` compression
* `ruff`: uses stored
I'm not quiet sure how to build the binary that maturin builds but
comparing the release artifact size with `strip = true` shows a `1.5MB`
size reduction
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
This PR adds an experimental Ruff subcommand to generate dependency
graphs based on module resolution.
A few highlights:
- You can generate either dependency or dependent graphs via the
`--direction` command-line argument.
- Like Pants, we also provide an option to identify imports from string
literals (`--detect-string-imports`).
- Users can also provide additional dependency data via the
`include-dependencies` key under `[tool.ruff.import-map]`. This map uses
file paths as keys, and lists of strings as values. Those strings can be
file paths or globs.
The dependency resolution uses the red-knot module resolver which is
intended to be fully spec compliant, so it's also a chance to expose the
module resolver in a real-world setting.
The CLI is, e.g., `ruff graph build ../autobot`, which will output a
JSON map from file to files it depends on for the `autobot` project.
Use declared types in inference and checking. This means several things:
* Imports prefer declarations over inference, when declarations are
available.
* When we encounter a binding, we check that the bound value's inferred
type is assignable to the live declarations of the bound symbol, if any.
* When we encounter a declaration, we check that the declared type is
assignable from the inferred type of the symbol from previous bindings,
if any.
* When we encounter a binding+declaration, we check that the inferred
type of the bound value is assignable to the declared type.
Prototype deferred evaluation of type expressions by deferring
evaluation of class bases in a stub file. This allows self-referential
class definitions, as occur with the definition of `str` in typeshed
(which inherits `Sequence[str]`).
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This PR simplifies the virtual file support in the red knot core,
specifically:
* Update `File::add_virtual_file` method to `File::virtual_file` which
will always create a new virtual file and override the existing entry in
the lookup table
* Add `VirtualFile` which is a wrapper around `File` and provides
methods to increment the file revision / close the virtual file
* Add a new `File::try_virtual_file` to lookup the `VirtualFile` from
`Files`
* Add `File::sync_virtual_path` which takes in the `SystemVirtualPath`,
looks up the `VirtualFile` for it and calls the `sync` method to
increment the file revision
* Removes the `virtual_path_metadata` method on `System` trait
## Test Plan
- [x] Make sure the existing red knot tests pass
- [x] Updated code works well with the LSP
## Summary
This PR adds support for untitled files in the Red Knot project.
Refer to the [design
discussion](https://github.com/astral-sh/ruff/discussions/12336) for
more details.
### Changes
* The `parsed_module` always assumes that the `SystemVirtual` path is of
`PySourceType::Python`.
* For the module resolver, as suggested, I went ahead by adding a new
`SystemOrVendoredPath` enum and renamed `FilePathRef` to
`SystemOrVendoredPathRef` (happy to consider better names here).
* The `file_to_module` query would return if it's a
`FilePath::SystemVirtual` variant because a virtual file doesn't belong
to any module.
* The sync implementation for the system virtual path is basically the
same as that of system path except that it uses the
`virtual_path_metadata`. The reason for this is that the system
(language server) would provide the metadata on whether it still exists
or not and if it exists, the corresponding metadata.
For point (1), VS Code would use `Untitled-1` for Python files and
`Untitled-1.ipynb` for Jupyter Notebooks. We could use this distinction
to determine whether the source type is `Python` or `Ipynb`.
## Test Plan
Added test cases in #12526
Add a lint rule to detect if a name is definitely or possibly undefined
at a given usage.
If I create the file `undef/main.py` with contents:
```python
x = int
def foo():
z
return x
if flag:
y = x
y
```
And then run `cargo run --bin red_knot -- --current-directory
../ruff-examples/undef`, I get the output:
```
Name 'z' used when not defined.
Name 'flag' used when not defined.
Name 'y' used when possibly not defined.
```
If I modify the file to add `y = 0` at the top, red-knot re-checks it
and I get the new output:
```
Name 'z' used when not defined.
Name 'flag' used when not defined.
```
Note that `int` is not flagged, since it's a builtin, and `return x` in
the function scope is not flagged, since it refers to the global `x`.