This change makes it so we aren't doing a directory traversal every time
we ask for completions from a module. Specifically, submodules that
aren't attributes of their parent module can only be discovered by
looking at the directory tree. But we want to avoid doing a directory
scan unless we think there are changes.
To make this work, this change does a little bit of surgery to
`FileRoot`. Previously, a `FileRoot` was only used for library search
paths. Its revision was bumped whenever a file in that tree was added,
deleted or even modified (to support the discovery of `pth` files and
changes to its contents). This generally seems fine since these are
presumably dependency paths that shouldn't change frequently.
In this change, we add a `FileRoot` for the project. But having the
`FileRoot`'s revision bumped for every change in the project makes
caching based on that `FileRoot` rather ineffective. That is, cache
invalidation will occur too aggressively. To the point that there is
little point in adding caching in the first place. To mitigate this, a
`FileRoot`'s revision is only bumped on a change to a child file's
contents when the `FileRoot` is a `LibrarySearchPath`. Otherwise, we
only bump the revision when a file is created or added.
The effect is that, at least in VS Code, when a new module is added or
removed, this change is picked up and the cache is properly invalidated.
Other LSP clients with worse support for file watching (which seems to
be the case for the CoC vim plugin that I use) don't work as well. Here,
the cache is less likely to be invalidated which might cause completions
to have stale results. Unless there's an obvious way to fix or improve
this, I propose punting on improvements here for now.
## Summary
This PR updates the server to keep track of open files both system and
virtual files.
This is done by updating the project by adding the file in the open file
set in `didOpen` notification and removing it in `didClose`
notification.
This does mean that for workspace diagnostics, ty will only check open
files because the behavior of different diagnostic builder is to first
check `is_file_open` and only add diagnostics for open files. So, this
required updating the `is_file_open` model to be `should_check_file`
model which validates whether the file needs to be checked based on the
`CheckMode`. If the check mode is open files only then it will check
whether the file is open. If it's all files then it'll return `true` by
default.
Closes: astral-sh/ty#619
## Test Plan
### Before
There are two files in the project: `__init__.py` and `diagnostics.py`.
In the video, I'm demonstrating the old behavior where making changes to
the (open) `diagnostics.py` file results in re-parsing the file:
https://github.com/user-attachments/assets/c2ac0ecd-9c77-42af-a924-c3744b146045
### After
Same setup as above.
In the video, I'm demonstrating the new behavior where making changes to
the (open) `diagnostics.py` file doesn't result in re-parting the file:
https://github.com/user-attachments/assets/7b82fe92-f330-44c7-b527-c841c4545f8f
## Summary
This PR makes the necessary changes to the server that it can request
configurations from the client using the `configuration` request.
This PR doesn't make use of the request yet. It only sets up the
foundation (mainly the coordination between client and server)
so that future PRs could pull specific settings.
I plan to use this for pulling the Python environment from the Python
extension.
Deno does something very similar to this.
## Test Plan
Tested that diagnostics are still shown.
## Summary
Setting `TY_MEMORY_REPORT=full` will generate and print a memory usage
report to the CLI after a `ty check` run:
```
=======SALSA STRUCTS=======
`Definition` metadata=7.24MB fields=17.38MB count=181062
`Expression` metadata=4.45MB fields=5.94MB count=92804
`member_lookup_with_policy_::interned_arguments` metadata=1.97MB fields=2.25MB count=35176
...
=======SALSA QUERIES=======
`File -> ty_python_semantic::semantic_index::SemanticIndex`
metadata=11.46MB fields=88.86MB count=1638
`Definition -> ty_python_semantic::types::infer::TypeInference`
metadata=24.52MB fields=86.68MB count=146018
`File -> ruff_db::parsed::ParsedModule`
metadata=0.12MB fields=69.06MB count=1642
...
=======SALSA SUMMARY=======
TOTAL MEMORY USAGE: 577.61MB
struct metadata = 29.00MB
struct fields = 35.68MB
memo metadata = 103.87MB
memo fields = 409.06MB
```
Eventually, we should integrate these numbers into CI in some form. The
one limitation currently is that heap allocations in salsa structs (e.g.
interned values) are not tracked, but memoized values should have full
coverage. We may also want a peak memory usage counter (that accounts
for non-salsa memory), but that is relatively simple to profile manually
(e.g. `time -v ty check`) and would require a compile-time option to
avoid runtime overhead.
## Summary
Implement basic *Goto type definition* support for Red Knot's LSP.
This PR also builds the foundation for other LSP operations. E.g., Goto
definition, hover, etc., should be able to reuse some, if not most,
logic introduced in this PR.
The basic steps of resolving the type definitions are:
1. Find the closest token for the cursor offset. This is a bit more
subtle than I first anticipated because the cursor could be positioned
right between the callee and the `(` in `call(test)`, in which case we
want to resolve the type for `call`.
2. Find the node with the minimal range that fully encloses the token
found in 1. I somewhat suspect that 1 and 2 could be done at the same
time but it complicated things because we also need to compute the spine
(ancestor chain) for the node and there's no guarantee that the found
nodes have the same ancestors
3. Reduce the node found in 2. to a node that is a valid goto target.
This may require traversing upwards to e.g. find the closest expression.
4. Resolve the type for the goto target
5. Resolve the location for the type, return it to the LSP
## Design decisions
The current implementation navigates to the inferred type. I think this
is what we want because it means that it correctly accounts for
narrowing (in which case we want to go to the narrowed type because
that's the value's type at the given position). However, it does have
the downside that Goto type definition doesn't work whenever we infer `T
& Unknown` because intersection types aren't supported. I'm not sure
what to do about this specific case, other than maybe ignoring `Unkown`
in Goto type definition if the type is an intersection?
## Known limitations
* Types defined in the vendored typeshed aren't supported because the
client can't open files from the red knot binary (we can either
implement our own file protocol and handler OR extract the typeshed
files and point there). See
https://github.com/astral-sh/ruff/issues/17041
* Red Knot only exposes an API to get types for expressions and
definitions. However, there are many other nodes with identifiers that
can have a type (e.g. go to type of a globals statement, match patterns,
...). We can add support for those in separate PRs (after we figure out
how to query the types from the semantic model). See
https://github.com/astral-sh/ruff/issues/17113
* We should have a higher-level API for the LSP that doesn't directly
call semantic queries. I intentionally decided not to design that API
just yet.
## Test plan
https://github.com/user-attachments/assets/fa077297-a42d-4ec8-b71f-90c0802b4edb
Goto type definition on a union
<img width="1215" alt="Screenshot 2025-04-01 at 13 02 55"
src="https://github.com/user-attachments/assets/689cabcc-4a86-4a18-b14a-c56f56868085"
/>
Note: I recorded this using a custom typeshed path so that navigating to
builtins works.
## Summary
Another salsa upgrade.
The main motivation is to stay on a recent salsa version because there
are still a lot of breaking changes happening.
The most significant changes in this update:
* Salsa no longer derives `Debug` by default. It now requires
`interned(debug)` (or similar)
* This version ships the foundation for garbage collecting interned
values. However, this comes at the cost that queries now track which
interned values they created (or read). The micro benchmarks in the
salsa repo showed a significant perf regression. Will see if this also
visible in our benchmarks.
## Test Plan
`cargo test`
## Summary
This PR introduces a new mdtest option `system` that can either be
`in-memory` or `os`
where `in-memory` is the default.
The motivation for supporting `os` is so that we can write OS/system
specific tests
with mdtests. Specifically, I want to write mdtests for the module
resolver,
testing that module resolution is case sensitive.
## Test Plan
I tested that the case-sensitive module resolver test start failing when
setting `system = "os"`
## Summary
...and remove periods from messages that don't span more than a single
sentence.
This is more consistent with how we present user-facing messages in uv
(which has a defined style guide).
## Summary
This PR changes removes the typeshed stubs from the vendored file system
shipped with ruff
and instead ships an empty "typeshed".
Making the typeshed files optional required extracting the typshed files
into a new `ruff_vendored` crate. I do like this even if all our builds
always include typeshed because it means `red_knot_python_semantic`
contains less code that needs compiling.
This also allows us to use deflate because the compression algorithm
doesn't matter for an archive containing a single, empty file.
## Test Plan
`cargo test`
I verified with ` cargo tree -f "{p} {f}" -p <package> ` that:
* red_knot_wasm: enables `deflate` compression
* red_knot: enables `zstd` compression
* `ruff`: uses stored
I'm not quiet sure how to build the binary that maturin builds but
comparing the release artifact size with `strip = true` shows a `1.5MB`
size reduction
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
Prototype deferred evaluation of type expressions by deferring
evaluation of class bases in a stub file. This allows self-referential
class definitions, as occur with the definition of `str` in typeshed
(which inherits `Sequence[str]`).
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This PR simplifies the virtual file support in the red knot core,
specifically:
* Update `File::add_virtual_file` method to `File::virtual_file` which
will always create a new virtual file and override the existing entry in
the lookup table
* Add `VirtualFile` which is a wrapper around `File` and provides
methods to increment the file revision / close the virtual file
* Add a new `File::try_virtual_file` to lookup the `VirtualFile` from
`Files`
* Add `File::sync_virtual_path` which takes in the `SystemVirtualPath`,
looks up the `VirtualFile` for it and calls the `sync` method to
increment the file revision
* Removes the `virtual_path_metadata` method on `System` trait
## Test Plan
- [x] Make sure the existing red knot tests pass
- [x] Updated code works well with the LSP
## Summary
This PR adds support for untitled files in the Red Knot project.
Refer to the [design
discussion](https://github.com/astral-sh/ruff/discussions/12336) for
more details.
### Changes
* The `parsed_module` always assumes that the `SystemVirtual` path is of
`PySourceType::Python`.
* For the module resolver, as suggested, I went ahead by adding a new
`SystemOrVendoredPath` enum and renamed `FilePathRef` to
`SystemOrVendoredPathRef` (happy to consider better names here).
* The `file_to_module` query would return if it's a
`FilePath::SystemVirtual` variant because a virtual file doesn't belong
to any module.
* The sync implementation for the system virtual path is basically the
same as that of system path except that it uses the
`virtual_path_metadata`. The reason for this is that the system
(language server) would provide the metadata on whether it still exists
or not and if it exists, the corresponding metadata.
For point (1), VS Code would use `Untitled-1` for Python files and
`Untitled-1.ipynb` for Jupyter Notebooks. We could use this distinction
to determine whether the source type is `Python` or `Ipynb`.
## Test Plan
Added test cases in #12526