## Summary
Use `Type::Divergent` to avoid a "too many iterations" panic on an
infinitely-nested tuple in an implicit instance attribute.
The regression here is from checking all tuple elements to see if they
contain a Divergent type. It's 5% on one project, 1% on another, and
zero on the rest. I spent some time looking into eliminating this
regression by tracking a flag on inference results to note if they could
possibly contain any Divergent type, but this doesn't really work --
there are too many different ways a type containing a Divergent type
could enter an inference result. Still thinking about whether there are
other ways to reduce this. One option is if we see certain kinds of
non-atomic types that are commonly expensive to check for Divergent, we
could make `has_divergent_type` a Salsa query on those types.
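For illustration, here is a minimal sketch of the kind of pattern involved (a hypothetical reproducer, not the original report):
```python
class C:
    def __init__(self) -> None:
        # No annotation, so `x` is an implicit instance attribute.
        self.x = ()

    def grow(self) -> None:
        # Each assignment wraps the previous value in another tuple, so the
        # inferred type of `C.x` nests without bound unless inference falls
        # back to `Type::Divergent`.
        self.x = (self.x,)
```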
## Test Plan
Added mdtest.
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Summary
--
To take advantage of the new diagnostics, we need to update our caching
model to include all of the information supported by `ruff_db`'s
diagnostic type. Instead of trying to serialize all of this information,
Micha suggested simply not caching files with diagnostics, like we
already do for files with syntax errors. This PR is an attempt at that
approach.
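A rough sketch of the rule, written in Python for brevity (the actual logic lives in the Rust cache layer; the names here are illustrative):
```python
def should_cache(has_syntax_errors: bool, diagnostics: list) -> bool:
    # Files with syntax errors were already excluded from the cache; files
    # that produce lint diagnostics are now excluded too, so their
    # diagnostics are recomputed (with full rendering info) on each run.
    return not has_syntax_errors and not diagnostics
```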
This has the added benefit of trimming down our `Rule` derives: this was
the last place the `FromStr`/`strum_macros::EnumString` implementation was
used, along with the (de)serialization macros and `CacheKey`.
Test Plan
--
Existing tests, with their input updated not to include a diagnostic,
plus a new test showing that files with lint diagnostics are not cached.
Benchmarks
--
In addition to tests, we wanted to check that this doesn't degrade
performance too much. I posted part of this new analysis in
https://github.com/astral-sh/ruff/issues/18198#issuecomment-3175048672,
but I'll duplicate it here. In short, there's not much difference
between `main` and this branch for projects with few diagnostics
(`home-assistant`, `airflow`), as expected. The difference for projects
with many diagnostics (`cpython`) is quite a bit bigger (~300 ms vs ~220
ms), but most projects that run ruff regularly are likely to have very
few diagnostics, so this may not be a problem in practice.
The first table below shows the benchmark results on `main`; the second
shows this branch.
**`main`:**

| Command | Mean [ms] | Min [ms] | Max [ms] |
|:--------------------------------------------------------------|----------:|---------:|---------:|
| `ruff check cpython --no-cache --isolated --exit-zero` | 322.0 | 317.5 | 326.2 |
| `ruff check cpython --isolated --exit-zero` | 217.3 | 209.8 | 237.9 |
| `ruff check home-assistant --no-cache --isolated --exit-zero` | 279.5 | 277.0 | 283.6 |
| `ruff check home-assistant --isolated --exit-zero` | 37.2 | 35.7 | 40.6 |
| `ruff check airflow --no-cache --isolated --exit-zero` | 133.1 | 130.4 | 146.4 |
| `ruff check airflow --isolated --exit-zero` | 34.7 | 32.9 | 41.6 |

**This branch:**

| Command | Mean [ms] | Min [ms] | Max [ms] |
|:--------------------------------------------------------------|----------:|---------:|---------:|
| `ruff check cpython --no-cache --isolated --exit-zero` | 330.1 | 324.5 | 333.6 |
| `ruff check cpython --isolated --exit-zero` | 309.2 | 306.1 | 314.7 |
| `ruff check home-assistant --no-cache --isolated --exit-zero` | 288.6 | 279.4 | 302.3 |
| `ruff check home-assistant --isolated --exit-zero` | 39.8 | 36.9 | 42.4 |
| `ruff check airflow --no-cache --isolated --exit-zero` | 134.5 | 131.3 | 140.6 |
| `ruff check airflow --isolated --exit-zero` | 39.1 | 37.2 | 44.3 |
I had Claude adapt one of the
[scripts](https://github.com/sharkdp/hyperfine/blob/master/scripts/plot_whisker.py)
from the hyperfine repo to make this plot, so it's not quite perfect,
but maybe it's still useful. The table is probably more reliable for
close comparisons. I'll put more details about the benchmarks below for
the sake of future reproducibility.
<img width="4472" height="2368" alt="image"
src="https://github.com/user-attachments/assets/1c42d13e-818a-44e7-b34c-247340a936d7"
/>
<details><summary>Benchmark details</summary>
<p>
The versions of each project:
- CPython: 6322edd260e8cad4b09636e05ddfb794a96a0451, the 3.10 branch
from the contributing docs
- `home-assistant`: 5585376b406f099fb29a970b160877b57e5efcb0
- `airflow`: 29a1cb0cfde9d99b1774571688ed86cb60123896
The last two are just the main branches at the time I cloned the repos.
Our Ruff config shouldn't apply since I used `--isolated`, but note that
these projects are cloned into my copy of Ruff at
`crates/ruff_linter/resources/test`. I also trimmed the
`./target/release/` prefix from each of the commands above; they are
release-mode builds of Ruff.
And here's the script with the `hyperfine` invocation:
```shell
#!/bin/bash
# Build ruff in release mode and benchmark `ruff check` on each project,
# both with and without caching.
cargo build --release --bin ruff
# git clone --depth 1 https://github.com/home-assistant/core crates/ruff_linter/resources/test/home-assistant
# git clone --depth 1 https://github.com/apache/airflow crates/ruff_linter/resources/test/airflow
bin=./target/release/ruff
resources=./crates/ruff_linter/resources/test
cpython=$resources/cpython
home_assistant=$resources/home-assistant
airflow=$resources/airflow
base=${1:-bench}
hyperfine --warmup 10 --export-json $base.json --export-markdown $base.md \
    "$bin check $cpython --no-cache --isolated --exit-zero" \
    "$bin check $cpython --isolated --exit-zero" \
    "$bin check $home_assistant --no-cache --isolated --exit-zero" \
    "$bin check $home_assistant --isolated --exit-zero" \
    "$bin check $airflow --no-cache --isolated --exit-zero" \
    "$bin check $airflow --isolated --exit-zero"
```
I ran this once on `main` (`baseline` in the graph, the first table) and
once on this branch (`nocache` in the graph, the second table).
</p>
</details>
## Summary
This PR adds a new `Type::TypedDict` variant. Before this PR, we treated
`TypedDict`-based types as dynamic Todo-types, and I originally planned
to make this change a no-op. And we do in fact still treat that new
variant similarly to a dynamic type when it comes to type properties such
as assignability and subtyping. But then I somehow tricked myself into
implementing some of the things correctly, so here we are. The two main
behavioral changes are: (1) we now also detect generic `TypedDict`s,
which removes a few false positives in the ecosystem, and (2) we now
support *attribute* access (not key-based indexing!) on these types,
i.e. we infer proper types for something like
`MyTypedDict.__required_keys__`. Nothing exciting yet, but this gets the
infrastructure in place.
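To illustrate the two behavioral changes, here is a rough sketch in the spirit of the Markdown tests (the revealed type is approximate, and the snippet assumes Python 3.11+ for `typing.reveal_type` and generic `TypedDict`s):
```python
from typing import Generic, TypeVar, TypedDict, reveal_type

T = TypeVar("T")

class Movie(TypedDict):
    title: str
    year: int

# (1) Generic TypedDicts are now detected instead of tripping false positives.
class Box(TypedDict, Generic[T]):
    value: T

# (2) Attribute access (not key-based indexing) on the class is inferred.
reveal_type(Movie.__required_keys__)  # e.g. frozenset[str]
```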
Note that with this PR, the type of (the type) `MyTypedDict` itself is
still represented as a `Type::ClassLiteral` or `Type::GenericAlias` (in
case `MyTypedDict` is generic). Only inhabitants of `MyTypedDict`
(instances of `dict` at runtime) are represented by `Type::TypedDict`.
We may want to revisit this decision in the future, if this turns out to
be too error-prone. Right now, we need to use `.is_typed_dict(db)` in
all the right places to distinguish between actual (generic) classes and
`TypedDict`s. But so far, it seemed unnecessary to add additional `Type`
variants for these as well.
Part of https://github.com/astral-sh/ty/issues/154
## Ecosystem impact
The new diagnostics on `cloud-init` look like true positives to me.
## Test Plan
Updated and new Markdown tests
## Summary
This PR updates the server to keep track of open files, both system and
virtual files.
This is done by adding the file to the project's set of open files in the
`didOpen` notification and removing it in the `didClose` notification.
This does mean that, for workspace diagnostics, ty will only check open
files, because the diagnostic builders first check `is_file_open` and only
add diagnostics for open files. So, this required updating the
`is_file_open` model to a `should_check_file` model, which determines
whether a file needs to be checked based on the `CheckMode`: if the check
mode is open-files-only, it checks whether the file is open; if it's all
files, it returns `true` by default.
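A rough sketch of that decision, written in Python for brevity (the actual implementation is in the Rust server; the names are illustrative):
```python
from enum import Enum, auto

class CheckMode(Enum):
    OPEN_FILES = auto()
    ALL_FILES = auto()

def should_check_file(check_mode: CheckMode, is_file_open: bool) -> bool:
    # In open-files mode (used for workspace diagnostics in the editor),
    # only files the client has opened are checked; in all-files mode,
    # every file is checked.
    if check_mode is CheckMode.OPEN_FILES:
        return is_file_open
    return True
```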
Closes: astral-sh/ty#619
## Test Plan
### Before
There are two files in the project: `__init__.py` and `diagnostics.py`.
In the video, I'm demonstrating the old behavior where making changes to
the (open) `diagnostics.py` file results in re-parsing the file:
https://github.com/user-attachments/assets/c2ac0ecd-9c77-42af-a924-c3744b146045
### After
Same setup as above.
In the video, I'm demonstrating the new behavior where making changes to
the (open) `diagnostics.py` file doesn't result in re-parsing the file:
https://github.com/user-attachments/assets/7b82fe92-f330-44c7-b527-c841c4545f8f
## Summary
This PR updates Salsa to pull in Ibraheem's multithreading improvements (https://github.com/salsa-rs/salsa/pull/921).
## Performance
A small regression for single-threaded benchmarks is expected because
papaya is slightly slower than a `Mutex<FxHashMap>` in the uncontested
case (~10%). However, this shouldn't matter as much in practice because:
1. Salsa has a fast path when only using one DB instance, which is the
common case in production. This fast path is not impacted by the changes,
but our benchmarks measure the slow paths (because we use multiple db
instances).
2. Fixing the 10x slowdown for the congested case (multithreading)
outweighs the downside of a 10% perf regression for single-threaded use
cases, especially considering that ty is heavily multithreaded.
## Test Plan
`cargo test`
## Summary
Setting `TY_MEMORY_REPORT=full` will generate and print a memory usage
report to the CLI after a `ty check` run:
```
=======SALSA STRUCTS=======
`Definition` metadata=7.24MB fields=17.38MB count=181062
`Expression` metadata=4.45MB fields=5.94MB count=92804
`member_lookup_with_policy_::interned_arguments` metadata=1.97MB fields=2.25MB count=35176
...
=======SALSA QUERIES=======
`File -> ty_python_semantic::semantic_index::SemanticIndex`
metadata=11.46MB fields=88.86MB count=1638
`Definition -> ty_python_semantic::types::infer::TypeInference`
metadata=24.52MB fields=86.68MB count=146018
`File -> ruff_db::parsed::ParsedModule`
metadata=0.12MB fields=69.06MB count=1642
...
=======SALSA SUMMARY=======
TOTAL MEMORY USAGE: 577.61MB
struct metadata = 29.00MB
struct fields = 35.68MB
memo metadata = 103.87MB
memo fields = 409.06MB
```
Eventually, we should integrate these numbers into CI in some form. The
one limitation currently is that heap allocations in salsa structs (e.g.
interned values) are not tracked, but memoized values should have full
coverage. We may also want a peak memory usage counter (that accounts
for non-salsa memory), but that is relatively simple to profile manually
(e.g. `time -v ty check`) and would require a compile-time option to
avoid runtime overhead.
Summary
--
This PR unifies the remaining differences between `OldDiagnostic` and
`Message` (`OldDiagnostic` was only missing an optional `noqa_offset`
field) and
replaces `Message` with `OldDiagnostic`.
The biggest functional difference is that the combined `OldDiagnostic`
kind no
longer implements `AsRule` for an infallible conversion to `Rule`. This
was
pretty easy to work around with `is_some_and` and `is_none_or` in the
few places
it was needed. In `LintContext::report_diagnostic_if_enabled` we can
just use
the new `Violation::rule` method, which takes care of most cases.
Most of the interesting changes are in [this
range](8156992540)
before I started renaming.
Test Plan
--
Existing tests
Future Work
--
I think it's time to start shifting some of these fields to the new
`Diagnostic`
kind. I believe we want `Fix` for sure, but I'm less sure about the
others. We
may want to keep a thin wrapper type here anyway to implement a `rule`
method,
so we could leave some of these fields on that too.
## Summary
This PR does the following:
1. Remove the symlinks from the `fuzz/` directory
2. Update the `init-fuzzer.sh` script to create those symlinks
3. Update `fuzz/.gitignore` to ignore those corpus directories
## Test Plan
Initialize the fuzzer:
```sh
./fuzz/init-fuzzer.sh
```
And, run a fuzz target:
```sh
cargo +nightly fuzz run ruff_parse_simple -- -timeout=1 -only_ascii=1
```
## Summary
This PR extends version-related syntax error detection to red-knot. The
main changes here are:
1. Passing `ParseOptions` specifying a `PythonVersion` to parser calls
2. Adding a `python_version` method to the `Db` trait to make this
possible
3. Converting `UnsupportedSyntaxError`s to `Diagnostic`s
4. Updating existing mdtests to avoid unrelated syntax errors
My initial draft of (1) and (2) in #16090 instead tried passing a
`PythonVersion` down to every parser call, but @MichaReiser suggested
the `Db` approach instead
[here](https://github.com/astral-sh/ruff/pull/16090#discussion_r1969198407),
and I think it turned out much nicer.
All of the new `python_version` methods look like this:
```rust
fn python_version(&self) -> ruff_python_ast::PythonVersion {
    Program::get(self).python_version(self)
}
```
with the exception of the `TestDb` in `ruff_db`, which hard-codes
`PythonVersion::latest()`.
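As an example of the kind of diagnostic this enables: with `python_version` configured below 3.10, red-knot should now flag a version-related syntax error for the `match` statement in a snippet like this (a hypothetical example, not the new mdtest itself):
```python
command = "quit"

# `match` statements were added in Python 3.10; on earlier target versions
# this is reported as an unsupported-syntax error rather than a parse error.
match command:
    case "quit":
        print("goodbye")
    case _:
        print(command)
```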
## Test Plan
Existing mdtests, plus a new mdtest to see at least one of the new
diagnostics.
This causes spurious query cycles.
This PR also includes an update to Salsa, which gives us db events on
cycle iteration, so we can write tests asserting the absence of a cycle.
## Summary
Update Salsa to pull in https://github.com/salsa-rs/salsa/pull/788, which
fixes the by-now famous *access to field whilst the value is being
initialized* panic.
This PR also re-enables all tests that previously triggered the panic.
## Test Plan
`cargo test`
Update to the latest Salsa main branch, so as to get a baseline for
measuring the perf effect of https://github.com/salsa-rs/salsa/pull/786
on red-knot in isolation from other recent changes on the Salsa main branch.
The single flag `has_syntax_error` on `LinterResult` is replaced with
two (private) flags: `has_valid_syntax` and
`has_no_unsupported_syntax_errors`, which record whether there are
`ParseError`s or `UnsupportedSyntaxError`s, respectively. Only the
former is used to prevent a `FixAll` action.
An attempt has been made to use the phrases "valid syntax" (which seems to
refer only to _parser_ errors) and "syntax error" (which refers to both
_parser_ errors and version-specific syntax errors) consistently.
Closes #16841
## Summary
Another salsa upgrade.
The main motivation is to stay on a recent salsa version because there
are still a lot of breaking changes happening.
The most significant changes in this update:
* Salsa no longer derives `Debug` by default. It now requires
`interned(debug)` (or similar)
* This version ships the foundation for garbage collecting interned
values. However, this comes at the cost that queries now track which
interned values they created (or read). The micro benchmarks in the
salsa repo showed a significant perf regression. We'll see if this is also
visible in our benchmarks.
## Test Plan
`cargo test`
Pulls in the latest Salsa main branch, which supports fixpoint
iteration, and uses it to handle all query cycles.
With this, we no longer need to skip any corpus files to avoid panics.
Latest perf results show a 6% incremental and 1% cold-check regression.
This is not a "no cycles" regression, as tomllib and typeshed do trigger
some definition cycles (previously handled by our old
`infer_definition_types` fallback to `Unknown`). We don't currently have
a benchmark we can use to measure the pure no-cycles regression, though
I expect there would still be some regression; the fixpoint iteration
feature in Salsa does add some overhead even for non-cyclic queries.
I think this regression is within the reasonable range for this feature.
We can do further optimization work later, but I don't think it's the
top priority right now. So going ahead and acknowledging the regression
on CodSpeed.
Mypy primer is happy, so this doesn't regress anything on our
currently-checked projects. I expect it probably unlocks adding a number
of new projects to our ecosystem check that previously would have
panicked.
Fixes #13792
Fixes #14672
## Summary
This PR introduces a new mdtest option `system` that can be either
`in-memory` or `os`, where `in-memory` is the default.
The motivation for supporting `os` is so that we can write OS/system-specific
tests with mdtests. Specifically, I want to write mdtests for the module
resolver, testing that module resolution is case sensitive.
## Test Plan
I tested that the case-sensitive module resolver test starts failing when
setting `system = "os"`.
Summary
--
This is a follow-up addressing the comments on #16425. As @dhruvmanila
pointed out, the naming is a bit tricky. I went with `has_no_errors` to
try to differentiate it from `is_valid`. It actually ends up negated in
most uses, so it would be more convenient to have `has_any_errors` or
`has_errors`, but I thought it would sound too much like the opposite of
`is_valid` in that case. I'm definitely open to suggestions here.
Test Plan
--
Existing tests.