## Summary
I played with those numbers a bit locally and `sample_size=3,
sample_count=8` seemed like a rather stable setup. This means a single
sample consistents of 3 iterations of checking pydantic multithreaded.
And this is repeated 8 times for statistics. A single check took ~300 ms
previously on the runners, so this should only take 7 s.
## Summary
The [`DateType`](https://github.com/glyph/DateType) library has some
very large protocols in it. Currently we type-check it quite quickly,
but the current version of https://github.com/astral-sh/ruff/pull/18659
makes our execution time on this library pathologically slow. That PR
doesn't seem to have a big impact on any of our current benchmarks,
however, so it seems we have some missing coverage in this area; I
therefore propose that we add `DateType` as a benchmark.
Currently the benchmark runs pretty quickly (about half the runtime of
attrs, which is our fastest real-world benchmark currently), and the
library has 0 third-party dependencies, so the benchmark is quick to
setup.
## Test Plan
`cargo bench -p ruff_benchmark --bench=ty`
The benchmark is currently very noisy (± 10%). This leads to codspeed
reports on PRs, because we often exceed the trigger threshold. This is
confusing to ty contributors who are not aware about the flakiness.
Let's disable it for now.
## Summary
Adds a new micro-benchmark as a regression test for
https://github.com/astral-sh/ty/issues/627.
## Test Plan
Ran the benchmark on the parent commit of
89d915a1e3,
and verified that it took > 1s, while it takes ~10 ms after the fix.
## Summary
Add a micro-benchmark for the code pattern observed in
https://github.com/astral-sh/ty/issues/362.
This currently takes around 1 second on my machine.
## Test Plan
```bash
cargo bench -p ruff_benchmark -- 'ty_micro\[many_tuple' --sample-size 10
```
We were not inducting into instance types and subclass-of types when
looking for legacy typevars, nor when apply specializations.
This addresses
https://github.com/astral-sh/ruff/pull/17832#discussion_r2081502056
```py
from __future__ import annotations
from typing import TypeVar, Any, reveal_type
S = TypeVar("S")
class Foo[T]:
def method(self, other: Foo[S]) -> Foo[T | S]: ... # type: ignore[invalid-return-type]
def f(x: Foo[Any], y: Foo[Any]):
reveal_type(x.method(y)) # revealed: `Foo[Any | S]`, but should be `Foo[Any]`
```
We were not detecting that `S` made `method` generic, since we were not
finding it when searching the function signature for legacy typevars.
## Summary
Adds a simple progress bar for the `ty check` CLI command. The style is
taken from uv, and like uv the bar is always shown - for smaller
projects it is fast enough that it isn't noticeable. We could
alternatively hide it completely based on some heuristic for the number
of files, or only show it after some amount of time.
I also disabled it when `--watch` is passed, cancelling inflight checks
was leading to zombie progress bars. I think we can fix this by using
[`MultiProgress`](https://docs.rs/indicatif/latest/indicatif/struct.MultiProgress.html)
and managing all the bars globally, but I left that out for now.
Resolves https://github.com/astral-sh/ty/issues/98.
## Summary
This PR is a first step toward integration of the new `Diagnostic` type
into ruff. There are two main changes:
- A new `UnifiedFile` enum wrapping `File` for red-knot and a
`SourceFile` for ruff
- ruff's `Message::SyntaxError` variant is now a `Diagnostic` instead of
a `SyntaxErrorMessage`
The second of these changes was mostly just a proof of concept for the
first, and it went pretty smoothly. Converting `DiagnosticMessage`s will
be most of the work in replacing `Message` entirely.
## Test Plan
Existing tests, which show no changes.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
Adds preliminary support for `NamedTuple`s, including:
* No false positives when constructing a `NamedTuple` object
* Correct signature for the synthesized `__new__` method, i.e. proper
checking of constructor calls
* A patched MRO (`NamedTuple` => `tuple`), mainly to make type inference
of named attributes possible, but also to better reflect the runtime
MRO.
All of this works:
```py
from typing import NamedTuple
class Person(NamedTuple):
id: int
name: str
age: int | None = None
alice = Person(1, "Alice", 42)
alice = Person(id=1, name="Alice", age=42)
reveal_type(alice.id) # revealed: int
reveal_type(alice.name) # revealed: str
reveal_type(alice.age) # revealed: int | None
# error: [missing-argument]
Person(3)
# error: [too-many-positional-arguments]
Person(3, "Eve", 99, "extra")
# error: [invalid-argument-type]
Person(id="3", name="Eve")
```
Not included:
* type inference for index-based access.
* support for the functional `MyTuple = NamedTuple("MyTuple", […])`
syntax
## Test Plan
New Markdown tests
## Ecosystem analysis
```
Diagnostic Analysis Report
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━┓
┃ Diagnostic ID ┃ Severity ┃ Removed ┃ Added ┃ Net Change ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━┩
│ lint:call-non-callable │ error │ 0 │ 3 │ +3 │
│ lint:call-possibly-unbound-method │ warning │ 0 │ 4 │ +4 │
│ lint:invalid-argument-type │ error │ 0 │ 72 │ +72 │
│ lint:invalid-context-manager │ error │ 0 │ 2 │ +2 │
│ lint:invalid-return-type │ error │ 0 │ 2 │ +2 │
│ lint:missing-argument │ error │ 0 │ 46 │ +46 │
│ lint:no-matching-overload │ error │ 19121 │ 0 │ -19121 │
│ lint:not-iterable │ error │ 0 │ 6 │ +6 │
│ lint:possibly-unbound-attribute │ warning │ 13 │ 32 │ +19 │
│ lint:redundant-cast │ warning │ 0 │ 1 │ +1 │
│ lint:unresolved-attribute │ error │ 0 │ 10 │ +10 │
│ lint:unsupported-operator │ error │ 3 │ 9 │ +6 │
│ lint:unused-ignore-comment │ warning │ 15 │ 4 │ -11 │
├───────────────────────────────────┼──────────┼─────────┼───────┼────────────┤
│ TOTAL │ │ 19152 │ 191 │ -18961 │
└───────────────────────────────────┴──────────┴─────────┴───────┴────────────┘
Analysis complete. Found 13 unique diagnostic IDs.
Total diagnostics removed: 19152
Total diagnostics added: 191
Net change: -18961
```
I uploaded the ecosystem full diff (ignoring the 19k
`no-matching-overload` diagnostics)
[here](https://shark.fish/diff-namedtuple.html).
* There are some new `missing-argument` false positives which come from
the fact that named tuples are often created using unpacking as in
`MyNamedTuple(*fields)`, which we do not understand yet.
* There are some new `unresolved-attribute` false positives, because
methods like `_replace` are not available.
* Lots of the `invalid-argument-type` diagnostics look like true
positives
---------
Co-authored-by: Douglas Creager <dcreager@dcreager.net>
We are currently representing type variables using a `KnownInstance`
variant, which wraps a `TypeVarInstance` that contains the information
about the typevar (name, bounds, constraints, default type). We were
previously only constructing that type for PEP 695 typevars. This PR
constructs that type for legacy typevars as well.
It also detects functions that are generic because they use legacy
typevars in their parameter list. With the existing logic for inferring
specializations of function calls (#17301), that means that we are
correctly detecting that the definition of `reveal_type` in the typeshed
is generic, and inferring the correct specialization of `_T` for each
call site.
This does not yet handle legacy generic classes; that will come in a
follow-on PR.
## Summary
Now that we've made the large-unions benchmark fast, let's make it slow
again!
This adds a following operation (checking `len`) on the large union,
which is slow, even though building the large union is now fast. (This
is also observed in a real-world code sample.) It's slow because for
every element of the union, we fetch its `__len__` method and check it
for compatibility with `Sized`.
We can make this fast by extending the grouped-types approach, as
discussed in https://github.com/astral-sh/ruff/pull/17403, so that we
can do this `__len__` operation (which is identical for every literal
string) just once for all literal strings, instead of once per literal
string type in the union.
Until we do that, we can make this acceptably fast again for now by
setting a lowish limit on union size, which we can increase in the
future when we make it fast. This is what I'll do in the next PR.
## Test Plan
`cargo bench --bench red_knot`
## Summary
Add a benchmark for a large-union case that currently has exponential
blow-up in execution time.
## Test Plan
`cargo bench --bench red_knot`
This replaces things like `TypeCheckDiagnostic` with the new Diagnostic`
type.
This is a "surgical" replacement where we retain the existing API of
of diagnostic reporting such that _most_ of Red Knot doesn't need to be
changed to support this update. But it will enable us to start using the
new diagnostic renderer and to delete the old renderer. It also paves
the path for exposing the new `Diagnostic` data model to the broader Red
Knot codebase.
## Summary
This PR adds initial support for `*` imports to red-knot. The approach
is to implement a standalone query, called from semantic indexing, that
visits the module referenced by the `*` import and collects all
global-scope public names that will be imported by the `*` import. The
`SemanticIndexBuilder` then adds separate definitions for each of these
names, all keyed to the same `ast::Alias` node that represents the `*`
import.
There are many pieces of `*`-import semantics that are still yet to be
done, even with this PR:
- This PR does not attempt to implement any of the semantics to do with
`__all__`. (If a module defines `__all__`, then only the symbols
included in `__all__` are imported, _not_ all public global-scope
symbols.
- With the logic implemented in this PR as it currently stands, we
sometimes incorrectly consider a symbol bound even though it is defined
in a branch that is statically known to be dead code, e.g. (assuming the
target Python version is set to 3.11):
```py
# a.py
import sys
if sys.version_info < (3, 10):
class Foo: ...
```
```py
# b.py
from a import *
print(Foo) # this is unbound at runtime on 3.11,
# but we currently consider it bound with the logic in this PR
```
Implementing these features is important, but is for now deferred to
followup PRs.
Many thanks to @ntBre, who contributed to this PR in a pairing session
on Friday!
## Test Plan
Assertions in existing mdtests are adjusted, and several new ones are
added.
## Summary
Here I fix the last English spelling errors I could find in the repo.
Again, I am trying not to touch variable/function names, or anything
that might be misspelled in the API. The goal is to make this PR safe
and easy to merge.
## Test Plan
I have run all the unit tests. Though, again, all of the changes I make
here are to docs and docstrings. I make no code changes, which I believe
should greatly mitigate the testing concerns.
The single flag `has_syntax_error` on `LinterResult` is replaced with
two (private) flags: `has_valid_syntax` and
`has_no_unsupported_syntax_errors`, which record whether there are
`ParseError`s or `UnsupportedSyntaxError`s, respectively. Only the
former is used to prevent a `FixAll` action.
An attempt has been made to make consistent the usage of the phrases
"valid syntax" (which seems to be used to refer only to _parser_ errors)
and "syntax error" (which refers to both _parser_ errors and
version-specific syntax errors).
Closes#16841
## Summary
This mostly fixes#14899
My motivation was similar to the last comment by @sharkdp there. I ran
red_knot on a codebase and the most common error was patterns like this
failing:
```
def foo(x: str): ...
x: Any = ...
if isinstance(x, str):
foo(x) # Object of type `Any & str` cannot be assigned to parameter 1 (`x`) of function `foo`; expected type `str`
```
The desired behavior is pretty much to ignore Any/Unknown when resolving
intersection assignability - `Any & str` should be assignable to `str`,
and `str` should be assignable to `str & Any`
The fix is actually very similar to the existing code in
`is_subtype_of`, we need to correctly handle intersections on either
side, while being careful to handle dynamic types as desired.
This does not fix the second test case from that issue:
```
static_assert(is_assignable_to(Intersection[Unrelated, Any], Not[tuple[Unrelated, Any]]))
```
but that's misleading because the root cause there has nothing to do
with gradual types. I added a simpler test case that also fails:
```
static_assert(is_assignable_to(Unrelated, Not[tuple[Unrelated]]))
```
This is because we don't determine that Unrelated does not subclass from
tuple so we can't rule out this relation. If that logic is improved then
this fix should also handle the case of the intersection
## Test Plan
Added a bunch of is_assignable_to tests, most of which failed before
this fix.
## Summary
This PR introduces a new mdtest option `system` that can either be
`in-memory` or `os`
where `in-memory` is the default.
The motivation for supporting `os` is so that we can write OS/system
specific tests
with mdtests. Specifically, I want to write mdtests for the module
resolver,
testing that module resolution is case sensitive.
## Test Plan
I tested that the case-sensitive module resolver test start failing when
setting `system = "os"`
This trait should eventually go away, so we rename it (and supporting
types) to make room for a new concrete `Diagnostic` type.
This commit is just the rename. In the next commit, we'll move it to a
different module.
## Summary
Add a diagnostic if a pure instance variable is accessed on a class object. For example
```py
class C:
instance_only: str
def __init__(self):
self.instance_only = "a"
# error: Attribute `instance_only` can only be accessed on instances, not on the class object `Literal[C]` itself.
C.instance_only
```
---------
Co-authored-by: David Peter <mail@david-peter.de>
## Summary
This is part of the preparation for detecting syntax errors in the
parser from https://github.com/astral-sh/ruff/pull/16090/. As suggested
in [this
comment](https://github.com/astral-sh/ruff/pull/16090/#discussion_r1953084509),
I started working on a `ParseOptions` struct that could be stored in the
parser. For this initial refactor, I only made it hold the existing
`Mode` option, but for syntax errors, we will also need it to have a
`PythonVersion`. For that use case, I'm picturing something like a
`ParseOptions::with_python_version` method, so you can extend the
current calls to something like
```rust
ParseOptions::from(mode).with_python_version(settings.target_version)
```
But I thought it was worth adding `ParseOptions` alone without changing
any other behavior first.
Most of the diff is just updating call sites taking `Mode` to take
`ParseOptions::from(Mode)` or those taking `PySourceType`s to take
`ParseOptions::from(PySourceType)`. The interesting changes are in the
new `parser/options.rs` file and smaller parts of `parser/mod.rs` and
`ruff_python_parser/src/lib.rs`.
## Test Plan
Existing tests, this should not change any behavior.
## Summary
This PR updates the formatter and linter to use the `PythonVersion`
struct from the `ruff_python_ast` crate internally. While this doesn't
remove the need for the `linter::PythonVersion` enum, it does remove the
`formatter::PythonVersion` enum and limits the use in the linter to
deserializing from CLI arguments and config files and moves most of the
remaining methods to the `ast::PythonVersion` struct.
## Test Plan
Existing tests, with some inputs and outputs updated to reflect the new
(de)serialization format. I think these are test-specific and shouldn't
affect any external (de)serialization.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This PR moves the `PythonVersion` struct from the
`red_knot_python_semantic` crate to the `ruff_python_ast` crate so that
it can be used more easily in the syntax error detection work. Compared
to that [prototype](https://github.com/astral-sh/ruff/pull/16090/) these
changes reduce us from 2 `PythonVersion` structs to 1.
This does not unify any of the `PythonVersion` *enums*, but I hope to
make some progress on that in a follow-up.
## Test Plan
Existing tests, this should not change any external behavior.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
This essentially makes it impossible to construct a `Diagnostic`
that has a `TextRange` but no `File`.
This is meant to be a precursor to multi-span support.
(Note that I consider this more of a prototyping-change and not
necessarily what this is going to look like longer term.)
Reviewers can probably review this PR as one big diff instead of
commit-by-commit.
This change does a simple swap of the existing renderer for one that
uses our vendored copy of `annotate-snippets`. We don't change anything
about the diagnostic data model, but this alone already makes
diagnostics look a lot nicer!
`FlowSnapshot` now tracks a `reachable` bool, which indicates whether we
have encountered a terminal statement on that control flow path. When
merging flow states together, we skip any that have been marked
unreachable. This ensures that bindings that can only be reached through
unreachable paths are not considered visible.
## Test Plan
The new mdtests failed (with incorrect `reveal_type` results, and
spurious `possibly-unresolved-reference` errors) before adding the new
visibility constraints.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Use `Unknown | T_inferred` as the type for *undeclared* public symbols.
## Test Plan
- Updated existing tests
- New test for external `__slots__` modifications.
- New tests for external modifications of public symbols.