Generic classes are not allowed to bind or reference a typevar from an
enclosing scope:
```py
def f[T](x: T, y: T) -> None:
class Ok[S]: ...
# error: [invalid-generic-class]
class Bad1[T]: ...
# error: [invalid-generic-class]
class Bad2(Iterable[T]): ...
class C[T]:
class Ok1[S]: ...
# error: [invalid-generic-class]
class Bad1[T]: ...
# error: [invalid-generic-class]
class Bad2(Iterable[T]): ...
```
It does not matter if the class uses PEP 695 or legacy syntax. It does
not matter if the enclosing scope is a generic class or function. The
generic class cannot even _reference_ an enclosing typevar in its base
class list.
This PR adds diagnostics for these cases.
In addition, the PR adds better fallback behavior for generic classes
that violate this rule: any enclosing typevars are not included in the
class's generic context. (That ensures that we don't inadvertently try
to infer specializations for those typevars in places where we
shouldn't.) The `dulwich` ecosystem project has [examples of
this](d912eaaffd/dulwich/config.py (L251))
that were causing new false positives on #20677.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This allows us to handle self-referential bounds/constraints/defaults
without panicking.
Handles more cases from https://github.com/astral-sh/ty/issues/256
This also changes the way we infer the types of legacy TypeVars. Rather
than understanding a constructor call to `typing[_extension].TypeVar`
inside of any (arbitrarily nested) expression, and having to use a
special `assigned_to` field of the semantic index to try to best-effort
figure out what name the typevar was assigned to, we instead understand
the creation of a legacy `TypeVar` only in the supported syntactic
position (RHS of a simple un-annotated assignment with one target). In
any other position, we just infer it as creating an opaque instance of
`typing.TypeVar`. (This behavior matches all other type checkers.)
So we now special-case TypeVar creation in `TypeInferenceBuilder`, as a
special case of an assignment definition, rather than deeper inside call
binding. This does mean we re-implement slightly more of
argument-parsing, but in practice this is minimal and easy to handle
correctly.
This is easier to implement if we also make the RHS of a simple (no
unpacking) one-target assignment statement no longer a standalone
expression. Which is fine to do, because simple one-target assignments
don't need to infer the RHS more than once. This is a bonus performance
(0-3% across various projects) and significant memory-usage win, since
most assignment statements are simple one-target assignment statements,
meaning we now create many fewer standalone-expression salsa
ingredients.
This change does mean that inference of manually-constructed
`TypeAliasType` instances can no longer find its Definition in
`assigned_to`, which regresses go-to-definition for these aliases. In a
future PR, `TypeAliasType` will receive the same treatment that
`TypeVar` did in this PR (moving its special-case inference into
`TypeInferenceBuilder` and supporting it only in the correct syntactic
position, and lazily inferring its value type to support recursion),
which will also fix the go-to-definition regression. (I decided a
temporary edge-case regression is better in this case than doubling the
size of this PR.)
This PR also tightens up and fixes various aspects of the validation of
`TypeVar` creation, as seen in the tests.
We still (for now) treat all typevars as instances of `typing.TypeVar`,
even if they were created using `typing_extensions.TypeVar`. This means
we'll wrongly error on e.g. `T.__default__` on Python 3.11, even if `T`
is a `typing_extensions.TypeVar` instance at runtime. We share this
wrong behavior with both mypy and pyrefly. It will be easier to fix
after we pull in https://github.com/python/typeshed/pull/14840.
There are some issues that showed up here with typevar identity and
`MarkTypeVarsInferable`; the fix here (using the new `original` field
and `is_identical_to` methods on `BoundTypeVarInstance` and
`TypeVarInstance`) is a bit kludgy, but it can go away when we eliminate
`MarkTypeVarsInferable`.
## Test Plan
Added and updated mdtests.
### Conformance suite impact
The impact here is all positive:
* We now correctly error on a legacy TypeVar with exactly one constraint
type given.
* We now correctly error on a legacy TypeVar with both an upper bound
and constraints specified.
### Ecosystem impact
Basically none; in the setuptools case we just issue slightly different
errors on an invalid TypeVar definition, due to the modified validation
code.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Modify the (external) signature of instance methods such that the first
parameter uses `Self` unless it is explicitly annotated. This allows us
to correctly type-check more code, and allows us to infer correct return
types for many functions that return `Self`. For example:
```py
from pathlib import Path
from datetime import datetime, timedelta
reveal_type(Path(".config") / ".ty") # now Path, previously Unknown
def _(dt: datetime, delta: timedelta):
reveal_type(dt - delta) # now datetime, previously Unknown
```
part of https://github.com/astral-sh/ty/issues/159
## Performance
I ran benchmarks locally on `attrs`, `freqtrade` and `colour`, the
projects with the largest regressions on CodSpeed. I see much smaller
effects locally, but can definitely reproduce the regression on `attrs`.
From looking at the profiling results (on Codspeed), it seems that we
simply do more type inference work, which seems plausible, given that we
now understand much more return types (of many stdlib functions). In
particular, whenever a function uses an implicit `self` and returns
`Self` (without mentioning `Self` anywhere else in its signature), we
will now infer the correct type, whereas we would previously return
`Unknown`. This also means that we need to invoke the generics solver in
more cases. Comparing half a million lines of log output on attrs, I can
see that we do 5% more "work" (number of lines in the log), and have a
lot more `apply_specialization` events (7108 vs 4304). On freqtrade, I
see similar numbers for `apply_specialization` (11360 vs 5138 calls).
Given these results, I'm not sure if it's generally worth doing more
performance work, especially since none of the code modifications
themselves seem to be likely candidates for regressions.
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./ty_main check /home/shark/ecosystem/attrs` | 92.6 ± 3.6 | 85.9 |
102.6 | 1.00 |
| `./ty_self check /home/shark/ecosystem/attrs` | 101.7 ± 3.5 | 96.9 |
113.8 | 1.10 ± 0.06 |
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./ty_main check /home/shark/ecosystem/freqtrade` | 599.0 ± 20.2 |
568.2 | 627.5 | 1.00 |
| `./ty_self check /home/shark/ecosystem/freqtrade` | 607.9 ± 11.5 |
594.9 | 626.4 | 1.01 ± 0.04 |
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./ty_main check /home/shark/ecosystem/colour` | 423.9 ± 17.9 | 394.6
| 447.4 | 1.00 |
| `./ty_self check /home/shark/ecosystem/colour` | 426.9 ± 24.9 | 373.8
| 456.6 | 1.01 ± 0.07 |
## Test Plan
New Markdown tests
## Ecosystem report
* apprise: ~300 new diagnostics related to problematic stubs in apprise
😩
* attrs: a new true positive, since [this
function](4e2c89c823/tests/test_make.py (L2135))
is missing a `@staticmethod`?
* Some legitimate true positives
* sympy: lots of new `invalid-operator` false positives in [matrix
multiplication](cf9f4b6805/sympy/matrices/matrixbase.py (L3267-L3269))
due to our limited understanding of [generic `Callable[[Callable[[T1,
T2], T3]], Callable[[T1, T2], T3]]` "identity"
types](cf9f4b6805/sympy/core/decorators.py (L83-L84))
of decorators. This is not related to type-of-self.
## Typing conformance results
The changes are all correct, except for
```diff
+generics_self_usage.py:50:5: error[invalid-assignment] Object of type `def foo(self) -> int` is not assignable to `(typing.Self, /) -> int`
```
which is related to an assignability problem involving type variables on
both sides:
```py
class CallableAttribute:
def foo(self) -> int:
return 0
bar: Callable[[Self], int] = foo # <- we currently error on this assignment
```
---------
Co-authored-by: Shaygan Hooshyari <sh.hooshyari@gmail.com>
`Type::TypeVar` now distinguishes whether the typevar in question is
inferable or not.
A typevar is _not inferable_ inside the body of the generic class or
function that binds it:
```py
def f[T](t: T) -> T:
return t
```
The infered type of `t` in the function body is `TypeVar(T,
NotInferable)`. This represents how e.g. assignability checks need to be
valid for all possible specializations of the typevar. Most of the
existing assignability/etc logic only applies to non-inferable typevars.
Outside of the function body, the typevar is _inferable_:
```py
f(4)
```
Here, the parameter type of `f` is `TypeVar(T, Inferable)`. This
represents how e.g. assignability doesn't need to hold for _all_
specializations; instead, we need to find the constraints under which
this specific assignability check holds.
This is in support of starting to perform specialization inference _as
part of_ performing the assignability check at the call site.
In the [[POPL2015][]] paper, this concept is called _monomorphic_ /
_polymorphic_, but I thought _non-inferable_ / _inferable_ would be
clearer for us.
Depends on #19784
[POPL2015]: https://doi.org/10.1145/2676726.2676991
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
This is subtle, and the root cause became more apparent with #19604,
since we now have many more cases of superclasses and subclasses using
different typevars. The issue is easiest to see in the following:
```py
class C[T]:
def __init__(self, t: T) -> None: ...
class D[U](C[T]):
pass
reveal_type(C(1)) # revealed: C[int]
reveal_type(D(1)) # should be: D[int]
```
When instantiating a generic class, the `__init__` method inherits the
generic context of that class. This lets our call binding machinery
infer a specialization for that context.
Prior to this PR, the instantiation of `C` worked just fine. Its
`__init__` method would inherit the `[T]` generic context, and we would
infer `{T = int}` as the specialization based on the argument
parameters.
It didn't work for `D`. The issue is that the `__init__` method was
inheriting the generic context of the class where `__init__` was defined
(here, `C` and `[T]`). At the call site, we would then infer `{T = int}`
as the specialization — but that wouldn't help us specialize `D[U]`,
since `D` does not have `T` in its generic context!
Instead, the `__init__` method should inherit the generic context of the
class that we are performing the lookup on (here, `D` and `[U]`). That
lets us correctly infer `{U = int}` as the specialization, which we can
successfully apply to `D[U]`.
(Note that `__init__` refers to `C`'s typevars in its signature, but
that's okay; our member lookup logic already applies the `T = U`
specialization when returning a member of `C` while performing a lookup
on `D`, transforming its signature from `(Self, T) -> None` to `(Self,
U) -> None`.)
Closes https://github.com/astral-sh/ty/issues/588
This PR introduces a few related changes:
- We now keep track of each time a legacy typevar is bound in a
different generic context (e.g. class, function), and internally create
a new `TypeVarInstance` for each usage. This means the rest of the code
can now assume that salsa-equivalent `TypeVarInstance`s refer to the
same typevar, even taking into account that legacy typevars can be used
more than once.
- We also go ahead and track the binding context of PEP 695 typevars.
That's _much_ easier to track since we have the binding context right
there during type inference.
- With that in place, we can now include the name of the binding context
when rendering typevars (e.g. `T@f` instead of `T`)
We now correctly exclude legacy typevars from enclosing scopes when
constructing the generic context for a generic function.
more detail:
A function is generic if it refers to legacy typevars in its signature:
```py
from typing import TypeVar
T = TypeVar("T")
def f(t: T) -> T:
return t
```
Generic functions are allowed to appear inside of other generic
contexts. When they do, they can refer to the typevars of those
enclosing generic contexts, and that should not rebind the typevar:
```py
from typing import TypeVar, Generic
T = TypeVar("T")
U = TypeVar("U")
class C(Generic[T]):
@staticmethod
def method(t: T, u: U) -> None: ...
# revealed: def method(t: int, u: U) -> None
reveal_type(C[int].method)
```
This substitution was already being performed correctly, but we were
also still including the enclosing legacy typevars in the method's own
generic context, which can be seen via `ty_extensions.generic_context`
(which has been updated to work on generic functions and methods):
```py
from ty_extensions import generic_context
# before: tuple[T, U]
# after: tuple[U]
reveal_type(generic_context(C[int].method))
```
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
An issue seen here https://github.com/astral-sh/ty/issues/500
The `__init__` method of dataclasses had no inherited generic context,
so we could not infer the type of an instance from a constructor call
with generics
## Test Plan
Add tests to classes.md` in generics folder
## Summary
It doesn't seem to be necessary for our generics implementation to carry
the `GenericContext` in the `ClassBase` variants. Removing it simplifies
the code, fixes many TODOs about `Generic` or `Protocol` appearing
multiple times in MROs when each should only appear at most once, and
allows us to more accurately detect runtime errors that occur due to
`Generic` or `Protocol` appearing multiple times in a class's bases.
In order to remove the `GenericContext` from the `ClassBase` variant, it
turns out to be necessary to emulate
`typing._GenericAlias.__mro_entries__`, or we end up with a large number
of false-positive `inconsistent-mro` errors. This PR therefore also does
that.
Lastly, this PR fixes the inferred MROs of PEP-695 generic classes,
which implicitly inherit from `Generic` even if they have no explicit
bases.
## Test Plan
mdtests
This implements the stopgap approach described in
https://github.com/astral-sh/ty/issues/336#issuecomment-2880532213 for
handling literal types in generic class specializations.
With this approach, we will promote any literal to its instance type,
but _only_ when inferring a generic class specialization from a
constructor call:
```py
class C[T]:
def __init__(self, x: T) -> None: ...
reveal_type(C("string")) # revealed: C[str]
```
If you specialize the class explicitly, we still use whatever type you
provide, even if it's a literal:
```py
from typing import Literal
reveal_type(C[Literal[5]](5)) # revealed: C[Literal[5]]
```
And this doesn't apply at all to generic functions:
```py
def f[T](x: T) -> T:
return x
reveal_type(f(5)) # revealed: Literal[5]
```
---
As part of making this happen, we also generalize the `TypeMapping`
machinery. This provides a way to apply a function to type, returning a
new type. Complicating matters is that for function literals, we have to
apply the mapping lazily, since the function's signature is not created
until (and if) someone calls its `signature` method. That means we have
to stash away the mappings that we want to apply to the signatures
parameter/return annotations once we do create it. This requires some
minor `Cow` shenanigans to continue working for partial specializations.
This is a follow-on to #18155. For the example raised in
https://github.com/astral-sh/ty/issues/370:
```py
import tempfile
with tempfile.TemporaryDirectory() as tmp: ...
```
the new logic would notice that both overloads of `TemporaryDirectory`
match, and combine their specializations, resulting in an inferred type
of `str | bytes`.
This PR updates the logic to match our other handling of other calls,
where we only keep the _first_ matching overload. The result for this
example then becomes `str`, matching the runtime behavior. (We still do
not implement the full [overload resolution
algorithm](https://typing.python.org/en/latest/spec/overload.html#overload-call-evaluation)
from the spec.)
This primarily comes up with annotated `self` parameters in
constructors:
```py
class C[T]:
def __init__(self: C[int]): ...
```
Here, we want infer a specialization of `{T = int}` for a call that hits
this overload.
Normally when inferring a specialization of a function call, typevars
appear in the parameter annotations, and not in the argument types. In
this case, this is reversed: we need to verify that the `self` argument
(`C[T]`, as we have not yet completed specialization inference) is
assignable to the parameter type `C[int]`.
To do this, we simply look for a typevar/type in both directions when
performing inference, and apply the inferred specialization to argument
types as well as parameter types before verifying assignability.
As a wrinkle, this exposed that we were not checking
subtyping/assignability for function literals correctly. Our function
literal representation includes an optional specialization that should
be applied to the signature. Before, function literals were considered
subtypes of (assignable to) each other only if they were identical Salsa
objects. Two function literals with different specializations should
still be considered subtypes of (assignable to) each other if those
specializations result in the same function signature (typically because
the function doesn't use the typevars in the specialization).
Closes https://github.com/astral-sh/ty/issues/370
Closes https://github.com/astral-sh/ty/issues/100
Closes https://github.com/astral-sh/ty/issues/258
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Function literals have an optional specialization, which is applied to
the parameter/return type annotations lazily when the function's
signature is requested. We were previously only applying this
specialization to the final overload of an overloaded function.
This manifested most visibly for `list.__add__`, which has an overloaded
definition in the typeshed:
b398b83631/crates/ty_vendored/vendor/typeshed/stdlib/builtins.pyi (L1069-L1072)
Closes https://github.com/astral-sh/ty/issues/314
@AlexWaygood discovered that even though we've been propagating
specializations to _parent_ base classes correctly, we haven't been
passing them on to _grandparent_ base classes:
https://github.com/astral-sh/ruff/pull/17832#issuecomment-2854360969
```py
class Bar[T]:
x: T
class Baz[T](Bar[T]): ...
class Spam[T](Baz[T]): ...
reveal_type(Spam[int]().x) # revealed: `T`, but should be `int`
```
This PR updates the MRO machinery to apply the current specialization
when starting to iterate the MRO of each base class.
If a typevar is declared as having a default, we shouldn't require a
type to be specified for that typevar when explicitly specializing a
generic class:
```py
class WithDefault[T, U = int]: ...
reveal_type(WithDefault[str]()) # revealed: WithDefault[str, int]
```
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Fixes
https://github.com/astral-sh/ruff/pull/17832#issuecomment-2851224968. We
had a comment that we did not need to apply specializations to generic
aliases, or to the bound `self` of a bound method, because they were
already specialized. But they might be specialized with a type variable,
which _does_ need to be specialized, in the case of a "multi-step"
specialization, such as:
```py
class LinkedList[T]: ...
class C[U]:
def method(self) -> LinkedList[U]:
return LinkedList[U]()
```
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>