
## Summary

For PEP 695 generic functions and classes, there is an extra "type params scope" (a child of the outer scope, and wrapping the body scope) in which the type parameters are defined; class bases and function parameter/return annotations are resolved in that type-params scope.

This PR fixes some longstanding bugs in how we resolve name loads from inside these PEP 695 type parameter scopes, and also defers type inference of PEP 695 typevar bounds/constraints/default, so we can handle cycles without panicking.

We were previously treating these type-param scopes as lazy nested scopes, which is wrong. In fact they are eager nested scopes; the class `C` here inherits `int`, not `str`, and previously we got that wrong:

```py
Base = int

class C[T](Base): ...

Base = str
```

But certain syntactic positions within type param scopes (typevar bounds/constraints/defaults) are lazy at runtime, and we should use deferred name resolution for them. This also means they can have cycles; in order to handle that without panicking in type inference, we need to actually defer their type inference until after we have constructed the `TypeVarInstance`.

PEP 695 does specify that typevar bounds and constraints cannot be generic, and that typevar defaults can only reference prior typevars, not later ones. This reduces the scope of (valid from the type-system perspective) cycles somewhat, although cycles are still possible (e.g. `class C[T: list[C]]`). And this is a type-system-only restriction; from the runtime perspective an "invalid" case like `class C[T: T]` actually works fine.

I debated whether to implement the PEP 695 restrictions as a way to avoid some cycles up-front, but I ended up deciding against that; I'd rather model the runtime name-resolution semantics accurately, and implement the PEP 695 restrictions as a separate diagnostic on top. (This PR doesn't yet implement those diagnostics, thus some `# TODO: error` in the added tests.)

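A minimal runtime sketch of the behavior described above, assuming Python 3.12 or later. The source is passed to `exec` so that older interpreters skip it instead of failing to parse. The names `Cyclic`, `SelfBound` and the helper `run_demo` are illustrative and do not come from this PR.

```python
import sys

# Hypothetical demo source; it needs PEP 695 syntax (Python 3.12+).
SOURCE = """
Base = int

class C[T](Base): ...

Base = str

class Cyclic[T: list[Cyclic]]: ...

class SelfBound[T: T]: ...
"""


def run_demo():
    """Return observations about runtime name resolution, or None before 3.12."""
    if sys.version_info < (3, 12):
        return None
    ns = {"__name__": "demo"}
    exec(SOURCE, ns)
    (cyclic_t,) = ns["Cyclic"].__type_params__
    (self_t,) = ns["SelfBound"].__type_params__
    return {
        # Bases are resolved eagerly, so C inherits int, not str.
        "c_inherits_int": issubclass(ns["C"], int) and not issubclass(ns["C"], str),
        # Bounds are evaluated lazily, so they may refer to the class itself
        "cyclic_bound": cyclic_t.__bound__ == list[ns["Cyclic"]],
        # or even to the typevar that is being defined.
        "self_bound": self_t.__bound__ is self_t,
    }


print(run_demo())
```

This only illustrates the runtime semantics that the checker is meant to model; it says nothing about how the inference itself is implemented.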
Introducing the possibility of cyclic typevars made typevar display potentially stack overflow. For now I've handled this by simply removing typevar details (bounds/constraints/default) from typevar display. This impacts display of two kinds of types.

If you `reveal_type(T)` on an unbound `T` you now get just `typing.TypeVar` instead of `typing.TypeVar("T", ...)` where `...` is the bound/constraints/default. This matches pyright and mypy; pyrefly uses `type[TypeVar[T]]` which seems a bit confusing, but does include the name. (We could easily include the name without cycle issues, if there's a syntax we like for that.)

It also means that displaying a generic function type like `def f[T: int](x: T) -> T: ...` now displays as `f[T](x: T) -> T` instead of `f[T: int](x: T) -> T`. This matches pyright and pyrefly; mypy does include bound/constraints/defaults of typevars in function/callable type display. If we wanted to add this, we would either need to thread a visitor through all the type display code, or add a `decycle` type transformation that replaced recursive reoccurrence of a type with a marker.

## Test Plan

Added mdtests and modified existing tests to improve their correctness.

After this PR, there's only a single remaining py-fuzzer seed in the 0-500 range that panics! (Before this PR, there were 10; the fuzzer likes to generate cyclic PEP 695 syntax.)

## Ecosystem report

It's all just the changes to `TypeVar` display.

# Eager scopes

Some scopes are executed eagerly: references to variables defined in enclosing scopes are resolved immediately. This is in contrast to (for instance) function scopes, where those references are resolved when the function is called.

## Function definitions

Function definitions are evaluated lazily.

```py
x = 1

def f():
    reveal_type(x)  # revealed: Unknown | Literal[1, 2]

x = 2
```

## Class definitions

Class definitions are evaluated eagerly.

```py
def _():
    x = 1

    class A:
        reveal_type(x)  # revealed: Literal[1]

        y = x

    x = 2

    reveal_type(A.y)  # revealed: Unknown | Literal[1]
```

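The difference between the lazy function body and the eager class body can also be observed at runtime. The following is a small sketch with illustrative names, not part of the test suite:

```python
x = 1


def f():
    # Function bodies are lazy: `x` is looked up when f() is called.
    return x


class A:
    # Class bodies are eager: `x` is looked up while the class statement runs.
    y = x


x = 2

print(f(), A.y)  # prints "2 1"
```

The function sees the later binding, while the class attribute keeps the value that was current when the class body executed.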
## List comprehensions

List comprehensions are evaluated eagerly.

```py
def _():
    x = 1

    # revealed: Literal[1]
    [reveal_type(x) for a in range(1)]

    x = 2
```

## Set comprehensions

Set comprehensions are evaluated eagerly.

```py
def _():
    x = 1

    # revealed: Literal[1]
    {reveal_type(x) for a in range(1)}

    x = 2
```

## Dict comprehensions

Dict comprehensions are evaluated eagerly.

```py
def _():
    x = 1

    # revealed: Literal[1]
    {a: reveal_type(x) for a in range(1)}

    x = 2
```

## Generator expressions

Generator expressions don't necessarily run eagerly, but in practice they usually do, so assuming that they do is the better default.

```py
def _():
    x = 1

    # revealed: Literal[1]
    list(reveal_type(x) for a in range(1))

    x = 2
```

But that does lead to incorrect results when the generator expression isn't run immediately:

```py
def evaluated_later():
    x = 1

    # revealed: Literal[1]
    y = (reveal_type(x) for a in range(1))

    x = 2

    # The generator isn't evaluated until here, so at runtime, `x` will evaluate to 2, contradicting
    # our inferred type.
    print(next(y))
```

Though note that “the iterable expression in the leftmost `for` clause is immediately evaluated” [spec]:

```py
def iterable_evaluated_eagerly():
    x = 1

    # revealed: Literal[1]
    y = (a for a in [reveal_type(x)])

    x = 2

    # Even though the generator isn't evaluated until here, the first iterable was evaluated
    # immediately, so our inferred type is correct.
    print(next(y))
```

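Both generator cases can be checked at runtime. This sketch mirrors the two functions above but returns the observed value instead of calling `reveal_type`; the list-valued `x` in the second function is a simplification chosen for the demo.

```python
def evaluated_later():
    x = 1
    gen = (x for _ in range(1))
    x = 2
    # The element expression runs only now, so it sees the rebinding.
    return next(gen)


def iterable_evaluated_eagerly():
    x = [1]
    # The leftmost iterable is evaluated when the generator is created.
    gen = (a for a in x)
    x = [2]
    return next(gen)


print(evaluated_later(), iterable_evaluated_eagerly())  # prints "2 1"
```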
## Top-level eager scopes

All of the above examples behave identically when the eager scopes are directly nested in the global scope.

### Class definitions

```py
x = 1

class A:
    reveal_type(x)  # revealed: Literal[1]

    y = x

x = 2

reveal_type(A.y)  # revealed: Unknown | Literal[1]
```

### List comprehensions

```py
x = 1

# revealed: Literal[1]
[reveal_type(x) for a in range(1)]

x = 2

# error: [unresolved-reference]
[y for a in range(1)]
y = 1
```

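The `unresolved-reference` diagnostic corresponds to a `NameError` at runtime, because the comprehension runs before `y` is bound. A minimal sketch follows; the helper `raises_name_error` is hypothetical and uses `exec` to run each snippet as its own module body.

```python
def raises_name_error(source):
    """Run `source` as a fresh module body and report whether it raised NameError."""
    try:
        exec(source, {"__name__": "demo"})
    except NameError:
        return True
    return False


used_before_binding = "[y for a in range(1)]\ny = 1\n"
used_after_binding = "y = 1\n[y for a in range(1)]\n"

print(raises_name_error(used_before_binding))  # prints "True"
print(raises_name_error(used_after_binding))  # prints "False"
```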
### Set comprehensions

```py
x = 1

# revealed: Literal[1]
{reveal_type(x) for a in range(1)}

x = 2

# error: [unresolved-reference]
{y for a in range(1)}
y = 1
```

### Dict comprehensions

```py
x = 1

# revealed: Literal[1]
{a: reveal_type(x) for a in range(1)}

x = 2

# error: [unresolved-reference]
{a: y for a in range(1)}
y = 1
```

### Generator expressions

```py
x = 1

# revealed: Literal[1]
list(reveal_type(x) for a in range(1))

x = 2

# error: [unresolved-reference]
list(y for a in range(1))
y = 1
```

`evaluated_later.py`:

```py
x = 1

# revealed: Literal[1]
y = (reveal_type(x) for a in range(1))

x = 2

# The generator isn't evaluated until here, so at runtime, `x` will evaluate to 2, contradicting
# our inferred type.
print(next(y))
```

`iterable_evaluated_eagerly.py`:

```py
x = 1

# revealed: Literal[1]
y = (a for a in [reveal_type(x)])

x = 2

# Even though the generator isn't evaluated until here, the first iterable was evaluated
# immediately, so our inferred type is correct.
print(next(y))
```

## Lazy scopes are "sticky"

As we look through each enclosing scope when resolving a reference, lookups become lazy as soon as we encounter any lazy scope, even if there are other eager scopes that enclose it.

### Eager scope within eager scope

If we don't encounter a lazy scope, lookup remains eager. The resolved binding is not necessarily in the immediately enclosing scope. Here, the list comprehension and class definition are both eager scopes, and we immediately resolve the use of `x` to (only) the `x = 1` binding.

```py
def _():
    x = 1

    class A:
        # revealed: Literal[1]
        [reveal_type(x) for a in range(1)]

    x = 2
```

### Class definition bindings are not visible in nested scopes

Class definitions are eager scopes, but any bindings in them are explicitly not visible to any nested scopes. (Those nested scopes are typically (lazy) function definitions, but the rule also applies to nested eager scopes like comprehensions and other class definitions.)

```py
def _():
    x = 1

    class A:
        x = 4

        # revealed: Literal[1]
        [reveal_type(x) for a in range(1)]

        class B:
            # revealed: Literal[1]
            [reveal_type(x) for a in range(1)]

    x = 2

x = 1

def _():
    class C:
        # revealed: Unknown | Literal[1]
        [reveal_type(x) for _ in [1]]
        x = 2
```

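At runtime the same rule shows up as follows. This is a sketch with illustrative attribute names (`direct`, `seen`): a direct use in the class body sees the class-level binding, while a nested comprehension skips it.

```python
def outer():
    x = 1

    class A:
        x = 4
        # A direct use in the class body sees the class-level binding.
        direct = x
        # The comprehension is a nested scope: it skips A's `x` and
        # resolves to the binding in `outer`, which is still 1 here.
        seen = [x for _ in range(1)]

    x = 2
    return A.direct, A.seen


print(outer())  # prints "(4, [1])"
```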
### Eager scope within a lazy scope

The list comprehension is an eager scope, and it is enclosed within a function definition, which is a lazy scope. Because we pass through this lazy scope before encountering any bindings or definitions, the lookup is lazy.

```py
def _():
    x = 1

    def f():
        # revealed: Literal[1, 2]
        [reveal_type(x) for a in range(1)]

    x = 2
```

### Lazy scope within an eager scope

The function definition is a lazy scope, and it is enclosed within a class definition, which is an eager scope. Even though we pass through an eager scope before encountering any bindings or definitions, the lookup remains lazy.

```py
def _():
    x = 1

    class A:
        def f():
            # revealed: Literal[1, 2]
            reveal_type(x)

    x = 2
```

### Lazy scope within a lazy scope

No matter how many lazy scopes we pass through before encountering a binding or definition, the lookup remains lazy.

```py
def _():
    x = 1

    def f():
        def g():
            # revealed: Literal[1, 2]
            reveal_type(x)

    x = 2
```

### Eager scope within a lazy scope within another eager scope

We have a list comprehension (eager scope), enclosed within a function definition (lazy scope), enclosed within a class definition (eager scope), all of which we must pass through before encountering any binding of `x`. Even though the last scope we pass through is eager, the lookup is lazy, since we encountered a lazy scope on the way.

```py
def _():
    x = 1

    class A:
        def f():
            # revealed: Literal[1, 2]
            [reveal_type(x) for a in range(1)]

    x = 2
```

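A runtime sketch of why the lazy scope "wins": the comprehension itself is eager, but it only starts running when the enclosing function is called, by which point `x` has been rebound. The return value is added for the demo.

```python
def outer():
    x = 1

    class A:
        def f():
            # The comprehension is eager, but it runs only when f() is called.
            return [x for _ in range(1)]

    x = 2
    return A.f()


print(outer())  # prints "[2]"
```

A checker cannot know when `f` will be called, which is why both bindings appear in the revealed type above.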
## Annotations

Type annotations are sometimes deferred. When they are, the types that are referenced in an annotation are looked up lazily, even if they occur in an eager scope.

### Eager annotations in a Python file

```py
from typing import ClassVar

x = int

class C:
    var: ClassVar[x]

reveal_type(C.var)  # revealed: int

x = str
```

### Deferred annotations in a Python file

```py
from __future__ import annotations

from typing import ClassVar

x = int

class C:
    var: ClassVar[x]

reveal_type(C.var)  # revealed: Unknown | int | str

x = str
```

### Deferred annotations in a stub file

```pyi
from typing import ClassVar

x = int

class C:
    var: ClassVar[x]

# TODO: should ideally be `str`, but we currently consider all reachable bindings
reveal_type(C.var)  # revealed: int | str

x = str
```

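To illustrate what "deferred" means at runtime, here is a simplified sketch that uses `from __future__ import annotations`. With that import the annotation is stored as a string and only resolved when something evaluates it later. The annotation is reduced to a bare `x` (no `ClassVar`) to keep the demo short, and the source is run through `exec` because the future import must appear at the top of its module.

```python
# Hypothetical module source; the future import makes annotations strings.
SOURCE = """
from __future__ import annotations

x = int

class C:
    var: x

x = str
"""

ns = {"__name__": "demo"}
exec(SOURCE, ns)

# The annotation was not evaluated when the class body ran.
raw = ns["C"].__annotations__["var"]
# Resolving it afterwards sees the final binding of `x`.
resolved = eval(raw, ns)

print(repr(raw), resolved)  # prints "'x' <class 'str'>"
```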
## Annotation scopes

```toml
[environment]
python-version = "3.12"
```

### Type alias annotation scopes are lazy

```py
type Foo = Bar

class Bar:
    pass

def _(x: Foo):
    if isinstance(x, Bar):
        reveal_type(x)  # revealed: Bar
    else:
        reveal_type(x)  # revealed: Never
```

### Type-param scopes are eager, but bounds/constraints are deferred

```py
# error: [unresolved-reference]
class D[T](Bar):
    pass

class E[T: Bar]:
    pass

# error: [unresolved-reference]
def g[T](x: Bar):
    pass

def h[T: Bar](x: T):
    pass

class Bar:
    pass
```
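
The runtime counterpart of these expectations can be sketched as follows, assuming Python 3.12 or later. The helper `run` and the two source constants are hypothetical names introduced for the demo; the sources are executed with `exec` so that older interpreters never try to parse the PEP 695 syntax.

```python
import sys

# A base class is resolved eagerly, so using `Bar` before it exists fails.
BASE_BEFORE_DEFINITION = """
class D[T](Bar):
    pass

class Bar:
    pass
"""

# A bound is resolved lazily, on first access to `__bound__`.
BOUND_BEFORE_DEFINITION = """
class E[T: Bar]:
    pass

class Bar:
    pass

(T,) = E.__type_params__
bound_is_bar = T.__bound__ is Bar
"""


def run(source):
    """Execute `source` as a module body; return its namespace or the NameError raised."""
    ns = {"__name__": "demo"}
    try:
        exec(source, ns)
    except NameError as exc:
        return exc
    return ns


if sys.version_info >= (3, 12):
    print(type(run(BASE_BEFORE_DEFINITION)).__name__)
    print(run(BOUND_BEFORE_DEFINITION)["bound_is_bar"])
```

On a 3.12+ interpreter the first snippet fails with a `NameError` while the second one succeeds, which is the split between eager type-param scopes and deferred bounds that the test above encodes.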