mirror of https://github.com/astral-sh/ruff.git synced 2025-10-09 10:00:25 +00:00

[ty] fix deferred name loading in PEP695 generic classes/functions (#19888)

## Summary

For PEP 695 generic functions and classes, there is an extra "type
params scope" (a child of the outer scope, and wrapping the body scope)
in which the type parameters are defined; class bases and function
parameter/return annotations are resolved in that type-params scope.

This PR fixes some longstanding bugs in how we resolve name loads from
inside these PEP 695 type parameter scopes, and also defers type
inference of PEP 695 typevar bounds/constraints/defaults, so we can
handle cycles without panicking.

We were previously treating these type-param scopes as lazy nested
scopes, which is wrong. In fact they are eager nested scopes; the class
`C` here inherits `int`, not `str`, and previously we got that wrong:

```py
Base = int

class C[T](Base): ...

Base = str
```

But certain syntactic positions within type param scopes (typevar
bounds/constraints/defaults) are lazy at runtime, and we should use
deferred name resolution for them. This also means they can have cycles;
in order to handle that without panicking in type inference, we need to
actually defer their type inference until after we have constructed the
`TypeVarInstance`.

PEP 695 does specify that typevar bounds and constraints cannot be
generic, and that typevar defaults can only reference prior typevars,
not later ones. This reduces the scope of (valid from the type-system
perspective) cycles somewhat, although cycles are still possible (e.g.
`class C[T: list[C]]`). And this is a type-system-only restriction; from
the runtime perspective an "invalid" case like `class C[T: T]` actually
works fine.

I debated whether to implement the PEP 695 restrictions as a way to
avoid some cycles up-front, but I ended up deciding against that; I'd
rather model the runtime name-resolution semantics accurately, and
implement the PEP 695 restrictions as a separate diagnostic on top.
(This PR doesn't yet implement those diagnostics, thus some `# TODO:
error` in the added tests.)

Introducing the possibility of cyclic typevars meant that typevar display
could overflow the stack. For now I've handled this by simply removing
typevar details (bounds/constraints/defaults) from typevar display. This
impacts display of two kinds of types. If you `reveal_type(T)` on an
unbound `T` you now get just `typing.TypeVar` instead of
`typing.TypeVar("T", ...)` where `...` is the bounds/constraints/defaults.
This matches pyright and mypy; pyrefly uses `type[TypeVar[T]]` which
seems a bit confusing, but does include the name. (We could easily
include the name without cycle issues, if there's a syntax we like for
that.)

It also means that displaying a generic function type like `def f[T:
int](x: T) -> T: ...` now displays as `f[T](x: T) -> T` instead of `f[T:
int](x: T) -> T`. This matches pyright and pyrefly; mypy does include
bounds/constraints/defaults of typevars in function/callable type
display. If we wanted to add this, we would either need to thread a
visitor through all the type display code, or add a `decycle` type
transformation that replaces each recursive occurrence of a type with a
marker.

## Test Plan

Added mdtests and modified existing tests to improve their correctness.

After this PR, there's only a single remaining py-fuzzer seed in the
0-500 range that panics! (Before this PR, there were 10; the fuzzer
likes to generate cyclic PEP 695 syntax.)

## Ecosystem report

It's all just the changes to `TypeVar` display.

2025-08-13 15:51:59 -07:00


Eager scopes

Some scopes are executed eagerly: references to variables defined in enclosing scopes are resolved immediately. This is in contrast to (for instance) function scopes, where those references are resolved when the function is called.

Function definitions

Function definitions are evaluated lazily.

```py
x = 1

def f():
    reveal_type(x)  # revealed: Unknown | Literal[1, 2]

x = 2
```
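
The inferred union above mirrors what happens at runtime. As a minimal sketch in plain Python (not an mdtest case; `before` and `after` are illustrative names introduced here), the same function returns a different value once the global is rebound:

```python
# The body of `f` looks up `x` each time it is called,
# not at the point where `f` is defined.
x = 1

def f():
    return x

before = f()  # `x` is still 1 at this call

x = 2

after = f()  # the same function now sees the rebound `x`
```

Because a call can happen at any point after the definition, a checker has to consider every binding of `x` that could be visible by then.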

Class definitions

Class definitions are evaluated eagerly.

```py
def _():
    x = 1

    class A:
        reveal_type(x)  # revealed: Literal[1]

        y = x

    x = 2

    reveal_type(A.y)  # revealed: Unknown | Literal[1]
```
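
For comparison, a small runtime sketch (plain Python; `make_class` is a hypothetical helper name used only for this illustration) shows that a class body runs at the point of the `class` statement, so a later rebinding of `x` does not change the attribute:

```python
def make_class():
    x = 1

    class A:
        # The class body executes right here, while `x` is still 1.
        y = x

    x = 2  # rebinding after the class statement does not affect `A.y`
    return A

A = make_class()
```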

List comprehensions

List comprehensions are evaluated eagerly.

```py
def _():
    x = 1

    # revealed: Literal[1]
    [reveal_type(x) for a in range(1)]

    x = 2
```

Set comprehensions

Set comprehensions are evaluated eagerly.

```py
def _():
    x = 1

    # revealed: Literal[1]
    {reveal_type(x) for a in range(1)}

    x = 2
```

Dict comprehensions

Dict comprehensions are evaluated eagerly.

```py
def _():
    x = 1

    # revealed: Literal[1]
    {a: reveal_type(x) for a in range(1)}

    x = 2
```

Generator expressions

Generator expressions don't necessarily run eagerly, but in practice they usually do, so assuming eager evaluation is the better default.

```py
def _():
    x = 1

    # revealed: Literal[1]
    list(reveal_type(x) for a in range(1))

    x = 2
```

But that does lead to incorrect results when the generator expression isn't run immediately:

```py
def evaluated_later():
    x = 1

    # revealed: Literal[1]
    y = (reveal_type(x) for a in range(1))

    x = 2

    # The generator isn't evaluated until here, so at runtime, `x` will evaluate to 2, contradicting
    # our inferred type.
    print(next(y))
```
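
The mismatch described in the comment can be observed directly. This is a runtime-only sketch with `reveal_type` removed; the name `delayed_value` is introduced here for illustration:

```python
def evaluated_later():
    x = 1
    # Only the iterable `range(1)` is evaluated here; the element
    # expression `x` is not evaluated yet.
    y = (x for a in range(1))
    x = 2
    # The element expression runs on the first `next()` call,
    # which happens after the rebinding.
    return next(y)

delayed_value = evaluated_later()
```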

Though note that “the iterable expression in the leftmost for clause is immediately evaluated” [spec]:

```py
def iterable_evaluated_eagerly():
    x = 1

    # revealed: Literal[1]
    y = (a for a in [reveal_type(x)])

    x = 2

    # Even though the generator isn't evaluated until here, the first iterable was evaluated
    # immediately, so our inferred type is correct.
    print(next(y))
```
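
A runtime sketch of the same situation (plain Python; `first_item` is an illustrative name) suggests why the eager assumption is sound for the leftmost iterable:

```python
def iterable_evaluated_eagerly():
    x = 1
    # `[x]` is the leftmost iterable, so it is built immediately as `[1]`.
    y = (a for a in [x])
    x = 2
    # The generator body runs here, but it iterates over the list built earlier.
    return next(y)

first_item = iterable_evaluated_eagerly()
```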

Top-level eager scopes

All of the above examples behave identically when the eager scopes are directly nested in the global scope.

Class definitions

```py
x = 1

class A:
    reveal_type(x)  # revealed: Literal[1]

    y = x

x = 2

reveal_type(A.y)  # revealed: Unknown | Literal[1]
```

List comprehensions

```py
x = 1

# revealed: Literal[1]
[reveal_type(x) for a in range(1)]

x = 2

# error: [unresolved-reference]
[y for a in range(1)]
y = 1
```
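
The `unresolved-reference` diagnostic corresponds to a `NameError` at runtime. The sketch below is one way to observe that without crashing the surrounding script: it runs the two statements through `exec` in a fresh namespace. The names `source`, `namespace`, and `raised` are assumptions made for this illustration.

```python
# Module-level code in which the comprehension runs before `y` is bound.
source = """
[y for a in range(1)]
y = 1
"""

namespace = {}
try:
    exec(source, namespace)
    raised = False
except NameError:
    # The comprehension ran eagerly, before `y = 1` was executed.
    raised = True
```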

Set comprehensions

```py
x = 1

# revealed: Literal[1]
{reveal_type(x) for a in range(1)}

x = 2

# error: [unresolved-reference]
{y for a in range(1)}
y = 1
```

Dict comprehensions

```py
x = 1

# revealed: Literal[1]
{a: reveal_type(x) for a in range(1)}

x = 2

# error: [unresolved-reference]
{a: y for a in range(1)}
y = 1
```

Generator expressions

```py
x = 1

# revealed: Literal[1]
list(reveal_type(x) for a in range(1))

x = 2

# error: [unresolved-reference]
list(y for a in range(1))
y = 1
```

`evaluated_later.py`:

```py
x = 1

# revealed: Literal[1]
y = (reveal_type(x) for a in range(1))

x = 2

# The generator isn't evaluated until here, so at runtime, `x` will evaluate to 2, contradicting
# our inferred type.
print(next(y))
```

`iterable_evaluated_eagerly.py`:

```py
x = 1

# revealed: Literal[1]
y = (a for a in [reveal_type(x)])

x = 2

# Even though the generator isn't evaluated until here, the first iterable was evaluated
# immediately, so our inferred type is correct.
print(next(y))
```

Lazy scopes are "sticky"

As we look through each enclosing scope when resolving a reference, lookups become lazy as soon as we encounter any lazy scope, even if there are other eager scopes that enclose it.

Eager scope within eager scope

If we don't encounter a lazy scope, lookup remains eager. The resolved binding is not necessarily in the immediately enclosing scope. Here, the list comprehension and class definition are both eager scopes, and we immediately resolve the use of `x` to (only) the `x = 1` binding.

```py
def _():
    x = 1

    class A:
        # revealed: Literal[1]
        [reveal_type(x) for a in range(1)]

    x = 2
```

Class definition bindings are not visible in nested scopes

Class definitions are eager scopes, but any bindings in them are explicitly not visible to any nested scopes. (Those nested scopes are typically (lazy) function definitions, but the rule also applies to nested eager scopes like comprehensions and other class definitions.)

```py
def _():
    x = 1

    class A:
        x = 4

        # revealed: Literal[1]
        [reveal_type(x) for a in range(1)]

        class B:
            # revealed: Literal[1]
            [reveal_type(x) for a in range(1)]

    x = 2
```

```py
x = 1

def _():
    class C:
        # revealed: Unknown | Literal[1]
        [reveal_type(x) for _ in [1]]
        x = 2
```
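
The visibility rule matches Python's runtime behavior. In this plain-Python sketch (`outer` and `seen` are names introduced for illustration), the element expression of the comprehension skips the class-level `x` and reads the enclosing function's `x` instead:

```python
def outer():
    x = 1

    class A:
        x = 4  # class-level binding, not visible inside the comprehension
        # The element expression `x` resolves to `outer`'s `x`, which is 1
        # at the time the class body runs.
        seen = [x for _ in range(1)]

    x = 2
    return A

A = outer()
```

Note that only the element expression is affected; the leftmost iterable of a comprehension is evaluated in the class scope itself.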

Eager scope within a lazy scope

The list comprehension is an eager scope, and it is enclosed within a function definition, which is a lazy scope. Because we pass through this lazy scope before encountering any bindings or definitions, the lookup is lazy.

```py
def _():
    x = 1

    def f():
        # revealed: Literal[1, 2]
        [reveal_type(x) for a in range(1)]
    x = 2
```

Lazy scope within an eager scope

The function definition is a lazy scope, and it is enclosed within a class definition, which is an eager scope. Even though we pass through an eager scope before encountering any bindings or definitions, the lookup remains lazy.

```py
def _():
    x = 1

    class A:
        def f():
            # revealed: Literal[1, 2]
            reveal_type(x)

    x = 2
```
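
A runtime sketch of the same nesting (plain Python; `outer` and `value` are illustrative names, and `f` takes `self` here so that it can be called on an instance) shows the method reading the rebound value even though the enclosing class body ran earlier:

```python
def outer():
    x = 1

    class A:
        def f(self):
            # `x` is looked up when `f` is called, not when the class body runs.
            return x

    x = 2
    return A().f()

value = outer()
```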

Lazy scope within a lazy scope

No matter how many lazy scopes we pass through before encountering a binding or definition, the lookup remains lazy.

```py
def _():
    x = 1

    def f():
        def g():
            # revealed: Literal[1, 2]
            reveal_type(x)
    x = 2
```

Eager scope within a lazy scope within another eager scope

We have a list comprehension (eager scope), enclosed within a function definition (lazy scope), enclosed within a class definition (eager scope), all of which we must pass through before encountering any binding of `x`. Even though the last scope we pass through is eager, the lookup is lazy, since we encountered a lazy scope on the way.

```py
def _():
    x = 1

    class A:
        def f():
            # revealed: Literal[1, 2]
            [reveal_type(x) for a in range(1)]

    x = 2
```

Annotations

Type annotations are sometimes deferred. When they are, the types that are referenced in an annotation are looked up lazily, even if they occur in an eager scope.

Eager annotations in a Python file

```py
from typing import ClassVar

x = int

class C:
    var: ClassVar[x]

reveal_type(C.var)  # revealed: int

x = str
```

Deferred annotations in a Python file

```py
from __future__ import annotations

from typing import ClassVar

x = int

class C:
    var: ClassVar[x]

reveal_type(C.var)  # revealed: Unknown | int | str

x = str
```
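
At runtime, `from __future__ import annotations` stores annotations as strings, so the name `x` is only looked up when something evaluates them. The sketch below is an illustration under stated assumptions: it runs the example through `exec` so that the `__future__` import can sit at the top of its own module source, and `source`, `namespace`, `raw`, and `resolved` are names introduced here.

```python
import typing

source = """
from __future__ import annotations

from typing import ClassVar

x = int

class C:
    var: ClassVar[x]

x = str
"""

# Run the example as its own module-like namespace.
namespace = {"__name__": "deferred_demo"}
exec(source, namespace)
C = namespace["C"]

# The annotation was never evaluated; it is stored as a string.
raw = C.__annotations__["var"]

# Evaluating it now resolves `x` against the final state of the namespace.
hints = typing.get_type_hints(C, globalns=namespace)
resolved = typing.get_args(hints["var"])
```

Here `raw` is the string `"ClassVar[x]"` and `resolved` is `(str,)`, since `x` had already been rebound by the time the annotation was evaluated.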

Deferred annotations in a stub file

```pyi
from typing import ClassVar

x = int

class C:
    var: ClassVar[x]

# TODO: should ideally be `str`, but we currently consider all reachable bindings
reveal_type(C.var)  # revealed: int | str

x = str
```

Annotation scopes

```toml
[environment]
python-version = "3.12"
```

Type alias annotation scopes are lazy

```py
type Foo = Bar

class Bar:
    pass

def _(x: Foo):
    if isinstance(x, Bar):
        reveal_type(x)  # revealed: Bar
    else:
        reveal_type(x)  # revealed: Never
```

Type-param scopes are eager, but bounds/constraints are deferred

```py
# error: [unresolved-reference]
class D[T](Bar):
    pass

class E[T: Bar]:
    pass

# error: [unresolved-reference]
def g[T](x: Bar):
    pass

def h[T: Bar](x: T):
    pass

class Bar:
    pass
```
