ruff/crates/ty_python_semantic/resources/mdtest/generics/pep695/variance.md
Carl Meyer 62975b3ab2
[ty] eliminate is_fully_static (#18799)
## Summary

Having a recursive type method to check whether a type is fully static
is inefficient, unnecessary, and makes us overly strict about subtyping
relations.

It's inefficient because we end up re-walking the same types many times
to check for fully-static-ness.

It's unnecessary because we can check relations involving the dynamic
type appropriately, depending whether the relation is subtyping or
assignability.

We use the subtyping relation to simplify unions and intersections. We
can usefully consider that `S <: T` for gradual types also, as long as
it remains true that `S | T` is equivalent to `T` and `S & T` is
equivalent to `S`.

One conservative definition (implemented here) that satisfies this
requirement is that we consider `S <: T` if, for every possible pair of
materializations `S'` and `T'`, `S' <: T'`. Or put differently the top
materialization of `S` (`S+` -- the union of all possible
materializations of `S`) is a subtype of the bottom materialization of
`T` (`T-` -- the intersection of all possible materializations of `T`).
In the most basic cases we can usefully say that `Any <: object` and
that `Never <: Any`, and we can handle more complex cases inductively
from there.

This definition of subtyping for gradual subtypes is not reflexive
(`Any` is not a subtype of `Any`).

As a corollary, we also remove `is_gradual_equivalent_to` --
`is_equivalent_to` now has the meaning that `is_gradual_equivalent_to`
used to have. If necessary, we could restore an
`is_fully_static_equivalent_to` or similar (which would not do an
`is_fully_static` pre-check of the types, but would instead pass a
relation-kind enum down through a recursive equivalence check, similar
to `has_relation_to`), but so far this doesn't appear to be necessary.

Credit to @JelleZijlstra for the observation that `is_fully_static` is
unnecessary and overly restrictive on subtyping.

There is another possible definition of gradual subtyping: instead of
requiring that `S+ <: T-`, we could instead require that `S+ <: T+` and
`S- <: T-`. In other words, instead of requiring all materializations of
`S` to be a subtype of every materialization of `T`, we just require
that every materialization of `S` be a subtype of _some_ materialization
of `T`, and that every materialization of `T` be a supertype of some
materialization of `S`. This definition also preserves the core
invariant that `S <: T` implies that `S | T = T` and `S & T = S`, and it
restores reflexivity: under this definition, `Any` is a subtype of
`Any`, and for any equivalent types `S` and `T`, `S <: T` and `T <: S`.
But unfortunately, this definition breaks transitivity of subtyping,
because nominal subclasses in Python use assignability ("consistent
subtyping") to define acceptable overrides. This means that we may have
a class `A` with `def method(self) -> Any` and a subtype `B(A)` with
`def method(self) -> int`, since `int` is assignable to `Any`. This
means that if we have a protocol `P` with `def method(self) -> Any`, we
would have `B <: A` (from nominal subtyping) and `A <: P` (`Any` is a
subtype of `Any`), but not `B <: P` (`int` is not a subtype of `Any`).
Breaking transitivity of subtyping is not tenable, so we don't use this
definition of subtyping.

## Test Plan

Existing tests (modified in some cases to account for updated
semantics.)

Stable property tests pass at a million iterations:
`QUICKCHECK_TESTS=1000000 cargo test -p ty_python_semantic -- --ignored
types::property_tests::stable`

### Changes to property test type generation

Since we no longer have a method of categorizing built types as
fully-static or not-fully-static, I had to add a previously-discussed
feature to the property tests so that some tests can build types that
are known by construction to be fully static, because there are still
properties that only apply to fully-static types (for example,
reflexiveness of subtyping.)

## Changes to handling of `*args, **kwargs` signatures

This PR "discovered" that, once we allow non-fully-static types to
participate in subtyping under the above definitions, `(*args: Any,
**kwargs: Any) -> Any` is now a subtype of `() -> object`. This is true,
if we take a literal interpretation of the former signature: all
materializations of the parameters `*args: Any, **kwargs: Any` can
accept zero arguments, making the former signature a subtype of the
latter. But the spec actually says that `*args: Any, **kwargs: Any`
should be interpreted as equivalent to `...`, and that makes a
difference here: `(...) -> Any` is not a subtype of `() -> object`,
because (unlike a literal reading of `(*args: Any, **kwargs: Any)`),
`...` can materialize to _any_ signature, including a signature with
required positional arguments.

This matters for this PR because it makes the "any two types are both
assignable to their union" property test fail if we don't implement the
equivalence to `...`. Because `FunctionType.__call__` has the signature
`(*args: Any, **kwargs: Any) -> Any`, and if we take that at face value
it's a subtype of `() -> object`, making `FunctionType` a subtype of `()
-> object)` -- but then a function with a required argument is also a
subtype of `FunctionType`, but not a subtype of `() -> object`. So I
went ahead and implemented the equivalence to `...` in this PR.

## Ecosystem analysis

* Most of the ecosystem report are cases of improved union/intersection
simplification. For example, we can now simplify a union like `bool |
(bool & Unknown) | Unknown` to simply `bool | Unknown`, because we can
now observe that every possible materialization of `bool & Unknown` is
still a subtype of `bool` (whereas before we would set aside `bool &
Unknown` as a not-fully-static type.) This is clearly an improvement.
* The `possibly-unresolved-reference` errors in sockeye, pymongo,
ignite, scrapy and others are true positives for conditional imports
that were formerly silenced by bogus conflicting-declarations (which we
currently don't issue a diagnostic for), because we considered two
different declarations of `Unknown` to be conflicting (we used
`is_equivalent_to` not `is_gradual_equivalent_to`). In this PR that
distinction disappears and all equivalence is gradual, so a declaration
of `Unknown` no longer conflicts with a declaration of `Unknown`, which
then results in us surfacing the possibly-unbound error.
* We will now issue "redundant cast" for casting from a typevar with a
gradual bound to the same typevar (the hydra-zen diagnostic). This seems
like an improvement.
* The new diagnostics in bandersnatch are interesting. For some reason
primer in CI seems to be checking bandersnatch on Python 3.10 (not yet
sure why; this doesn't happen when I run it locally). But bandersnatch
uses `enum.StrEnum`, which doesn't exist on 3.10. That makes the `class
SimpleDigest(StrEnum)` a class that inherits from `Unknown` (and
bypasses our current TODO handling for accessing attributes on enum
classes, since we don't recognize it as an enum class at all). This PR
improves our understanding of assignability to classes that inherit from
`Any` / `Unknown`, and we now recognize that a string literal is not
assignable to a class inheriting `Any` or `Unknown`.
2025-06-24 18:02:05 -07:00

14 KiB

Variance: PEP 695 syntax

[environment]
python-version = "3.12"

Type variables have a property called variance that affects the subtyping and assignability relations. Much more detail can be found in the spec. To summarize, each typevar is either covariant, contravariant, invariant, or bivariant. (Note that bivariance is not currently mentioned in the typing spec, but is a fourth case that we must consider.)

For all of the examples below, we will consider typevars T and U, two generic classes using those typevars C[T] and D[U], and two types A and B.

(Note that dynamic types like Any never participate in subtyping, so C[Any] is neither a subtype nor supertype of any other specialization of C, regardless of T's variance. It is, however, assignable to any specialization of C, regardless of variance, via materialization.)

Covariance

With a covariant typevar, subtyping and assignability are in "alignment": if A <: B and C <: D, then C[A] <: C[B] and C[A] <: D[B].

Types that "produce" data on demand are covariant in their typevar. If you expect a sequence of ints, someone can safely provide a sequence of bools, since each bool element that you would get from the sequence is a valid int.

from ty_extensions import is_assignable_to, is_equivalent_to, is_subtype_of, static_assert, Unknown
from typing import Any

class A: ...
class B(A): ...

class C[T]:
    def receive(self) -> T:
        raise ValueError

class D[U](C[U]):
    pass

# TODO: no error
# error: [static-assert-error]
static_assert(is_assignable_to(C[B], C[A]))
static_assert(not is_assignable_to(C[A], C[B]))
static_assert(is_assignable_to(C[A], C[Any]))
static_assert(is_assignable_to(C[B], C[Any]))
static_assert(is_assignable_to(C[Any], C[A]))
static_assert(is_assignable_to(C[Any], C[B]))

# TODO: no error
# error: [static-assert-error]
static_assert(is_assignable_to(D[B], C[A]))
static_assert(not is_assignable_to(D[A], C[B]))
static_assert(is_assignable_to(D[A], C[Any]))
static_assert(is_assignable_to(D[B], C[Any]))
static_assert(is_assignable_to(D[Any], C[A]))
static_assert(is_assignable_to(D[Any], C[B]))

# TODO: no error
# error: [static-assert-error]
static_assert(is_subtype_of(C[B], C[A]))
static_assert(not is_subtype_of(C[A], C[B]))
static_assert(not is_subtype_of(C[A], C[Any]))
static_assert(not is_subtype_of(C[B], C[Any]))
static_assert(not is_subtype_of(C[Any], C[A]))
static_assert(not is_subtype_of(C[Any], C[B]))

# TODO: no error
# error: [static-assert-error]
static_assert(is_subtype_of(D[B], C[A]))
static_assert(not is_subtype_of(D[A], C[B]))
static_assert(not is_subtype_of(D[A], C[Any]))
static_assert(not is_subtype_of(D[B], C[Any]))
static_assert(not is_subtype_of(D[Any], C[A]))
static_assert(not is_subtype_of(D[Any], C[B]))

static_assert(is_equivalent_to(C[A], C[A]))
static_assert(is_equivalent_to(C[B], C[B]))
static_assert(not is_equivalent_to(C[B], C[A]))
static_assert(not is_equivalent_to(C[A], C[B]))
static_assert(not is_equivalent_to(C[A], C[Any]))
static_assert(not is_equivalent_to(C[B], C[Any]))
static_assert(not is_equivalent_to(C[Any], C[A]))
static_assert(not is_equivalent_to(C[Any], C[B]))

static_assert(not is_equivalent_to(D[A], C[A]))
static_assert(not is_equivalent_to(D[B], C[B]))
static_assert(not is_equivalent_to(D[B], C[A]))
static_assert(not is_equivalent_to(D[A], C[B]))
static_assert(not is_equivalent_to(D[A], C[Any]))
static_assert(not is_equivalent_to(D[B], C[Any]))
static_assert(not is_equivalent_to(D[Any], C[A]))
static_assert(not is_equivalent_to(D[Any], C[B]))

static_assert(is_equivalent_to(C[Any], C[Any]))
static_assert(is_equivalent_to(C[Any], C[Unknown]))

static_assert(not is_equivalent_to(D[Any], C[Any]))
static_assert(not is_equivalent_to(D[Any], C[Unknown]))

Contravariance

With a contravariant typevar, subtyping and assignability are in "opposition": if A <: B and C <: D, then C[B] <: C[A] and D[B] <: C[A].

Types that "consume" data are contravariant in their typevar. If you expect a consumer that receives bools, someone can safely provide a consumer that expects to receive ints, since each bool that you pass into the consumer is a valid int.

from ty_extensions import is_assignable_to, is_equivalent_to, is_subtype_of, static_assert, Unknown
from typing import Any

class A: ...
class B(A): ...

class C[T]:
    def send(self, value: T): ...

class D[U](C[U]):
    pass

static_assert(not is_assignable_to(C[B], C[A]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_assignable_to(C[A], C[B]))
static_assert(is_assignable_to(C[A], C[Any]))
static_assert(is_assignable_to(C[B], C[Any]))
static_assert(is_assignable_to(C[Any], C[A]))
static_assert(is_assignable_to(C[Any], C[B]))

static_assert(not is_assignable_to(D[B], C[A]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_assignable_to(D[A], C[B]))
static_assert(is_assignable_to(D[A], C[Any]))
static_assert(is_assignable_to(D[B], C[Any]))
static_assert(is_assignable_to(D[Any], C[A]))
static_assert(is_assignable_to(D[Any], C[B]))

static_assert(not is_subtype_of(C[B], C[A]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_subtype_of(C[A], C[B]))
static_assert(not is_subtype_of(C[A], C[Any]))
static_assert(not is_subtype_of(C[B], C[Any]))
static_assert(not is_subtype_of(C[Any], C[A]))
static_assert(not is_subtype_of(C[Any], C[B]))

static_assert(not is_subtype_of(D[B], C[A]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_subtype_of(D[A], C[B]))
static_assert(not is_subtype_of(D[A], C[Any]))
static_assert(not is_subtype_of(D[B], C[Any]))
static_assert(not is_subtype_of(D[Any], C[A]))
static_assert(not is_subtype_of(D[Any], C[B]))

static_assert(is_equivalent_to(C[A], C[A]))
static_assert(is_equivalent_to(C[B], C[B]))
static_assert(not is_equivalent_to(C[B], C[A]))
static_assert(not is_equivalent_to(C[A], C[B]))
static_assert(not is_equivalent_to(C[A], C[Any]))
static_assert(not is_equivalent_to(C[B], C[Any]))
static_assert(not is_equivalent_to(C[Any], C[A]))
static_assert(not is_equivalent_to(C[Any], C[B]))

static_assert(not is_equivalent_to(D[A], C[A]))
static_assert(not is_equivalent_to(D[B], C[B]))
static_assert(not is_equivalent_to(D[B], C[A]))
static_assert(not is_equivalent_to(D[A], C[B]))
static_assert(not is_equivalent_to(D[A], C[Any]))
static_assert(not is_equivalent_to(D[B], C[Any]))
static_assert(not is_equivalent_to(D[Any], C[A]))
static_assert(not is_equivalent_to(D[Any], C[B]))

static_assert(is_equivalent_to(C[Any], C[Any]))
static_assert(is_equivalent_to(C[Any], C[Unknown]))

static_assert(not is_equivalent_to(D[Any], C[Any]))
static_assert(not is_equivalent_to(D[Any], C[Unknown]))

Invariance

With an invariant typevar, only equivalent specializations of the generic class are subtypes of or assignable to each other.

This often occurs for types that are both producers and consumers, like a mutable list. Iterating over the elements in a list would work with a covariant typevar, just like with the "producer" type above. Appending elements to a list would work with a contravariant typevar, just like with the "consumer" type above. However, a typevar cannot be both covariant and contravariant at the same time!

If you expect a mutable list of ints, it's not safe for someone to provide you with a mutable list of bools, since you might try to add an element to the list: if you try to add an int, the list would no longer only contain elements that are subtypes of bool.

Conversely, if you expect a mutable list of bools, it's not safe for someone to provide you with a mutable list of ints, since you might try to extract elements from the list: you expect every element that you extract to be a subtype of bool, but the list can contain any int.

In the end, if you expect a mutable list, you must always be given a list of exactly that type, since we can't know in advance which of the allowed methods you'll want to use.

from ty_extensions import is_assignable_to, is_equivalent_to, is_subtype_of, static_assert, Unknown
from typing import Any

class A: ...
class B(A): ...

class C[T]:
    def send(self, value: T): ...
    def receive(self) -> T:
        raise ValueError

class D[U](C[U]):
    pass

static_assert(not is_assignable_to(C[B], C[A]))
static_assert(not is_assignable_to(C[A], C[B]))
static_assert(is_assignable_to(C[A], C[Any]))
static_assert(is_assignable_to(C[B], C[Any]))
static_assert(is_assignable_to(C[Any], C[A]))
static_assert(is_assignable_to(C[Any], C[B]))

static_assert(not is_assignable_to(D[B], C[A]))
static_assert(not is_assignable_to(D[A], C[B]))
static_assert(is_assignable_to(D[A], C[Any]))
static_assert(is_assignable_to(D[B], C[Any]))
static_assert(is_assignable_to(D[Any], C[A]))
static_assert(is_assignable_to(D[Any], C[B]))

static_assert(not is_subtype_of(C[B], C[A]))
static_assert(not is_subtype_of(C[A], C[B]))
static_assert(not is_subtype_of(C[A], C[Any]))
static_assert(not is_subtype_of(C[B], C[Any]))
static_assert(not is_subtype_of(C[Any], C[A]))
static_assert(not is_subtype_of(C[Any], C[B]))

static_assert(not is_subtype_of(D[B], C[A]))
static_assert(not is_subtype_of(D[A], C[B]))
static_assert(not is_subtype_of(D[A], C[Any]))
static_assert(not is_subtype_of(D[B], C[Any]))
static_assert(not is_subtype_of(D[Any], C[A]))
static_assert(not is_subtype_of(D[Any], C[B]))

static_assert(is_equivalent_to(C[A], C[A]))
static_assert(is_equivalent_to(C[B], C[B]))
static_assert(not is_equivalent_to(C[B], C[A]))
static_assert(not is_equivalent_to(C[A], C[B]))
static_assert(not is_equivalent_to(C[A], C[Any]))
static_assert(not is_equivalent_to(C[B], C[Any]))
static_assert(not is_equivalent_to(C[Any], C[A]))
static_assert(not is_equivalent_to(C[Any], C[B]))

static_assert(not is_equivalent_to(D[A], C[A]))
static_assert(not is_equivalent_to(D[B], C[B]))
static_assert(not is_equivalent_to(D[B], C[A]))
static_assert(not is_equivalent_to(D[A], C[B]))
static_assert(not is_equivalent_to(D[A], C[Any]))
static_assert(not is_equivalent_to(D[B], C[Any]))
static_assert(not is_equivalent_to(D[Any], C[A]))
static_assert(not is_equivalent_to(D[Any], C[B]))

static_assert(is_equivalent_to(C[Any], C[Any]))
static_assert(is_equivalent_to(C[Any], C[Unknown]))

static_assert(not is_equivalent_to(D[Any], C[Any]))
static_assert(not is_equivalent_to(D[Any], C[Unknown]))

Bivariance

With a bivariant typevar, all specializations of the generic class are assignable to (and in fact, gradually equivalent to) each other, and all fully static specializations are subtypes of (and equivalent to) each other.

This is a bit of pathological case, which really only happens when the class doesn't use the typevar at all. (If it did, it would have to be covariant, contravariant, or invariant, depending on how the typevar was used.)

from ty_extensions import is_assignable_to, is_equivalent_to, is_subtype_of, static_assert, Unknown
from typing import Any

class A: ...
class B(A): ...

class C[T]:
    pass

class D[U](C[U]):
    pass

# TODO: no error
# error: [static-assert-error]
static_assert(is_assignable_to(C[B], C[A]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_assignable_to(C[A], C[B]))
static_assert(is_assignable_to(C[A], C[Any]))
static_assert(is_assignable_to(C[B], C[Any]))
static_assert(is_assignable_to(C[Any], C[A]))
static_assert(is_assignable_to(C[Any], C[B]))

# TODO: no error
# error: [static-assert-error]
static_assert(is_assignable_to(D[B], C[A]))
static_assert(is_subtype_of(C[A], C[A]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_assignable_to(D[A], C[B]))
static_assert(is_assignable_to(D[A], C[Any]))
static_assert(is_assignable_to(D[B], C[Any]))
static_assert(is_assignable_to(D[Any], C[A]))
static_assert(is_assignable_to(D[Any], C[B]))

# TODO: no error
# error: [static-assert-error]
static_assert(is_subtype_of(C[B], C[A]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_subtype_of(C[A], C[B]))
static_assert(not is_subtype_of(C[A], C[Any]))
static_assert(not is_subtype_of(C[B], C[Any]))
static_assert(not is_subtype_of(C[Any], C[A]))
static_assert(not is_subtype_of(C[Any], C[B]))
static_assert(not is_subtype_of(C[Any], C[Any]))

# TODO: no error
# error: [static-assert-error]
static_assert(is_subtype_of(D[B], C[A]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_subtype_of(D[A], C[B]))
static_assert(not is_subtype_of(D[A], C[Any]))
static_assert(not is_subtype_of(D[B], C[Any]))
static_assert(not is_subtype_of(D[Any], C[A]))
static_assert(not is_subtype_of(D[Any], C[B]))

static_assert(is_equivalent_to(C[A], C[A]))
static_assert(is_equivalent_to(C[B], C[B]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_equivalent_to(C[B], C[A]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_equivalent_to(C[A], C[B]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_equivalent_to(C[A], C[Any]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_equivalent_to(C[B], C[Any]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_equivalent_to(C[Any], C[A]))
# TODO: no error
# error: [static-assert-error]
static_assert(is_equivalent_to(C[Any], C[B]))

static_assert(not is_equivalent_to(D[A], C[A]))
static_assert(not is_equivalent_to(D[B], C[B]))
static_assert(not is_equivalent_to(D[B], C[A]))
static_assert(not is_equivalent_to(D[A], C[B]))
static_assert(not is_equivalent_to(D[A], C[Any]))
static_assert(not is_equivalent_to(D[B], C[Any]))
static_assert(not is_equivalent_to(D[Any], C[A]))
static_assert(not is_equivalent_to(D[Any], C[B]))

static_assert(is_equivalent_to(C[Any], C[Any]))
static_assert(is_equivalent_to(C[Any], C[Unknown]))

static_assert(not is_equivalent_to(D[Any], C[Any]))
static_assert(not is_equivalent_to(D[Any], C[Unknown]))