mirror of
https://github.com/astral-sh/ruff.git
synced 2025-08-03 18:28:24 +00:00
![]() ## Summary Having a recursive type method to check whether a type is fully static is inefficient, unnecessary, and makes us overly strict about subtyping relations. It's inefficient because we end up re-walking the same types many times to check for fully-static-ness. It's unnecessary because we can check relations involving the dynamic type appropriately, depending whether the relation is subtyping or assignability. We use the subtyping relation to simplify unions and intersections. We can usefully consider that `S <: T` for gradual types also, as long as it remains true that `S | T` is equivalent to `T` and `S & T` is equivalent to `S`. One conservative definition (implemented here) that satisfies this requirement is that we consider `S <: T` if, for every possible pair of materializations `S'` and `T'`, `S' <: T'`. Or put differently the top materialization of `S` (`S+` -- the union of all possible materializations of `S`) is a subtype of the bottom materialization of `T` (`T-` -- the intersection of all possible materializations of `T`). In the most basic cases we can usefully say that `Any <: object` and that `Never <: Any`, and we can handle more complex cases inductively from there. This definition of subtyping for gradual subtypes is not reflexive (`Any` is not a subtype of `Any`). As a corollary, we also remove `is_gradual_equivalent_to` -- `is_equivalent_to` now has the meaning that `is_gradual_equivalent_to` used to have. If necessary, we could restore an `is_fully_static_equivalent_to` or similar (which would not do an `is_fully_static` pre-check of the types, but would instead pass a relation-kind enum down through a recursive equivalence check, similar to `has_relation_to`), but so far this doesn't appear to be necessary. Credit to @JelleZijlstra for the observation that `is_fully_static` is unnecessary and overly restrictive on subtyping. There is another possible definition of gradual subtyping: instead of requiring that `S+ <: T-`, we could instead require that `S+ <: T+` and `S- <: T-`. In other words, instead of requiring all materializations of `S` to be a subtype of every materialization of `T`, we just require that every materialization of `S` be a subtype of _some_ materialization of `T`, and that every materialization of `T` be a supertype of some materialization of `S`. This definition also preserves the core invariant that `S <: T` implies that `S | T = T` and `S & T = S`, and it restores reflexivity: under this definition, `Any` is a subtype of `Any`, and for any equivalent types `S` and `T`, `S <: T` and `T <: S`. But unfortunately, this definition breaks transitivity of subtyping, because nominal subclasses in Python use assignability ("consistent subtyping") to define acceptable overrides. This means that we may have a class `A` with `def method(self) -> Any` and a subtype `B(A)` with `def method(self) -> int`, since `int` is assignable to `Any`. This means that if we have a protocol `P` with `def method(self) -> Any`, we would have `B <: A` (from nominal subtyping) and `A <: P` (`Any` is a subtype of `Any`), but not `B <: P` (`int` is not a subtype of `Any`). Breaking transitivity of subtyping is not tenable, so we don't use this definition of subtyping. ## Test Plan Existing tests (modified in some cases to account for updated semantics.) Stable property tests pass at a million iterations: `QUICKCHECK_TESTS=1000000 cargo test -p ty_python_semantic -- --ignored types::property_tests::stable` ### Changes to property test type generation Since we no longer have a method of categorizing built types as fully-static or not-fully-static, I had to add a previously-discussed feature to the property tests so that some tests can build types that are known by construction to be fully static, because there are still properties that only apply to fully-static types (for example, reflexiveness of subtyping.) ## Changes to handling of `*args, **kwargs` signatures This PR "discovered" that, once we allow non-fully-static types to participate in subtyping under the above definitions, `(*args: Any, **kwargs: Any) -> Any` is now a subtype of `() -> object`. This is true, if we take a literal interpretation of the former signature: all materializations of the parameters `*args: Any, **kwargs: Any` can accept zero arguments, making the former signature a subtype of the latter. But the spec actually says that `*args: Any, **kwargs: Any` should be interpreted as equivalent to `...`, and that makes a difference here: `(...) -> Any` is not a subtype of `() -> object`, because (unlike a literal reading of `(*args: Any, **kwargs: Any)`), `...` can materialize to _any_ signature, including a signature with required positional arguments. This matters for this PR because it makes the "any two types are both assignable to their union" property test fail if we don't implement the equivalence to `...`. Because `FunctionType.__call__` has the signature `(*args: Any, **kwargs: Any) -> Any`, and if we take that at face value it's a subtype of `() -> object`, making `FunctionType` a subtype of `() -> object)` -- but then a function with a required argument is also a subtype of `FunctionType`, but not a subtype of `() -> object`. So I went ahead and implemented the equivalence to `...` in this PR. ## Ecosystem analysis * Most of the ecosystem report are cases of improved union/intersection simplification. For example, we can now simplify a union like `bool | (bool & Unknown) | Unknown` to simply `bool | Unknown`, because we can now observe that every possible materialization of `bool & Unknown` is still a subtype of `bool` (whereas before we would set aside `bool & Unknown` as a not-fully-static type.) This is clearly an improvement. * The `possibly-unresolved-reference` errors in sockeye, pymongo, ignite, scrapy and others are true positives for conditional imports that were formerly silenced by bogus conflicting-declarations (which we currently don't issue a diagnostic for), because we considered two different declarations of `Unknown` to be conflicting (we used `is_equivalent_to` not `is_gradual_equivalent_to`). In this PR that distinction disappears and all equivalence is gradual, so a declaration of `Unknown` no longer conflicts with a declaration of `Unknown`, which then results in us surfacing the possibly-unbound error. * We will now issue "redundant cast" for casting from a typevar with a gradual bound to the same typevar (the hydra-zen diagnostic). This seems like an improvement. * The new diagnostics in bandersnatch are interesting. For some reason primer in CI seems to be checking bandersnatch on Python 3.10 (not yet sure why; this doesn't happen when I run it locally). But bandersnatch uses `enum.StrEnum`, which doesn't exist on 3.10. That makes the `class SimpleDigest(StrEnum)` a class that inherits from `Unknown` (and bypasses our current TODO handling for accessing attributes on enum classes, since we don't recognize it as an enum class at all). This PR improves our understanding of assignability to classes that inherit from `Any` / `Unknown`, and we now recognize that a string literal is not assignable to a class inheriting `Any` or `Unknown`. |
||
---|---|---|
.. | ||
legacy | ||
pep695 | ||
builtins.md | ||
scoping.md |