mirror of https://github.com/astral-sh/ruff.git synced 2025-10-09 18:02:19 +00:00

[ty] Use typing.Self for the first parameter of instance methods (#20517)

## Summary

Modify the (external) signature of instance methods such that the first
parameter uses `Self` unless it is explicitly annotated. This allows us
to correctly type-check more code, and allows us to infer correct return
types for many functions that return `Self`. For example:

```py
from pathlib import Path
from datetime import datetime, timedelta

reveal_type(Path(".config") / ".ty")  # now Path, previously Unknown

def _(dt: datetime, delta: timedelta):
    reveal_type(dt - delta)  # now datetime, previously Unknown
```

part of https://github.com/astral-sh/ty/issues/159

## Performance

I ran benchmarks locally on `attrs`, `freqtrade` and `colour`, the
projects with the largest regressions on CodSpeed. I see much smaller
effects locally, but can definitely reproduce the regression on `attrs`.
From looking at the profiling results (on CodSpeed), it seems that we
simply do more type inference work, which seems plausible, given that we
now understand many more return types (of many stdlib functions). In
particular, whenever a function uses an implicit `self` and returns
`Self` (without mentioning `Self` anywhere else in its signature), we
will now infer the correct type, whereas we would previously return
`Unknown`. This also means that we need to invoke the generics solver in
more cases. Comparing half a million lines of log output on attrs, I can
see that we do 5% more "work" (number of lines in the log), and have a
lot more `apply_specialization` events (7108 vs 4304). On freqtrade, I
see similar numbers for `apply_specialization` (11360 vs 5138 calls).
Given these results, I'm not sure if it's generally worth doing more
performance work, especially since none of the code modifications
themselves seem to be likely candidates for regressions.
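
As a rough illustration of the pattern described above (an implicit `self` parameter with `Self` appearing only in the return type), here is a minimal sketch. The `QueryBuilder` classes are hypothetical and not taken from the PR; the inferred types in the comments follow the description in this summary rather than verified ty output.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # `Self` is only needed by the type checker here; guarding the import
    # keeps the sketch runnable on interpreters older than Python 3.11.
    from typing import Self


class QueryBuilder:  # hypothetical class, for illustration only
    def __init__(self) -> None:
        self.clauses: list[str] = []

    # `self` is not annotated, and `Self` appears only in the return type.
    def where(self, clause: str) -> Self:
        self.clauses.append(clause)
        return self


class PagedQueryBuilder(QueryBuilder):  # hypothetical subclass
    def limit(self, n: int) -> Self:
        self.clauses.append(f"LIMIT {n}")
        return self


# With an implicit `Self`-typed first parameter, `.where(...)` called on a
# `PagedQueryBuilder` should be inferred as `PagedQueryBuilder`, so the
# chained `.limit(...)` call can be checked instead of resolving to `Unknown`.
builder = PagedQueryBuilder().where("x > 1").limit(10)
```

Each such call site requires solving for `Self`, which is consistent with the larger number of `apply_specialization` events reported above.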

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./ty_main check /home/shark/ecosystem/attrs` | 92.6 ± 3.6 | 85.9 | 102.6 | 1.00 |
| `./ty_self check /home/shark/ecosystem/attrs` | 101.7 ± 3.5 | 96.9 | 113.8 | 1.10 ± 0.06 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./ty_main check /home/shark/ecosystem/freqtrade` | 599.0 ± 20.2 | 568.2 | 627.5 | 1.00 |
| `./ty_self check /home/shark/ecosystem/freqtrade` | 607.9 ± 11.5 | 594.9 | 626.4 | 1.01 ± 0.04 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./ty_main check /home/shark/ecosystem/colour` | 423.9 ± 17.9 | 394.6 | 447.4 | 1.00 |
| `./ty_self check /home/shark/ecosystem/colour` | 426.9 ± 24.9 | 373.8 | 456.6 | 1.01 ± 0.07 |

## Test Plan

New Markdown tests

## Ecosystem report

* apprise: ~300 new diagnostics related to problematic stubs in apprise
😩
* attrs: a new true positive, since [this
function](4e2c89c823/tests/test_make.py (L2135))
is missing a `@staticmethod`?
* Some legitimate true positives
* sympy: lots of new `invalid-operator` false positives in [matrix
multiplication](cf9f4b6805/sympy/matrices/matrixbase.py (L3267-L3269))
due to our limited understanding of [generic `Callable[[Callable[[T1,
T2], T3]], Callable[[T1, T2], T3]]` "identity"
types](cf9f4b6805/sympy/core/decorators.py (L83-L84))
of decorators. This is not related to type-of-self.

## Typing conformance results

The changes are all correct, except for
```diff
+generics_self_usage.py:50:5: error[invalid-assignment] Object of type `def foo(self) -> int` is not assignable to `(typing.Self, /) -> int`
```
which is related to an assignability problem involving type variables on
both sides:
```py
class CallableAttribute:
    def foo(self) -> int:
        return 0

    bar: Callable[[Self], int] = foo  # <- we currently error on this assignment
```

---------

Co-authored-by: Shaygan Hooshyari <sh.hooshyari@gmail.com>

2025-09-29 21:08:08 +02:00


Narrowing for `isinstance` checks

Narrowing for `isinstance(object, classinfo)` expressions.

`classinfo` is a single type

def _(flag: bool):
    x = 1 if flag else "a"

    if isinstance(x, int):
        reveal_type(x)  # revealed: Literal[1]

    if isinstance(x, str):
        reveal_type(x)  # revealed: Literal["a"]
        if isinstance(x, int):
            reveal_type(x)  # revealed: Never

    if isinstance(x, (int, object)):
        reveal_type(x)  # revealed: Literal[1, "a"]

`classinfo` is a tuple of types

Note: `isinstance(x, (int, str))` should not be confused with `isinstance(x, tuple[(int, str)])`. The former is equivalent to `isinstance(x, int | str)`:

def _(flag: bool, flag1: bool, flag2: bool):
    x = 1 if flag else "a"

    if isinstance(x, (int, str)):
        reveal_type(x)  # revealed: Literal[1, "a"]
    else:
        reveal_type(x)  # revealed: Never

    if isinstance(x, (int, bytes)):
        reveal_type(x)  # revealed: Literal[1]

    if isinstance(x, (bytes, str)):
        reveal_type(x)  # revealed: Literal["a"]

    # No narrowing should occur if a larger type is also
    # one of the possibilities:
    if isinstance(x, (int, object)):
        reveal_type(x)  # revealed: Literal[1, "a"]
    else:
        reveal_type(x)  # revealed: Never

    y = 1 if flag1 else "a" if flag2 else b"b"
    if isinstance(y, (int, str)):
        reveal_type(y)  # revealed: Literal[1, "a"]

    if isinstance(y, (int, bytes)):
        reveal_type(y)  # revealed: Literal[1, b"b"]

    if isinstance(y, (str, bytes)):
        reveal_type(y)  # revealed: Literal["a", b"b"]

`classinfo` is a nested tuple of types

def _(flag: bool):
    x = 1 if flag else "a"

    if isinstance(x, (bool, (bytes, int))):
        reveal_type(x)  # revealed: Literal[1]
    else:
        reveal_type(x)  # revealed: Literal["a"]
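
One way to think about nested tuples is that `isinstance()` treats them like the flattened sequence of their classes. The helper below, `flatten_classinfo`, is a hypothetical name introduced only for this sketch; it is not how ty implements narrowing.

```python
def flatten_classinfo(classinfo):
    """Flatten an arbitrarily nested classinfo tuple into a flat tuple."""
    if isinstance(classinfo, tuple):
        flat = []
        for item in classinfo:
            # Recurse so that tuples nested at any depth are expanded.
            flat.extend(flatten_classinfo(item))
        return tuple(flat)
    return (classinfo,)


nested = (bool, (bytes, int))
flat = flatten_classinfo(nested)

# The nested and flattened forms agree on every sample value.
samples = [1, "a", b"b", True, 2.5]
agree = all(isinstance(v, nested) == isinstance(v, flat) for v in samples)
```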

Class types

class A: ...
class B: ...
class C: ...

x = object()

if isinstance(x, A):
    reveal_type(x)  # revealed: A
    if isinstance(x, B):
        reveal_type(x)  # revealed: A & B
    else:
        reveal_type(x)  # revealed: A & ~B

if isinstance(x, (A, B)):
    reveal_type(x)  # revealed: A | B
elif isinstance(x, (A, C)):
    reveal_type(x)  # revealed: C & ~A & ~B
else:
    reveal_type(x)  # revealed: ~A & ~B & ~C
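
The intersection `A & B` above is not simplified to `Never` because nothing prevents a class from inheriting from both. A minimal runtime sketch, with a hypothetical subclass `AB`:

```python
class A: ...
class B: ...


class AB(A, B): ...  # hypothetical common subclass, for illustration only


x = AB()

# An instance of AB passes both checks, so the nested `isinstance(x, B)`
# branch inside `isinstance(x, A)` is reachable.
is_a = isinstance(x, A)
is_b = isinstance(x, B)
```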

No narrowing for instances of `builtins.type`

def _(flag: bool, t: type):
    x = 1 if flag else "foo"

    if isinstance(x, t):
        reveal_type(x)  # revealed: Literal[1, "foo"]

Do not use custom `isinstance` for narrowing

def _(flag: bool):
    def isinstance(x, t):
        return True
    x = 1 if flag else "a"

    if isinstance(x, int):
        reveal_type(x)  # revealed: Literal[1, "a"]
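
A short runtime sketch of why a locally defined `isinstance` must not trigger narrowing: the shadowing function can return anything, regardless of the actual type of its argument. The function name `check_with_shadowed` is made up for this example.

```python
import builtins


def check_with_shadowed(value):
    # This local definition shadows the builtin within the function body.
    def isinstance(x, t):
        return True

    return isinstance(value, int)


shadowed_result = check_with_shadowed("a")      # the custom function says True
builtin_result = builtins.isinstance("a", int)  # the real check says False
```

Narrowing `x` to `int` in the shadowed case would therefore be unsound.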

Do support narrowing if `isinstance` is aliased

def _(flag: bool):
    isinstance_alias = isinstance

    x = 1 if flag else "a"

    if isinstance_alias(x, int):
        reveal_type(x)  # revealed: Literal[1]

Do support narrowing if `isinstance` is imported

from builtins import isinstance as imported_isinstance

def _(flag: bool):
    x = 1 if flag else "a"

    if imported_isinstance(x, int):
        reveal_type(x)  # revealed: Literal[1]

Do not narrow if second argument is not a type

def _(flag: bool):
    x = 1 if flag else "a"

    # TODO: this should cause us to emit a diagnostic during
    # type checking
    if isinstance(x, "a"):
        reveal_type(x)  # revealed: Literal[1, "a"]

    # TODO: this should cause us to emit a diagnostic during
    # type checking
    if isinstance(x, "int"):
        reveal_type(x)  # revealed: Literal[1, "a"]

Do not narrow if there are keyword arguments

def _(flag: bool):
    x = 1 if flag else "a"

    # error: [unknown-argument]
    if isinstance(x, int, foo="bar"):
        reveal_type(x)  # revealed: Literal[1, "a"]

`type[]` types are narrowed as well as class-literal types

def _(x: object, y: type[int]):
    if isinstance(x, y):
        reveal_type(x)  # revealed: int

Adding a disjoint element to an existing intersection

We used to incorrectly infer Literal booleans for some of these.

from ty_extensions import Not, Intersection, AlwaysTruthy, AlwaysFalsy

class P: ...

def f(
    a: Intersection[P, AlwaysTruthy],
    b: Intersection[P, AlwaysFalsy],
    c: Intersection[P, Not[AlwaysTruthy]],
    d: Intersection[P, Not[AlwaysFalsy]],
):
    if isinstance(a, bool):
        reveal_type(a)  # revealed: Never
    else:
        reveal_type(a)  # revealed: P & AlwaysTruthy

    if isinstance(b, bool):
        reveal_type(b)  # revealed: Never
    else:
        reveal_type(b)  # revealed: P & AlwaysFalsy

    if isinstance(c, bool):
        reveal_type(c)  # revealed: Never
    else:
        reveal_type(c)  # revealed: P & ~AlwaysTruthy

    if isinstance(d, bool):
        reveal_type(d)  # revealed: Never
    else:
        reveal_type(d)  # revealed: P & ~AlwaysFalsy
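
The `Never` results above rely on `P` and `bool` being disjoint. A runtime sketch of the underlying fact, assuming the relevant property is that `bool` cannot be subclassed (the class names here are illustrative):

```python
class P: ...


try:
    # CPython refuses `bool` as a base class, so no object can be an
    # instance of both `P` and `bool`.
    class PBool(P, bool): ...

    subclass_created = True
except TypeError:
    subclass_created = False


# For comparison, `int` can be combined with an ordinary class.
class PInt(P, int): ...
```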

Narrowing if an object of type `Any` or `Unknown` is used as the second argument

In order to preserve the gradual guarantee, we intersect with the type of the second argument if the type of the second argument is a dynamic type:

from typing import Any
from something_unresolvable import SomethingUnknown  # error: [unresolved-import]

class Foo: ...

def f(a: Foo, b: Any):
    if isinstance(a, SomethingUnknown):
        reveal_type(a)  # revealed: Foo & Unknown

    if isinstance(a, b):
        reveal_type(a)  # revealed: Foo & Any

Narrowing if an object with an intersection/union/TypeVar type is used as the second argument

If an intersection with only positive members is used as the second argument, and all positive members of the intersection are valid arguments for the second argument to `isinstance()`, we intersect with each positive member of the intersection:

[environment]
python-version = "3.12"

from typing import Any
from ty_extensions import Intersection

class Foo: ...

class Bar:
    attribute: int

class Baz:
    attribute: str

def f(x: Foo, y: Intersection[type[Bar], type[Baz]], z: type[Any]):
    if isinstance(x, y):
        reveal_type(x)  # revealed: Foo & Bar & Baz

    if isinstance(x, z):
        reveal_type(x)  # revealed: Foo & Any

The same if a union type is used:

def g(x: Foo, y: type[Bar | Baz]):
    if isinstance(x, y):
        reveal_type(x)  # revealed: (Foo & Bar) | (Foo & Baz)

And even if a TypeVar is used, provided it has valid upper bounds/constraints:

from typing import TypeVar

T = TypeVar("T", bound=type[Bar])

def h_old_syntax(x: Foo, y: T) -> T:
    if isinstance(x, y):
        reveal_type(x)  # revealed: Foo & Bar
        reveal_type(x.attribute)  # revealed: int

    return y

def h[U: type[Bar | Baz]](x: Foo, y: U) -> U:
    if isinstance(x, y):
        reveal_type(x)  # revealed: (Foo & Bar) | (Foo & Baz)
        reveal_type(x.attribute)  # revealed: int | str

    return y

Or even a tuple of tuples of typevars that have intersection bounds...

from ty_extensions import Intersection

class Spam: ...
class Eggs: ...
class Ham: ...
class Mushrooms: ...

def i[T: Intersection[type[Bar], type[Baz | Spam]], U: (type[Eggs], type[Ham])](x: Foo, y: T, z: U) -> tuple[T, U]:
    if isinstance(x, (y, (z, Mushrooms))):
        reveal_type(x)  # revealed: (Foo & Bar & Baz) | (Foo & Bar & Spam) | (Foo & Eggs) | (Foo & Ham) | (Foo & Mushrooms)

    return (y, z)

Narrowing with generics

[environment]
python-version = "3.12"

Narrowing to a generic class using `isinstance()` uses the top materialization of the generic. With a covariant generic, this is equivalent to using the upper bound of the type parameter (by default, `object`):

from typing import Self

class Covariant[T]:
    def get(self) -> T:
        raise NotImplementedError

def _(x: object):
    if isinstance(x, Covariant):
        reveal_type(x)  # revealed: Covariant[object]
        reveal_type(x.get())  # revealed: object

Similarly, contravariant type parameters use their lower bound of `Never`:

class Contravariant[T]:
    def push(self, x: T) -> None: ...

def _(x: object):
    if isinstance(x, Contravariant):
        reveal_type(x)  # revealed: Contravariant[Never]
        # error: [invalid-argument-type] "Argument to bound method `push` is incorrect: Expected `Never`, found `Literal[42]`"
        x.push(42)

Invariant generics are trickiest. The top materialization, conceptually the type that includes all instances of the generic class regardless of the type parameter, cannot be represented directly in the type system, so we represent it with the internal `Top[]` special form.

class Invariant[T]:
    def push(self, x: T) -> None: ...
    def get(self) -> T:
        raise NotImplementedError

def _(x: object):
    if isinstance(x, Invariant):
        reveal_type(x)  # revealed: Top[Invariant[Unknown]]
        reveal_type(x.get())  # revealed: object
        # error: [invalid-argument-type] "Argument to bound method `push` is incorrect: Expected `Never`, found `Literal[42]`"
        x.push(42)
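
One way to motivate the top materialization is that a runtime `isinstance()` check only sees the unparameterized class, so it accepts every specialization. The sketch below uses a hypothetical `Box` class and the older `Generic`/`TypeVar` syntax so that it runs on Python versions before 3.12:

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Box(Generic[T]):  # hypothetical invariant container
    def __init__(self, item: T) -> None:
        self.item = item

    def push(self, item: T) -> None:
        self.item = item

    def get(self) -> T:
        return self.item


int_box = Box[int](1)
str_box = Box[str]("a")

# Both specializations pass the same check, so after `isinstance(x, Box)`
# a checker can only assume "some Box", whatever its type argument is.
both_pass = isinstance(int_box, Box) and isinstance(str_box, Box)


def specialized_check_is_rejected() -> bool:
    """Return True if isinstance() refuses a subscripted generic."""
    try:
        isinstance(int_box, Box[int])
    except TypeError:
        return True
    return False
```

This is consistent with the expectations above: `get()` can only be known to return `object`, and no argument is known to be safe for `push()`.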

When more complex types are involved, the `Top[]` type may get simplified away.

def _(x: list[int] | set[str]):
    if isinstance(x, list):
        reveal_type(x)  # revealed: list[int]
    else:
        reveal_type(x)  # revealed: set[str]

Though if the types involved are not disjoint bases, we necessarily keep a more complex type.

def _(x: Invariant[int] | Covariant[str]):
    if isinstance(x, Invariant):
        reveal_type(x)  # revealed: Invariant[int] | (Covariant[str] & Top[Invariant[Unknown]])
    else:
        reveal_type(x)  # revealed: Covariant[str] & ~Top[Invariant[Unknown]]
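
The difference between the last two examples can be illustrated at runtime: `list` and `set` cannot share a subclass, whereas two ordinary classes can. This is a sketch under the assumption that the runtime layout conflict is what the "disjoint bases" wording refers to; `common_subclass_exists`, `Left`, and `Right` are hypothetical names.

```python
def common_subclass_exists(left: type, right: type) -> bool:
    """Try to create a class that inherits from both arguments."""
    try:
        type("Common", (left, right), {})
    except TypeError:
        # Raised, for example, when the bases have conflicting layouts.
        return False
    return True


class Left: ...
class Right: ...


list_and_set = common_subclass_exists(list, set)
plain_classes = common_subclass_exists(Left, Right)
```

Because a common subclass of two ordinary classes can exist, the intersection in the `Invariant[int] | Covariant[str]` case cannot be simplified away.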

The behavior of `issubclass()` is similar.

def _(x: type[object], y: type[object], z: type[object]):
    if issubclass(x, Covariant):
        reveal_type(x)  # revealed: type[Covariant[object]]
    if issubclass(y, Contravariant):
        reveal_type(y)  # revealed: type[Contravariant[Never]]
    if issubclass(z, Invariant):
        reveal_type(z)  # revealed: type[Top[Invariant[Unknown]]]
