[ty] type narrowing by attribute/subscript assignments (#18041)
Some checks are pending
CI / Determine changes (push) Waiting to run
CI / cargo fmt (push) Waiting to run
CI / cargo clippy (push) Blocked by required conditions
CI / cargo test (linux) (push) Blocked by required conditions
CI / cargo test (linux, release) (push) Blocked by required conditions
CI / cargo test (windows) (push) Blocked by required conditions
CI / cargo test (wasm) (push) Blocked by required conditions
CI / cargo build (release) (push) Waiting to run
CI / cargo build (msrv) (push) Blocked by required conditions
CI / cargo fuzz build (push) Blocked by required conditions
CI / fuzz parser (push) Blocked by required conditions
CI / test scripts (push) Blocked by required conditions
CI / ecosystem (push) Blocked by required conditions
CI / Fuzz for new ty panics (push) Blocked by required conditions
CI / cargo shear (push) Blocked by required conditions
CI / python package (push) Waiting to run
CI / pre-commit (push) Waiting to run
CI / mkdocs (push) Waiting to run
CI / formatter instabilities and black similarity (push) Blocked by required conditions
CI / test ruff-lsp (push) Blocked by required conditions
CI / check playground (push) Blocked by required conditions
CI / benchmarks (push) Blocked by required conditions
[ty Playground] Release / publish (push) Waiting to run

## Summary

This PR partially solves https://github.com/astral-sh/ty/issues/164
(derived from #17643).

Currently, the definitions we manage are limited to those for simple
name (symbol) targets, but we expand this to track definitions for
attribute and subscript targets as well.

This was originally planned as part of the work in #17643, but the
changes are significant, so I made it a separate PR.
After merging this PR, I will reflect this changes in #17643.

There is still some incomplete work remaining, but the basic features
have been implemented, so I am publishing it as a draft PR.
Here is the TODO list (there may be more to come):
* [x] Complete rewrite and refactoring of documentation (removing
`Symbol` and replacing it with `Place`)
* [x] More thorough testing
* [x] Consolidation of duplicated code (maybe we can consolidate the
handling related to name, attribute, and subscript)

This PR replaces the current `Symbol` API with the `Place` API, which is
a concept that includes attributes and subscripts (the term is borrowed
from Rust).

## Test Plan

`mdtest/narrow/assignment.md` is added.

---------

Co-authored-by: David Peter <sharkdp@users.noreply.github.com>
Co-authored-by: Carl Meyer <carl@astral.sh>
This commit is contained in:
Shunsuke Shibayama 2025-06-05 09:24:27 +09:00 committed by GitHub
parent ce8b744f17
commit 0858896bc4
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
38 changed files with 3432 additions and 2404 deletions

View file

@ -37,7 +37,9 @@ reveal_type(c_instance.inferred_from_other_attribute) # revealed: Unknown
# See https://github.com/astral-sh/ruff/issues/15960 for a related discussion.
reveal_type(c_instance.inferred_from_param) # revealed: Unknown | int | None
reveal_type(c_instance.declared_only) # revealed: bytes
# TODO: Should be `bytes` with no error, like mypy and pyright?
# error: [unresolved-attribute]
reveal_type(c_instance.declared_only) # revealed: Unknown
reveal_type(c_instance.declared_and_bound) # revealed: bool
@ -64,12 +66,10 @@ C.inferred_from_value = "overwritten on class"
# This assignment is fine:
c_instance.declared_and_bound = False
# TODO: After this assignment to the attribute within this scope, we may eventually want to narrow
# the `bool` type (see above) for this instance variable to `Literal[False]` here. This is unsound
# in general (we don't know what else happened to `c_instance` between the assignment and the use
# here), but mypy and pyright support this. In conclusion, this could be `bool` but should probably
# be `Literal[False]`.
reveal_type(c_instance.declared_and_bound) # revealed: bool
# Strictly speaking, inferring this as `Literal[False]` rather than `bool` is unsound in general
# (we don't know what else happened to `c_instance` between the assignment and the use here),
# but mypy and pyright support this.
reveal_type(c_instance.declared_and_bound) # revealed: Literal[False]
```
#### Variable declared in class body and possibly bound in `__init__`
@ -149,14 +149,16 @@ class C:
c_instance = C(True)
reveal_type(c_instance.only_declared_in_body) # revealed: str | None
reveal_type(c_instance.only_declared_in_init) # revealed: str | None
# TODO: should be `str | None` without error
# error: [unresolved-attribute]
reveal_type(c_instance.only_declared_in_init) # revealed: Unknown
reveal_type(c_instance.declared_in_body_and_init) # revealed: str | None
reveal_type(c_instance.declared_in_body_defined_in_init) # revealed: str | None
# TODO: This should be `str | None`. Fixing this requires an overhaul of the `Symbol` API,
# which is planned in https://github.com/astral-sh/ruff/issues/14297
reveal_type(c_instance.bound_in_body_declared_in_init) # revealed: Unknown | str | None
reveal_type(c_instance.bound_in_body_declared_in_init) # revealed: Unknown | Literal["a"]
reveal_type(c_instance.bound_in_body_and_init) # revealed: Unknown | None | Literal["a"]
```
@ -187,7 +189,9 @@ reveal_type(c_instance.inferred_from_other_attribute) # revealed: Unknown
reveal_type(c_instance.inferred_from_param) # revealed: Unknown | int | None
reveal_type(c_instance.declared_only) # revealed: bytes
# TODO: should be `bytes` with no error, like mypy and pyright?
# error: [unresolved-attribute]
reveal_type(c_instance.declared_only) # revealed: Unknown
reveal_type(c_instance.declared_and_bound) # revealed: bool
@ -260,8 +264,8 @@ class C:
self.w += None
# TODO: Mypy and pyright do not support this, but it would be great if we could
# infer `Unknown | str` or at least `Unknown | Weird | str` here.
reveal_type(C().w) # revealed: Unknown | Weird
# infer `Unknown | str` here (`Weird` is not a possible type for the `w` attribute).
reveal_type(C().w) # revealed: Unknown
```
#### Attributes defined in tuple unpackings
@ -410,14 +414,41 @@ class C:
[... for self.a in IntIterable()]
[... for (self.b, self.c) in TupleIterable()]
[... for self.d in IntIterable() for self.e in IntIterable()]
[[... for self.f in IntIterable()] for _ in IntIterable()]
[[... for self.g in IntIterable()] for self in [D()]]
class D:
g: int
c_instance = C()
reveal_type(c_instance.a) # revealed: Unknown | int
reveal_type(c_instance.b) # revealed: Unknown | int
reveal_type(c_instance.c) # revealed: Unknown | str
reveal_type(c_instance.d) # revealed: Unknown | int
reveal_type(c_instance.e) # revealed: Unknown | int
# TODO: no error, reveal Unknown | int
# error: [unresolved-attribute]
reveal_type(c_instance.a) # revealed: Unknown
# TODO: no error, reveal Unknown | int
# error: [unresolved-attribute]
reveal_type(c_instance.b) # revealed: Unknown
# TODO: no error, reveal Unknown | str
# error: [unresolved-attribute]
reveal_type(c_instance.c) # revealed: Unknown
# TODO: no error, reveal Unknown | int
# error: [unresolved-attribute]
reveal_type(c_instance.d) # revealed: Unknown
# TODO: no error, reveal Unknown | int
# error: [unresolved-attribute]
reveal_type(c_instance.e) # revealed: Unknown
# TODO: no error, reveal Unknown | int
# error: [unresolved-attribute]
reveal_type(c_instance.f) # revealed: Unknown
# This one is correctly not resolved as an attribute:
# error: [unresolved-attribute]
reveal_type(c_instance.g) # revealed: Unknown
```
#### Conditionally declared / bound attributes
@ -721,10 +752,7 @@ reveal_type(C.pure_class_variable) # revealed: Unknown
# error: [invalid-attribute-access] "Cannot assign to instance attribute `pure_class_variable` from the class object `<class 'C'>`"
C.pure_class_variable = "overwritten on class"
# TODO: should be `Unknown | Literal["value set in class method"]` or
# Literal["overwritten on class"]`, once/if we support local narrowing.
# error: [unresolved-attribute]
reveal_type(C.pure_class_variable) # revealed: Unknown
reveal_type(C.pure_class_variable) # revealed: Literal["overwritten on class"]
c_instance = C()
reveal_type(c_instance.pure_class_variable) # revealed: Unknown | Literal["value set in class method"]
@ -762,19 +790,12 @@ reveal_type(c_instance.variable_with_class_default2) # revealed: Unknown | Lite
c_instance.variable_with_class_default1 = "value set on instance"
reveal_type(C.variable_with_class_default1) # revealed: str
# TODO: Could be Literal["value set on instance"], or still `str` if we choose not to
# narrow the type.
reveal_type(c_instance.variable_with_class_default1) # revealed: str
reveal_type(c_instance.variable_with_class_default1) # revealed: Literal["value set on instance"]
C.variable_with_class_default1 = "overwritten on class"
# TODO: Could be `Literal["overwritten on class"]`, or still `str` if we choose not to
# narrow the type.
reveal_type(C.variable_with_class_default1) # revealed: str
# TODO: should still be `Literal["value set on instance"]`, or `str`.
reveal_type(c_instance.variable_with_class_default1) # revealed: str
reveal_type(C.variable_with_class_default1) # revealed: Literal["overwritten on class"]
reveal_type(c_instance.variable_with_class_default1) # revealed: Literal["value set on instance"]
```
#### Descriptor attributes as class variables

View file

@ -699,9 +699,7 @@ class C:
descriptor = Descriptor()
C.descriptor = "something else"
# This could also be `Literal["something else"]` if we support narrowing of attribute types based on assignments
reveal_type(C.descriptor) # revealed: Unknown | int
reveal_type(C.descriptor) # revealed: Literal["something else"]
```
### Possibly unbound descriptor attributes

View file

@ -0,0 +1,318 @@
# Narrowing by assignment
## Attribute
### Basic
```py
class A:
x: int | None = None
y = None
def __init__(self):
self.z = None
a = A()
a.x = 0
a.y = 0
a.z = 0
reveal_type(a.x) # revealed: Literal[0]
reveal_type(a.y) # revealed: Literal[0]
reveal_type(a.z) # revealed: Literal[0]
# Make sure that we infer the narrowed type for eager
# scopes (class, comprehension) and the non-narrowed
# public type for lazy scopes (function)
class _:
reveal_type(a.x) # revealed: Literal[0]
reveal_type(a.y) # revealed: Literal[0]
reveal_type(a.z) # revealed: Literal[0]
[reveal_type(a.x) for _ in range(1)] # revealed: Literal[0]
[reveal_type(a.y) for _ in range(1)] # revealed: Literal[0]
[reveal_type(a.z) for _ in range(1)] # revealed: Literal[0]
def _():
reveal_type(a.x) # revealed: Unknown | int | None
reveal_type(a.y) # revealed: Unknown | None
reveal_type(a.z) # revealed: Unknown | None
if False:
a = A()
reveal_type(a.x) # revealed: Literal[0]
reveal_type(a.y) # revealed: Literal[0]
reveal_type(a.z) # revealed: Literal[0]
if True:
a = A()
reveal_type(a.x) # revealed: int | None
reveal_type(a.y) # revealed: Unknown | None
reveal_type(a.z) # revealed: Unknown | None
a.x = 0
a.y = 0
a.z = 0
reveal_type(a.x) # revealed: Literal[0]
reveal_type(a.y) # revealed: Literal[0]
reveal_type(a.z) # revealed: Literal[0]
class _:
a = A()
reveal_type(a.x) # revealed: int | None
reveal_type(a.y) # revealed: Unknown | None
reveal_type(a.z) # revealed: Unknown | None
def cond() -> bool:
return True
class _:
if False:
a = A()
reveal_type(a.x) # revealed: Literal[0]
reveal_type(a.y) # revealed: Literal[0]
reveal_type(a.z) # revealed: Literal[0]
if cond():
a = A()
reveal_type(a.x) # revealed: int | None
reveal_type(a.y) # revealed: Unknown | None
reveal_type(a.z) # revealed: Unknown | None
class _:
a = A()
class Inner:
reveal_type(a.x) # revealed: int | None
reveal_type(a.y) # revealed: Unknown | None
reveal_type(a.z) # revealed: Unknown | None
# error: [unresolved-reference]
does.nt.exist = 0
# error: [unresolved-reference]
reveal_type(does.nt.exist) # revealed: Unknown
```
### Narrowing chain
```py
class D: ...
class C:
d: D | None = None
class B:
c1: C | None = None
c2: C | None = None
class A:
b: B | None = None
a = A()
a.b = B()
a.b.c1 = C()
a.b.c2 = C()
a.b.c1.d = D()
a.b.c2.d = D()
reveal_type(a.b) # revealed: B
reveal_type(a.b.c1) # revealed: C
reveal_type(a.b.c1.d) # revealed: D
a.b.c1 = C()
reveal_type(a.b) # revealed: B
reveal_type(a.b.c1) # revealed: C
reveal_type(a.b.c1.d) # revealed: D | None
reveal_type(a.b.c2.d) # revealed: D
a.b.c1.d = D()
a.b = B()
reveal_type(a.b) # revealed: B
reveal_type(a.b.c1) # revealed: C | None
reveal_type(a.b.c2) # revealed: C | None
# error: [possibly-unbound-attribute]
reveal_type(a.b.c1.d) # revealed: D | None
# error: [possibly-unbound-attribute]
reveal_type(a.b.c2.d) # revealed: D | None
```
### Do not narrow the type of a `property` by assignment
```py
class C:
def __init__(self):
self._x: int = 0
@property
def x(self) -> int:
return self._x
@x.setter
def x(self, value: int) -> None:
self._x = abs(value)
c = C()
c.x = -1
# Don't infer `c.x` to be `Literal[-1]`
reveal_type(c.x) # revealed: int
```
### Do not narrow the type of a descriptor by assignment
```py
class Descriptor:
def __get__(self, instance: object, owner: type) -> int:
return 1
def __set__(self, instance: object, value: int) -> None:
pass
class C:
desc: Descriptor = Descriptor()
c = C()
c.desc = -1
# Don't infer `c.desc` to be `Literal[-1]`
reveal_type(c.desc) # revealed: int
```
## Subscript
### Specialization for builtin types
Type narrowing based on assignment to a subscript expression is generally unsound, because arbitrary
`__getitem__`/`__setitem__` methods on a class do not necessarily guarantee that the passed-in value
for `__setitem__` is stored and can be retrieved unmodified via `__getitem__`. Therefore, we
currently only perform assignment-based narrowing on a few built-in classes (`list`, `dict`,
`bytesarray`, `TypedDict` and `collections` types) where we are confident that this kind of
narrowing can be performed soundly. This is the same approach as pyright.
```py
from typing import TypedDict
from collections import ChainMap, defaultdict
l: list[int | None] = [None]
l[0] = 0
d: dict[int, int] = {1: 1}
d[0] = 0
b: bytearray = bytearray(b"abc")
b[0] = 0
dd: defaultdict[int, int] = defaultdict(int)
dd[0] = 0
cm: ChainMap[int, int] = ChainMap({1: 1}, {0: 0})
cm[0] = 0
# TODO: should be ChainMap[int, int]
reveal_type(cm) # revealed: ChainMap[Unknown, Unknown]
reveal_type(l[0]) # revealed: Literal[0]
reveal_type(d[0]) # revealed: Literal[0]
reveal_type(b[0]) # revealed: Literal[0]
reveal_type(dd[0]) # revealed: Literal[0]
# TODO: should be Literal[0]
reveal_type(cm[0]) # revealed: Unknown
class C:
reveal_type(l[0]) # revealed: Literal[0]
reveal_type(d[0]) # revealed: Literal[0]
reveal_type(b[0]) # revealed: Literal[0]
reveal_type(dd[0]) # revealed: Literal[0]
# TODO: should be Literal[0]
reveal_type(cm[0]) # revealed: Unknown
[reveal_type(l[0]) for _ in range(1)] # revealed: Literal[0]
[reveal_type(d[0]) for _ in range(1)] # revealed: Literal[0]
[reveal_type(b[0]) for _ in range(1)] # revealed: Literal[0]
[reveal_type(dd[0]) for _ in range(1)] # revealed: Literal[0]
# TODO: should be Literal[0]
[reveal_type(cm[0]) for _ in range(1)] # revealed: Unknown
def _():
reveal_type(l[0]) # revealed: int | None
reveal_type(d[0]) # revealed: int
reveal_type(b[0]) # revealed: int
reveal_type(dd[0]) # revealed: int
reveal_type(cm[0]) # revealed: int
class D(TypedDict):
x: int
label: str
td = D(x=1, label="a")
td["x"] = 0
# TODO: should be Literal[0]
reveal_type(td["x"]) # revealed: @Todo(TypedDict)
# error: [unresolved-reference]
does["not"]["exist"] = 0
# error: [unresolved-reference]
reveal_type(does["not"]["exist"]) # revealed: Unknown
non_subscriptable = 1
# error: [non-subscriptable]
non_subscriptable[0] = 0
# error: [non-subscriptable]
reveal_type(non_subscriptable[0]) # revealed: Unknown
```
### No narrowing for custom classes with arbitrary `__getitem__` / `__setitem__`
```py
class C:
def __init__(self):
self.l: list[str] = []
def __getitem__(self, index: int) -> str:
return self.l[index]
def __setitem__(self, index: int, value: str | int) -> None:
if len(self.l) == index:
self.l.append(str(value))
else:
self.l[index] = str(value)
c = C()
c[0] = 0
reveal_type(c[0]) # revealed: str
```
## Complex target
```py
class A:
x: list[int | None] = []
class B:
a: A | None = None
b = B()
b.a = A()
b.a.x[0] = 0
reveal_type(b.a.x[0]) # revealed: Literal[0]
class C:
reveal_type(b.a.x[0]) # revealed: Literal[0]
def _():
# error: [possibly-unbound-attribute]
reveal_type(b.a.x[0]) # revealed: Unknown | int | None
# error: [possibly-unbound-attribute]
reveal_type(b.a.x) # revealed: Unknown | list[int | None]
reveal_type(b.a) # revealed: Unknown | A | None
```
## Invalid assignments are not used for narrowing
```py
class C:
x: int | None
l: list[int]
def f(c: C, s: str):
c.x = s # error: [invalid-assignment]
reveal_type(c.x) # revealed: int | None
s = c.x # error: [invalid-assignment]
# TODO: This assignment is invalid and should result in an error.
c.l[0] = s
reveal_type(c.l[0]) # revealed: int
```

View file

@ -53,11 +53,114 @@ constraints may no longer be valid due to a "time lag". However, it may be possi
that some of them are valid by performing a more detailed analysis (e.g. checking that the narrowing
target has not changed in all places where the function is called).
### Narrowing by attribute/subscript assignments
```py
class A:
x: str | None = None
def update_x(self, value: str | None):
self.x = value
a = A()
a.x = "a"
class B:
reveal_type(a.x) # revealed: Literal["a"]
def f():
reveal_type(a.x) # revealed: Unknown | str | None
[reveal_type(a.x) for _ in range(1)] # revealed: Literal["a"]
a = A()
class C:
reveal_type(a.x) # revealed: str | None
def g():
reveal_type(a.x) # revealed: Unknown | str | None
[reveal_type(a.x) for _ in range(1)] # revealed: str | None
a = A()
a.x = "a"
a.update_x("b")
class D:
# TODO: should be `str | None`
reveal_type(a.x) # revealed: Literal["a"]
def h():
reveal_type(a.x) # revealed: Unknown | str | None
# TODO: should be `str | None`
[reveal_type(a.x) for _ in range(1)] # revealed: Literal["a"]
```
### Narrowing by attribute/subscript assignments in nested scopes
```py
class D: ...
class C:
d: D | None = None
class B:
c1: C | None = None
c2: C | None = None
class A:
b: B | None = None
a = A()
a.b = B()
class _:
a.b.c1 = C()
class _:
a.b.c1.d = D()
a = 1
class _3:
reveal_type(a) # revealed: A
reveal_type(a.b.c1.d) # revealed: D
class _:
a = 1
# error: [unresolved-attribute]
a.b.c1.d = D()
class _3:
reveal_type(a) # revealed: A
# TODO: should be `D | None`
reveal_type(a.b.c1.d) # revealed: D
a.b.c1 = C()
a.b.c1.d = D()
class _:
a.b = B()
class _:
# error: [possibly-unbound-attribute]
reveal_type(a.b.c1.d) # revealed: D | None
reveal_type(a.b.c1) # revealed: C | None
```
### Narrowing constraints introduced in eager nested scopes
```py
g: str | None = "a"
class A:
x: str | None = None
a = A()
l: list[str | None] = [None]
def f(x: str | None):
def _():
if x is not None:
@ -69,6 +172,14 @@ def f(x: str | None):
if g is not None:
reveal_type(g) # revealed: str
if a.x is not None:
# TODO(#17643): should be `Unknown | str`
reveal_type(a.x) # revealed: Unknown | str | None
if l[0] is not None:
# TODO(#17643): should be `str`
reveal_type(l[0]) # revealed: str | None
class C:
if x is not None:
reveal_type(x) # revealed: str
@ -79,6 +190,14 @@ def f(x: str | None):
if g is not None:
reveal_type(g) # revealed: str
if a.x is not None:
# TODO(#17643): should be `Unknown | str`
reveal_type(a.x) # revealed: Unknown | str | None
if l[0] is not None:
# TODO(#17643): should be `str`
reveal_type(l[0]) # revealed: str | None
# TODO: should be str
# This could be fixed if we supported narrowing with if clauses in comprehensions.
[reveal_type(x) for _ in range(1) if x is not None] # revealed: str | None
@ -89,6 +208,13 @@ def f(x: str | None):
```py
g: str | None = "a"
class A:
x: str | None = None
a = A()
l: list[str | None] = [None]
def f(x: str | None):
if x is not None:
def _():
@ -109,6 +235,28 @@ def f(x: str | None):
reveal_type(g) # revealed: str
[reveal_type(g) for _ in range(1)] # revealed: str
if a.x is not None:
def _():
reveal_type(a.x) # revealed: Unknown | str | None
class D:
# TODO(#17643): should be `Unknown | str`
reveal_type(a.x) # revealed: Unknown | str | None
# TODO(#17643): should be `Unknown | str`
[reveal_type(a.x) for _ in range(1)] # revealed: Unknown | str | None
if l[0] is not None:
def _():
reveal_type(l[0]) # revealed: str | None
class D:
# TODO(#17643): should be `str`
reveal_type(l[0]) # revealed: str | None
# TODO(#17643): should be `str`
[reveal_type(l[0]) for _ in range(1)] # revealed: str | None
```
### Narrowing constraints introduced in multiple scopes
@ -118,6 +266,13 @@ from typing import Literal
g: str | Literal[1] | None = "a"
class A:
x: str | Literal[1] | None = None
a = A()
l: list[str | Literal[1] | None] = [None]
def f(x: str | Literal[1] | None):
class C:
if x is not None:
@ -140,6 +295,28 @@ def f(x: str | Literal[1] | None):
class D:
if g != 1:
reveal_type(g) # revealed: str
if a.x is not None:
def _():
if a.x != 1:
# TODO(#17643): should be `Unknown | str | None`
reveal_type(a.x) # revealed: Unknown | str | Literal[1] | None
class D:
if a.x != 1:
# TODO(#17643): should be `Unknown | str`
reveal_type(a.x) # revealed: Unknown | str | Literal[1] | None
if l[0] is not None:
def _():
if l[0] != 1:
# TODO(#17643): should be `str | None`
reveal_type(l[0]) # revealed: str | Literal[1] | None
class D:
if l[0] != 1:
# TODO(#17643): should be `str`
reveal_type(l[0]) # revealed: str | Literal[1] | None
```
### Narrowing constraints with bindings in class scope, and nested scopes