ruff/crates/ty_python_semantic/resources/mdtest/narrow/assignment.md
David Peter 71d711257a
[ty] No union with Unknown for module-global symbols (#20664)
## Summary

Quoting from the newly added comment:

Module-level globals can be mutated externally. A `MY_CONSTANT = 1`
global might be changed to `"some string"` from code outside of the
module that we're looking at, and so from a gradual-guarantee
perspective, it makes sense to infer a type of `Literal[1] | Unknown`
for global symbols. This allows the code that does the mutation to type
check correctly, and for code that uses the global, it accurately
reflects the lack of knowledge about the type.

External modifications (or modifications through `global` statements)
that would require a wider type are relatively rare. From a practical
perspective, we can therefore achieve a better user experience by
trusting the inferred type. Users who need the external mutation to work
can always annotate the global with the wider type. And everyone else
benefits from more precise type inference.
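
To make the trade-off concrete, here is a minimal runtime sketch of the external mutation described in the quoted comment. The module name `settings` and the helper `describe` are hypothetical and exist only for this illustration; the module is built in memory so the snippet is self-contained.

```python
import types

# Hypothetical module source; `MY_CONSTANT` mirrors the example in the comment.
SOURCE = """
MY_CONSTANT = 1

def describe():
    return type(MY_CONSTANT).__name__
"""

# Build a throwaway module in memory instead of importing a real file.
settings = types.ModuleType("settings")
exec(SOURCE, settings.__dict__)

before = settings.describe()

# Code outside the module rebinds the global to a value of a different type.
settings.MY_CONSTANT = "some string"

after = settings.describe()

# The opt-out mentioned above would be an explicit annotation in the module,
# for example `MY_CONSTANT: int | str = 1`, so the wider type is declared.
```

Inside the module, `MY_CONSTANT` looks like an `int`, yet `describe()` observes a `str` after the external rebinding. With this change the inferred type no longer accounts for that case unless the global is annotated.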

I initially implemented this by applying literal promotion to the type
of the unannotated module globals (as suggested in
https://github.com/astral-sh/ty/issues/1069), but the ecosystem impact
showed a lot of problems (https://github.com/astral-sh/ruff/pull/20643).
I fixed/patched some of these problems, but this PR seems like a good
first step, and it seems sensible to apply the literal promotion change
in a second step that can be evaluated separately.

closes https://github.com/astral-sh/ty/issues/1069

## Ecosystem impact

This seems like an (unexpectedly large) net positive with 650 fewer
diagnostics overall, even though this change will certainly catch more
true positives.

* There are 666 removed `type-assertion-failure` diagnostics, where we
had previously inferred the correct type already, but removing the
`Unknown` now leads to an "exact" match.
* 1464 of the 1805 total new diagnostics are `unresolved-attribute`
errors, most (1365) of which were previously
`possibly-missing-attribute` errors. So they could also be counted as
"changed" diagnostics.
* For code that uses constants like
  ```py
  IS_PYTHON_AT_LEAST_3_10 = sys.version_info >= (3, 10)
  ```
where we would have previously inferred a type of `Literal[True/False] |
Unknown`, removing the `Unknown` now allows us to do reachability
analysis on branches that use these constants, and so we get a lot of
favorable ecosystem changes because of that.
* There is code like the following, where we previously emitted
`conflicting-argument-forms` diagnostics on calls to the aliased
`assert_type`, because its type was `Unknown | def …` (and the call to
`Unknown` "used" the type form argument in a non type-form way):
  ```py
  if sys.version_info >= (3, 11):
      import typing
  
      assert_type = typing.assert_type
  else:
      import typing_extensions
  
      assert_type = typing_extensions.assert_type
  ```
* ~100 new `invalid-argument-type` false positives, due to missing
`**kwargs` support (https://github.com/astral-sh/ty/issues/247)
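
As an illustration of the version-constant bullet above, here is a small sketch of code that branches on such a flag. The function `pair_up` is hypothetical; the point is only that, once the flag's type is a plain boolean literal without `Unknown`, a checker can treat one of the two branches as unreachable for the configured Python version.

```python
import sys

# Same shape as the constant quoted in the list above.
IS_PYTHON_AT_LEAST_3_10 = sys.version_info >= (3, 10)


def pair_up(xs, ys):
    """Zip two equal-length sequences, rejecting a length mismatch."""
    if IS_PYTHON_AT_LEAST_3_10:
        # `zip(..., strict=True)` only exists on Python 3.10 and later.
        return list(zip(xs, ys, strict=True))
    else:
        # Fallback for older interpreters: check the lengths by hand.
        if len(xs) != len(ys):
            raise ValueError("length mismatch")
        return list(zip(xs, ys))
```

Both branches behave the same at runtime, but only one of them is valid for any given interpreter version, which is why reachability analysis on the flag matters for the diagnostics reported inside each branch.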

## Typing conformance

```diff
+protocols_modules.py:25:1: error[invalid-assignment] Object of type `<module '_protocols_modules1'>` is not assignable to `Options1`
```

This diagnostic should apparently not be there, but it looks like we
also fail other tests in that file, so it seems to be a limitation that
was previously hidden by `Unknown` somehow.
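
For context, the diagnostic concerns a module object being assigned to a protocol-typed name, which PEP 544 allows when the module provides the protocol's members. The following is a rough runtime sketch of that pattern, not the contents of the conformance test: `Options`, `fake_options` and the attribute names are assumptions made up for this example.

```python
import types
from typing import Protocol, runtime_checkable


@runtime_checkable
class Options(Protocol):
    # Hypothetical data members that a conforming module must define.
    timeout: int
    one_flag: bool


# Stand-in for a real module that would define these names at top level.
fake_module = types.ModuleType("fake_options")
fake_module.timeout = 100
fake_module.one_flag = True

# The assignment form the diagnostic is about: a module used where the
# protocol type is expected.
op: Options = fake_module

# The runtime check only tests that the attributes exist, not their types.
matches = isinstance(fake_module, Options)
empty_matches = isinstance(types.ModuleType("empty"), Options)
```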

## Test Plan

Updated tests and relatively thorough ecosystem analysis.
2025-10-01 16:40:30 +02:00


Narrowing by assignment

Attribute

Basic

class A:
    x: int | None = None
    y = None

    def __init__(self):
        self.z = None

a = A()
a.x = 0
a.y = 0
a.z = 0

reveal_type(a.x)  # revealed: Literal[0]
reveal_type(a.y)  # revealed: Literal[0]
reveal_type(a.z)  # revealed: Literal[0]

# Make sure that we infer the narrowed type for eager
# scopes (class, comprehension) and the non-narrowed
# public type for lazy scopes (function)
class _:
    reveal_type(a.x)  # revealed: Literal[0]
    reveal_type(a.y)  # revealed: Literal[0]
    reveal_type(a.z)  # revealed: Literal[0]

[reveal_type(a.x) for _ in range(1)]  # revealed: Literal[0]
[reveal_type(a.y) for _ in range(1)]  # revealed: Literal[0]
[reveal_type(a.z) for _ in range(1)]  # revealed: Literal[0]

def _():
    reveal_type(a.x)  # revealed: int | None
    reveal_type(a.y)  # revealed: Unknown | None
    reveal_type(a.z)  # revealed: Unknown | None

if False:
    a = A()
reveal_type(a.x)  # revealed: Literal[0]
reveal_type(a.y)  # revealed: Literal[0]
reveal_type(a.z)  # revealed: Literal[0]

if True:
    a = A()
reveal_type(a.x)  # revealed: int | None
reveal_type(a.y)  # revealed: Unknown | None
reveal_type(a.z)  # revealed: Unknown | None

a.x = 0
a.y = 0
a.z = 0
reveal_type(a.x)  # revealed: Literal[0]
reveal_type(a.y)  # revealed: Literal[0]
reveal_type(a.z)  # revealed: Literal[0]

class _:
    a = A()
    reveal_type(a.x)  # revealed: int | None
    reveal_type(a.y)  # revealed: Unknown | None
    reveal_type(a.z)  # revealed: Unknown | None

def cond() -> bool:
    return True

class _:
    if False:
        a = A()
    reveal_type(a.x)  # revealed: Literal[0]
    reveal_type(a.y)  # revealed: Literal[0]
    reveal_type(a.z)  # revealed: Literal[0]

    if cond():
        a = A()
    reveal_type(a.x)  # revealed: int | None
    reveal_type(a.y)  # revealed: Unknown | None
    reveal_type(a.z)  # revealed: Unknown | None

class _:
    a = A()

    class Inner:
        reveal_type(a.x)  # revealed: int | None
        reveal_type(a.y)  # revealed: Unknown | None
        reveal_type(a.z)  # revealed: Unknown | None

a = A()
# error: [unresolved-attribute]
a.dynamically_added = 0
# error: [unresolved-attribute]
reveal_type(a.dynamically_added)  # revealed: Literal[0]

# error: [unresolved-reference]
does.nt.exist = 0
# error: [unresolved-reference]
reveal_type(does.nt.exist)  # revealed: Unknown
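
The eager versus lazy distinction tested above mirrors how Python executes scopes at runtime. A minimal sketch, using a hypothetical `Box` class that is not part of the test suite:

```python
class Box:
    value = None  # hypothetical attribute, used only for this sketch


box = Box()
box.value = 0


# A class body runs immediately, so it sees the value assigned just above.
class Eager:
    seen = box.value


# A comprehension is also evaluated right where it appears.
eager_list = [box.value for _ in range(1)]


# A function body runs only when called, possibly after later assignments.
def lazy():
    return box.value


box.value = "changed later"

eager_seen = Eager.seen
lazy_seen = lazy()
```

Because `lazy()` can be called at any later point, the narrowed type from the assignment is not guaranteed to hold inside it, which is consistent with the tests revealing the declared (public) type in function scopes.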

Narrowing chain

class D: ...

class C:
    d: D | None = None

class B:
    c1: C | None = None
    c2: C | None = None

class A:
    b: B | None = None

a = A()
a.b = B()
a.b.c1 = C()
a.b.c2 = C()
a.b.c1.d = D()
a.b.c2.d = D()
reveal_type(a.b)  # revealed: B
reveal_type(a.b.c1)  # revealed: C
reveal_type(a.b.c1.d)  # revealed: D

a.b.c1 = C()
reveal_type(a.b)  # revealed: B
reveal_type(a.b.c1)  # revealed: C
reveal_type(a.b.c1.d)  # revealed: D | None
reveal_type(a.b.c2.d)  # revealed: D

a.b.c1.d = D()
a.b = B()
reveal_type(a.b)  # revealed: B
reveal_type(a.b.c1)  # revealed: C | None
reveal_type(a.b.c2)  # revealed: C | None
# error: [possibly-missing-attribute]
reveal_type(a.b.c1.d)  # revealed: D | None
# error: [possibly-missing-attribute]
reveal_type(a.b.c2.d)  # revealed: D | None

Do not narrow the type of a property by assignment

class C:
    def __init__(self):
        self._x: int = 0

    @property
    def x(self) -> int:
        return self._x

    @x.setter
    def x(self, value: int) -> None:
        self._x = abs(value)

c = C()
c.x = -1
# Don't infer `c.x` to be `Literal[-1]`
reveal_type(c.x)  # revealed: int

Do not narrow the type of a descriptor by assignment

class Descriptor:
    def __get__(self, instance: object, owner: type) -> int:
        return 1

    def __set__(self, instance: object, value: int) -> None:
        pass

class C:
    desc: Descriptor = Descriptor()

c = C()
c.desc = -1
# Don't infer `c.desc` to be `Literal[-1]`
reveal_type(c.desc)  # revealed: int

Subscript

Specialization for builtin types

Type narrowing based on assignment to a subscript expression is generally unsound, because arbitrary __getitem__/__setitem__ methods on a class do not guarantee that the value passed to __setitem__ is stored and can be retrieved unmodified via __getitem__. Therefore, we currently only perform assignment-based narrowing on a few built-in classes (list, dict, bytearray, TypedDict and collections types) where we are confident that this kind of narrowing can be performed soundly. This is the same approach that pyright takes.
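
The round-trip property that makes narrowing sound can be checked at runtime. In this sketch, `Clamped` is a hypothetical container invented for illustration; it shows a `__setitem__` that does not store the value it was given, in contrast to a built-in list.

```python
class Clamped:
    """Hypothetical container whose __setitem__ clamps values to 0..10."""

    def __init__(self):
        self._items = {}

    def __setitem__(self, key, value):
        # The stored value may differ from the value that was passed in.
        self._items[key] = max(0, min(10, value))

    def __getitem__(self, key):
        return self._items[key]


builtin = [None]
builtin[0] = 99

custom = Clamped()
custom[0] = 99

# A list returns exactly what was assigned, so narrowing to the assigned
# value's type is safe. The custom container returns something else.
builtin_round_trips = builtin[0] == 99
custom_round_trips = custom[0] == 99
```

If a checker narrowed `custom[0]` to the literal that was assigned, it would claim the value is 99 while the container actually returns 10.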

from typing import TypedDict
from collections import ChainMap, defaultdict

l: list[int | None] = [None]
l[0] = 0
d: dict[int, int] = {1: 1}
d[0] = 0
b: bytearray = bytearray(b"abc")
b[0] = 0
dd: defaultdict[int, int] = defaultdict(int)
dd[0] = 0
cm: ChainMap[int, int] = ChainMap({1: 1}, {0: 0})
cm[0] = 0
reveal_type(cm)  # revealed: ChainMap[Unknown | int, Unknown | int]

reveal_type(l[0])  # revealed: Literal[0]
reveal_type(d[0])  # revealed: Literal[0]
reveal_type(b[0])  # revealed: Literal[0]
reveal_type(dd[0])  # revealed: Literal[0]
reveal_type(cm[0])  # revealed: Literal[0]

class C:
    reveal_type(l[0])  # revealed: Literal[0]
    reveal_type(d[0])  # revealed: Literal[0]
    reveal_type(b[0])  # revealed: Literal[0]
    reveal_type(dd[0])  # revealed: Literal[0]
    reveal_type(cm[0])  # revealed: Literal[0]

[reveal_type(l[0]) for _ in range(1)]  # revealed: Literal[0]
[reveal_type(d[0]) for _ in range(1)]  # revealed: Literal[0]
[reveal_type(b[0]) for _ in range(1)]  # revealed: Literal[0]
[reveal_type(dd[0]) for _ in range(1)]  # revealed: Literal[0]
[reveal_type(cm[0]) for _ in range(1)]  # revealed: Literal[0]

def _():
    reveal_type(l[0])  # revealed: int | None
    reveal_type(d[0])  # revealed: int
    reveal_type(b[0])  # revealed: int
    reveal_type(dd[0])  # revealed: int
    reveal_type(cm[0])  # revealed: int

class D(TypedDict):
    x: int
    label: str

td = D(x=1, label="a")
td["x"] = 0
reveal_type(td["x"])  # revealed: Literal[0]

# error: [unresolved-reference]
does["not"]["exist"] = 0
# error: [unresolved-reference]
reveal_type(does["not"]["exist"])  # revealed: Unknown

non_subscriptable = 1
# error: [invalid-assignment]
non_subscriptable[0] = 0
# error: [non-subscriptable]
reveal_type(non_subscriptable[0])  # revealed: Unknown

No narrowing for custom classes with arbitrary __getitem__ / __setitem__

class C:
    def __init__(self):
        self.l: list[str] = []

    def __getitem__(self, index: int) -> str:
        return self.l[index]

    def __setitem__(self, index: int, value: str | int) -> None:
        if len(self.l) == index:
            self.l.append(str(value))
        else:
            self.l[index] = str(value)

c = C()
c[0] = 0
reveal_type(c[0])  # revealed: str

Complex target

class A:
    x: list[int | None] = []

class B:
    a: A | None = None

b = B()
b.a = A()
b.a.x[0] = 0

reveal_type(b.a.x[0])  # revealed: Literal[0]

class C:
    reveal_type(b.a.x[0])  # revealed: Literal[0]

def _():
    # error: [possibly-missing-attribute]
    reveal_type(b.a.x[0])  # revealed: int | None
    # error: [possibly-missing-attribute]
    reveal_type(b.a.x)  # revealed: list[int | None]
    reveal_type(b.a)  # revealed: A | None

Invalid assignments are not used for narrowing

class C:
    x: int | None
    l: list[int]

def f(c: C, s: str):
    c.x = s  # error: [invalid-assignment]
    reveal_type(c.x)  # revealed: int | None
    s = c.x  # error: [invalid-assignment]

    # error: [invalid-assignment] "Method `__setitem__` of type `Overload[(key: SupportsIndex, value: int, /) -> None, (key: slice[Any, Any, Any], value: Iterable[int], /) -> None]` cannot be called with a key of type `Literal[0]` and a value of type `str` on object of type `list[int]`"
    c.l[0] = s
    reveal_type(c.l[0])  # revealed: int