
## Summary

Quoting from the newly added comment:

> Module-level globals can be mutated externally. A `MY_CONSTANT = 1` global might be changed to `"some string"` from code outside of the module that we're looking at, and so from a gradual-guarantee perspective, it makes sense to infer a type of `Literal[1] | Unknown` for global symbols. This allows the code that does the mutation to type check correctly, and for code that uses the global, it accurately reflects the lack of knowledge about the type.
>
> External modifications (or modifications through `global` statements) that would require a wider type are relatively rare. From a practical perspective, we can therefore achieve a better user experience by trusting the inferred type. Users who need the external mutation to work can always annotate the global with the wider type. And everyone else benefits from more precise type inference.

I initially implemented this by applying literal promotion to the type of the unannotated module globals (as suggested in https://github.com/astral-sh/ty/issues/1069), but the ecosystem impact showed a lot of problems (https://github.com/astral-sh/ruff/pull/20643). I fixed/patched some of these problems, but this PR seems like a good first step, and it seems sensible to apply the literal promotion change in a second step that can be evaluated separately.

closes https://github.com/astral-sh/ty/issues/1069

## Ecosystem impact

This seems like an (unexpectedly large) net positive with 650 fewer diagnostics overall, even though this change will certainly catch more true positives.

* There are 666 removed `type-assertion-failure` diagnostics, where we previously used the correct type already, but removing the `Unknown` now leads to an "exact" match.
* 1464 of the 1805 total new diagnostics are `unresolved-attribute` errors, most (1365) of which were previously `possibly-missing-attribute` errors. So they could also be counted as "changed" diagnostics.
* For code that uses constants like
  ```py
  IS_PYTHON_AT_LEAST_3_10 = sys.version_info >= (3, 10)
  ```
  where we would have previously inferred a type of `Literal[True/False] | Unknown`, removing the `Unknown` now allows us to do reachability analysis on branches that use these constants, and so we get a lot of favorable ecosystem changes because of that.
* There is code like the following, where we previously emitted `conflicting-argument-forms` diagnostics on calls to the aliased `assert_type`, because its type was `Unknown | def …` (and the call to `Unknown` "used" the type form argument in a non type-form way):
  ```py
  if sys.version_info >= (3, 11):
      import typing

      assert_type = typing.assert_type
  else:
      import typing_extensions

      assert_type = typing_extensions.assert_type
  ```
* ~100 new `invalid-argument-type` false positives, due to missing `**kwargs` support (https://github.com/astral-sh/ty/issues/247)

## Typing conformance

```diff
+protocols_modules.py:25:1: error[invalid-assignment] Object of type `<module '_protocols_modules1'>` is not assignable to `Options1`
```

This diagnostic should apparently not be there, but it looks like we also fail other tests in that file, so it seems to be a limitation that was previously hidden by `Unknown` somehow.

## Test Plan

Updated tests and relatively thorough ecosystem analysis.
# Eager scopes
Some scopes are executed eagerly: references to variables defined in enclosing scopes are resolved immediately. This is in contrast to (for instance) function scopes, where those references are resolved when the function is called.
## Function definitions

Function definitions are evaluated lazily.

```py
x = 1

def f():
    reveal_type(x)  # revealed: Literal[1, 2]

x = 2
```
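The union `Literal[1, 2]` mirrors what happens at runtime: the name is looked up each time the function is called, so either binding may be observed. A small runtime sketch, using a closure instead of a module global and a hypothetical helper name `lazy_lookup_demo`:

```python
def lazy_lookup_demo():
    x = 1

    def f():
        # `x` is looked up when `f` is called, not when `f` is defined.
        return x

    first = f()   # called while x is 1
    x = 2
    second = f()  # called after the rebinding
    return first, second
```

Calling `f` before and after the rebinding observes both values, which is why neither can be ruled out statically.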
## Class definitions

Class definitions are evaluated eagerly.

```py
def _():
    x = 1

    class A:
        reveal_type(x)  # revealed: Literal[1]

        y = x

    x = 2

    reveal_type(A.y)  # revealed: Unknown | Literal[1]
```
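For comparison, here is a runtime sketch of the same situation. The helper name `eager_class_demo` is an assumption for illustration only.

```python
def eager_class_demo():
    x = 1

    class A:
        # The class body executes right here, while `x` is still 1.
        y = x

    x = 2
    # `A.y` keeps the value captured when the class body ran.
    return A.y, x
```

The class attribute is fixed when the class statement executes, so the later `x = 2` cannot affect it.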
## List comprehensions

List comprehensions are evaluated eagerly.

```py
def _():
    x = 1

    # revealed: Literal[1]
    [reveal_type(x) for a in range(1)]

    x = 2
```

## Set comprehensions

Set comprehensions are evaluated eagerly.

```py
def _():
    x = 1

    # revealed: Literal[1]
    {reveal_type(x) for a in range(1)}

    x = 2
```

## Dict comprehensions

Dict comprehensions are evaluated eagerly.

```py
def _():
    x = 1

    # revealed: Literal[1]
    {a: reveal_type(x) for a in range(1)}

    x = 2
```
## Generator expressions

Generator expressions don't necessarily run eagerly, but in practice they usually do, so assuming they do is the better default.

```py
def _():
    x = 1

    # revealed: Literal[1]
    list(reveal_type(x) for a in range(1))

    x = 2
```

But that does lead to incorrect results when the generator expression isn't run immediately:

```py
def evaluated_later():
    x = 1

    # revealed: Literal[1]
    y = (reveal_type(x) for a in range(1))

    x = 2

    # The generator isn't evaluated until here, so at runtime, `x` will evaluate to 2, contradicting
    # our inferred type.
    print(next(y))
```

Though note that “the iterable expression in the leftmost `for` clause is immediately evaluated” [spec]:

```py
def iterable_evaluated_eagerly():
    x = 1

    # revealed: Literal[1]
    y = (a for a in [reveal_type(x)])

    x = 2

    # Even though the generator isn't evaluated until here, the first iterable was evaluated
    # immediately, so our inferred type is correct.
    print(next(y))
```
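The two generator cases above can be checked side by side at runtime. This is only a sketch, and the helper name `generator_timing_demo` is invented for the example.

```python
def generator_timing_demo():
    x = 1

    # The element expression `x` runs only when the generator is advanced.
    body_gen = (x for _ in range(1))
    # The leftmost iterable `[x]` is built immediately, while `x` is still 1.
    iterable_gen = (a for a in [x])

    x = 2
    return next(body_gen), next(iterable_gen)
```

The first generator observes the rebinding and the second does not, matching the comments in the two examples above.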
## Top-level eager scopes
All of the above examples behave identically when the eager scopes are directly nested in the global scope.
### Class definitions

```py
x = 1

class A:
    reveal_type(x)  # revealed: Literal[1]

    y = x

x = 2

reveal_type(A.y)  # revealed: Unknown | Literal[1]
```

### List comprehensions

```py
x = 1

# revealed: Literal[1]
[reveal_type(x) for a in range(1)]

x = 2

# error: [unresolved-reference]
[y for a in range(1)]
y = 1
```

### Set comprehensions

```py
x = 1

# revealed: Literal[1]
{reveal_type(x) for a in range(1)}

x = 2

# error: [unresolved-reference]
{y for a in range(1)}
y = 1
```

### Dict comprehensions

```py
x = 1

# revealed: Literal[1]
{a: reveal_type(x) for a in range(1)}

x = 2

# error: [unresolved-reference]
{a: y for a in range(1)}
y = 1
```
### Generator expressions

```py
x = 1

# revealed: Literal[1]
list(reveal_type(x) for a in range(1))

x = 2

# error: [unresolved-reference]
list(y for a in range(1))
y = 1
```

`evaluated_later.py`:

```py
x = 1

# revealed: Literal[1]
y = (reveal_type(x) for a in range(1))

x = 2

# The generator isn't evaluated until here, so at runtime, `x` will evaluate to 2, contradicting
# our inferred type.
print(next(y))
```

`iterable_evaluated_eagerly.py`:

```py
x = 1

# revealed: Literal[1]
y = (a for a in [reveal_type(x)])

x = 2

# Even though the generator isn't evaluated until here, the first iterable was evaluated
# immediately, so our inferred type is correct.
print(next(y))
```
## Lazy scopes are "sticky"
As we look through each enclosing scope when resolving a reference, lookups become lazy as soon as we encounter any lazy scope, even if there are other eager scopes that enclose it.
### Eager scope within eager scope

If we don't encounter a lazy scope, lookup remains eager. The resolved binding is not necessarily in the immediately enclosing scope. Here, the list comprehension and class definition are both eager scopes, and we immediately resolve the use of `x` to (only) the `x = 1` binding.

```py
def _():
    x = 1

    class A:
        # revealed: Literal[1]
        [reveal_type(x) for a in range(1)]

    x = 2
```
### Class definition bindings are not visible in nested scopes

Class definitions are eager scopes, but any bindings in them are explicitly not visible to any nested scopes. (Those nested scopes are typically (lazy) function definitions, but the rule also applies to nested eager scopes like comprehensions and other class definitions.)

```py
def _():
    x = 1

    class A:
        x = 4

        # revealed: Literal[1]
        [reveal_type(x) for a in range(1)]

        class B:
            # revealed: Literal[1]
            [reveal_type(x) for a in range(1)]

    x = 2

x = 1

def _():
    class C:
        # revealed: Literal[1]
        [reveal_type(x) for _ in [1]]
        x = 2
```
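The visibility rule for class bindings also holds at runtime, as the following sketch suggests. The helper name `class_scope_demo` and the attribute `seen` are assumptions made for this example.

```python
def class_scope_demo():
    x = 1

    class A:
        x = 4
        # The comprehension is a nested scope: it skips the class-level `x`
        # and resolves `x` in the enclosing function instead.
        seen = [x for _ in range(1)]

    return A.x, A.seen
```

The class attribute is `4`, yet the comprehension inside the same class body sees the function's `1`.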
### Eager scope within a lazy scope

The list comprehension is an eager scope, and it is enclosed within a function definition, which is a lazy scope. Because we pass through this lazy scope before encountering any bindings or definitions, the lookup is lazy.

```py
def _():
    x = 1

    def f():
        # revealed: Literal[1, 2]
        [reveal_type(x) for a in range(1)]

    x = 2
```

### Lazy scope within an eager scope

The function definition is a lazy scope, and it is enclosed within a class definition, which is an eager scope. Even though we pass through an eager scope before encountering any bindings or definitions, the lookup remains lazy.

```py
def _():
    x = 1

    class A:
        def f():
            # revealed: Literal[1, 2]
            reveal_type(x)

    x = 2
```

### Lazy scope within a lazy scope

No matter how many lazy scopes we pass through before encountering a binding or definition, the lookup remains lazy.

```py
def _():
    x = 1

    def f():
        def g():
            # revealed: Literal[1, 2]
            reveal_type(x)

    x = 2
```
### Eager scope within a lazy scope within another eager scope

We have a list comprehension (eager scope), enclosed within a function definition (lazy scope), enclosed within a class definition (eager scope), all of which we must pass through before encountering any binding of `x`. Even though the last scope we pass through is eager, the lookup is lazy, since we encountered a lazy scope on the way.

```py
def _():
    x = 1

    class A:
        def f():
            # revealed: Literal[1, 2]
            [reveal_type(x) for a in range(1)]

    x = 2
```
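A runtime sketch of this "sticky" behavior is shown below. The helper name `sticky_lazy_demo` is hypothetical, and `@staticmethod` is added only so the method can be called without an instance.

```python
def sticky_lazy_demo():
    x = 1

    class A:
        @staticmethod
        def f():
            # The comprehension itself runs eagerly, but only once `f` is called.
            return [x for _ in range(1)]

    x = 2
    return A.f()
```

Because the function body is deferred until the call, the comprehension inside it observes the later binding even though the enclosing class body ran while `x` was `1`.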
## Annotations
Type annotations are sometimes deferred. When they are, the types that are referenced in an annotation are looked up lazily, even if they occur in an eager scope.
### Eager annotations in a Python file

```py
from typing import ClassVar

x = int

class C:
    var: ClassVar[x]

reveal_type(C.var)  # revealed: int

x = str
```

### Deferred annotations in a Python file

```py
from __future__ import annotations

from typing import ClassVar

x = int

class C:
    var: ClassVar[x]

reveal_type(C.var)  # revealed: int | str

x = str
```

### Deferred annotations in a stub file

```pyi
from typing import ClassVar

x = int

class C:
    var: ClassVar[x]

# TODO: should ideally be `str`, but we currently consider all reachable bindings
reveal_type(C.var)  # revealed: int | str

x = str
```
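The effect of deferral can also be observed at runtime. The sketch below uses a string annotation as a stand-in for `from __future__ import annotations` (both store the annotation unevaluated) and resolves it later with `typing.get_type_hints`. The helper name `deferred_annotation_demo` is made up for illustration.

```python
from typing import get_type_hints


def deferred_annotation_demo():
    x = int

    class C:
        # A string annotation is stored as-is, not evaluated at class creation.
        var: "x"

    x = str
    # Resolution happens only now, against the names visible at this point.
    return get_type_hints(C, localns=locals())["var"]
```

By the time the annotation is resolved, `x` has been rebound, so the later binding is the one that is seen.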
## Annotation scopes

```toml
[environment]
python-version = "3.12"
```

### Type alias annotation scopes are lazy

```py
type Foo = Bar

class Bar:
    pass

def _(x: Foo):
    if isinstance(x, Bar):
        reveal_type(x)  # revealed: Bar
    else:
        reveal_type(x)  # revealed: Never
```

### Type-param scopes are eager, but bounds/constraints are deferred

```py
# error: [unresolved-reference]
class D[T](Bar):
    pass

class E[T: Bar]:
    pass

# error: [unresolved-reference]
def g[T](x: Bar):
    pass

def h[T: Bar](x: T):
    pass

class Bar:
    pass
```