[ty] No union with Unknown for module-global symbols (#20664)

## Summary

Quoting from the newly added comment:

Module-level globals can be mutated externally. A `MY_CONSTANT = 1`
global might be changed to `"some string"` from code outside of the
module that we're looking at, and so from a gradual-guarantee
perspective, it makes sense to infer a type of `Literal[1] | Unknown`
for global symbols. This allows the code that does the mutation to type
check correctly, and for code that uses the global, it accurately
reflects the lack of knowledge about the type.

External modifications (or modifications through `global` statements)
that would require a wider type are relatively rare. From a practical
perspective, we can therefore achieve a better user experience by
trusting the inferred type. Users who need the external mutation to work
can always annotate the global with the wider type. And everyone else
benefits from more precise type inference.

I initially implemented this by applying literal promotion to the type
of the unannotated module globals (as suggested in
https://github.com/astral-sh/ty/issues/1069), but the ecosystem impact
showed a lot of problems (https://github.com/astral-sh/ruff/pull/20643).
I fixed/patched some of these problems, but this PR seems like a good
first step, and it seems sensible to apply the literal promotion change
in a second step that can be evaluated separately.

closes https://github.com/astral-sh/ty/issues/1069

## Ecosystem impact

This seems like an (unexpectedly large) net positive with 650 fewer
diagnostics overall.. even though this change will certainly catch more
true positives.

* There are 666 removed `type-assertion-failure` diagnostics, where we
were previously used the correct type already, but removing the
`Unknown` now leads to an "exact" match.
* 1464 of the 1805 total new diagnostics are `unresolved-attribute`
errors, most (1365) of which were previously
`possibly-missing-attribute` errors. So they could also be counted as
"changed" diagnostics.
* For code that uses constants like
  ```py
  IS_PYTHON_AT_LEAST_3_10 = sys.version_info >= (3, 10)
  ```
where we would have previously inferred a type of `Literal[True/False] |
Unknown`, removing the `Unknown` now allows us to do reachability
analysis on branches that use these constants, and so we get a lot of
favorable ecosystem changes because of that.
* There is code like the following, where we previously emitted
`conflicting-argument-forms` diagnostics on calls to the aliased
`assert_type`, because its type was `Unknown | def …` (and the call to
`Unknown` "used" the type form argument in a non type-form way):
  ```py
  if sys.version_info >= (3, 11):
      import typing
  
      assert_type = typing.assert_type
  else:
      import typing_extensions
  
      assert_type = typing_extensions.assert_type
  ```
* ~100 new `invalid-argument-type` false positives, due to missing
`**kwargs` support (https://github.com/astral-sh/ty/issues/247)

## Typing conformance

```diff
+protocols_modules.py:25:1: error[invalid-assignment] Object of type `<module '_protocols_modules1'>` is not assignable to `Options1`
```

This diagnostic should apparently not be there, but it looks like we
also fail other tests in that file, so it seems to be a limitation that
was previously hidden by `Unknown` somehow.

## Test Plan

Updated tests and relatively thorough ecosystem analysis.
This commit is contained in:
David Peter 2025-10-01 16:40:30 +02:00 committed by GitHub
parent eb34d12151
commit 71d711257a
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
16 changed files with 82 additions and 92 deletions

View file

@ -108,7 +108,7 @@ def foo():
global x
def bar():
# allowed, refers to `x` in the global scope
reveal_type(x) # revealed: Unknown | Literal[1]
reveal_type(x) # revealed: Literal[1]
bar()
del x # allowed, deletes `x` in the global scope (though we don't track that)
```

View file

@ -25,8 +25,8 @@ reveal_type(y)
# error: [possibly-missing-import] "Member `y` of module `maybe_unbound` may be missing"
from maybe_unbound import x, y
reveal_type(x) # revealed: Unknown | Literal[3]
reveal_type(y) # revealed: Unknown | Literal[3]
reveal_type(x) # revealed: Literal[3]
reveal_type(y) # revealed: Literal[3]
```
## Maybe unbound annotated
@ -56,7 +56,7 @@ Importing an annotated name prefers the declared type over the inferred type:
# error: [possibly-missing-import] "Member `y` of module `maybe_unbound_annotated` may be missing"
from maybe_unbound_annotated import x, y
reveal_type(x) # revealed: Unknown | Literal[3]
reveal_type(x) # revealed: Literal[3]
reveal_type(y) # revealed: int
```

View file

@ -783,8 +783,8 @@ class A: ...
```py
from subexporter import *
# TODO: Should we avoid including `Unknown` for this case?
reveal_type(__all__) # revealed: Unknown | list[Unknown | str]
# TODO: we could potentially infer `list[str] | tuple[str, ...]` here
reveal_type(__all__) # revealed: list[Unknown | str]
__all__.append("B")

View file

@ -40,7 +40,7 @@ def __getattr__(name: str) -> int:
import mixed_module
# Explicit attribute should take precedence
reveal_type(mixed_module.explicit_attr) # revealed: Unknown | Literal["explicit"]
reveal_type(mixed_module.explicit_attr) # revealed: Literal["explicit"]
# `__getattr__` should handle unknown attributes
reveal_type(mixed_module.dynamic_attr) # revealed: str

View file

@ -103,7 +103,7 @@ x = "namespace"
```py
from foo import x
reveal_type(x) # revealed: Unknown | Literal["module"]
reveal_type(x) # revealed: Literal["module"]
import foo.bar # error: [unresolved-import]
```

View file

@ -144,8 +144,8 @@ X = (Y := 3) + 4
```py
from exporter import *
reveal_type(X) # revealed: Unknown | Literal[7]
reveal_type(Y) # revealed: Unknown | Literal[3]
reveal_type(X) # revealed: Literal[7]
reveal_type(Y) # revealed: Literal[3]
```
### Global-scope symbols defined in many other ways
@ -781,9 +781,9 @@ else:
from exporter import *
# error: [possibly-unresolved-reference]
reveal_type(A) # revealed: Unknown | Literal[1]
reveal_type(A) # revealed: Literal[1]
reveal_type(B) # revealed: Unknown | Literal[2, 3]
reveal_type(B) # revealed: Literal[2, 3]
```
### Reachability constraints in the importing module
@ -804,7 +804,7 @@ if coinflip():
from exporter import *
# error: [possibly-unresolved-reference]
reveal_type(A) # revealed: Unknown | Literal[1]
reveal_type(A) # revealed: Literal[1]
```
### Reachability constraints in the exporting module *and* the importing module

View file

@ -34,7 +34,7 @@ class _:
[reveal_type(a.z) for _ in range(1)] # revealed: Literal[0]
def _():
reveal_type(a.x) # revealed: Unknown | int | None
reveal_type(a.x) # revealed: int | None
reveal_type(a.y) # revealed: Unknown | None
reveal_type(a.z) # revealed: Unknown | None
@ -75,7 +75,7 @@ class _:
if cond():
a = A()
reveal_type(a.x) # revealed: int | None | Unknown
reveal_type(a.x) # revealed: int | None
reveal_type(a.y) # revealed: Unknown | None
reveal_type(a.z) # revealed: Unknown | None
@ -295,10 +295,10 @@ class C:
def _():
# error: [possibly-missing-attribute]
reveal_type(b.a.x[0]) # revealed: Unknown | int | None
reveal_type(b.a.x[0]) # revealed: int | None
# error: [possibly-missing-attribute]
reveal_type(b.a.x) # revealed: Unknown | list[int | None]
reveal_type(b.a) # revealed: Unknown | A | None
reveal_type(b.a.x) # revealed: list[int | None]
reveal_type(b.a) # revealed: A | None
```
## Invalid assignments are not used for narrowing

View file

@ -167,11 +167,11 @@ if c.x is not None:
if c.x is not None:
def _():
reveal_type(c.x) # revealed: Unknown | int | None
reveal_type(c.x) # revealed: int | None
def _():
if c.x is not None:
reveal_type(c.x) # revealed: (Unknown & ~None) | int
reveal_type(c.x) # revealed: int
```
## Subscript narrowing

View file

@ -86,7 +86,7 @@ class B:
reveal_type(a.x) # revealed: Literal["a"]
def f():
reveal_type(a.x) # revealed: Unknown | str | None
reveal_type(a.x) # revealed: str | None
[reveal_type(a.x) for _ in range(1)] # revealed: Literal["a"]
@ -96,7 +96,7 @@ class C:
reveal_type(a.x) # revealed: str | None
def g():
reveal_type(a.x) # revealed: Unknown | str | None
reveal_type(a.x) # revealed: str | None
[reveal_type(a.x) for _ in range(1)] # revealed: str | None
@ -109,7 +109,7 @@ class D:
reveal_type(a.x) # revealed: Literal["a"]
def h():
reveal_type(a.x) # revealed: Unknown | str | None
reveal_type(a.x) # revealed: str | None
# TODO: should be `str | None`
[reveal_type(a.x) for _ in range(1)] # revealed: Literal["a"]
@ -190,7 +190,7 @@ def f(x: str | None):
reveal_type(g) # revealed: str
if a.x is not None:
reveal_type(a.x) # revealed: (Unknown & ~None) | str
reveal_type(a.x) # revealed: str
if l[0] is not None:
reveal_type(l[0]) # revealed: str
@ -206,7 +206,7 @@ def f(x: str | None):
reveal_type(g) # revealed: str
if a.x is not None:
reveal_type(a.x) # revealed: (Unknown & ~None) | str
reveal_type(a.x) # revealed: str
if l[0] is not None:
reveal_type(l[0]) # revealed: str
@ -382,12 +382,12 @@ def f():
if a.x is not None:
def _():
# Lazy nested scope narrowing is not performed on attributes/subscripts because it's difficult to track their changes.
reveal_type(a.x) # revealed: Unknown | str | None
reveal_type(a.x) # revealed: str | None
class D:
reveal_type(a.x) # revealed: (Unknown & ~None) | str
reveal_type(a.x) # revealed: str
[reveal_type(a.x) for _ in range(1)] # revealed: (Unknown & ~None) | str
[reveal_type(a.x) for _ in range(1)] # revealed: str
if l[0] is not None:
def _():
@ -473,11 +473,11 @@ def f():
if a.x is not None:
def _():
if a.x != 1:
reveal_type(a.x) # revealed: (Unknown & ~Literal[1]) | str | None
reveal_type(a.x) # revealed: str | None
class D:
if a.x != 1:
reveal_type(a.x) # revealed: (Unknown & ~Literal[1] & ~None) | str
reveal_type(a.x) # revealed: str
if l[0] is not None:
def _():

View file

@ -263,7 +263,7 @@ if flag():
x = 1
def f() -> None:
reveal_type(x) # revealed: Unknown | Literal[1, 2]
reveal_type(x) # revealed: Literal[1, 2]
# Function only used inside this branch
f()

View file

@ -29,8 +29,8 @@ if flag():
chr: int = 1
def _():
# TODO: Should ideally be `Unknown | Literal[1] | (def abs(x: SupportsAbs[_T], /) -> _T)`
reveal_type(abs) # revealed: Unknown | Literal[1]
# TODO: Should ideally be `Literal[1] | (def abs(x: SupportsAbs[_T], /) -> _T)`
reveal_type(abs) # revealed: Literal[1]
# TODO: Should ideally be `int | (def chr(i: SupportsIndex, /) -> str)`
reveal_type(chr) # revealed: int
```

View file

@ -12,7 +12,7 @@ Function definitions are evaluated lazily.
x = 1
def f():
reveal_type(x) # revealed: Unknown | Literal[1, 2]
reveal_type(x) # revealed: Literal[1, 2]
x = 2
```
@ -283,7 +283,7 @@ x = 1
def _():
class C:
# revealed: Unknown | Literal[1]
# revealed: Literal[1]
[reveal_type(x) for _ in [1]]
x = 2
```
@ -389,7 +389,7 @@ x = int
class C:
var: ClassVar[x]
reveal_type(C.var) # revealed: Unknown | int | str
reveal_type(C.var) # revealed: int | str
x = str
```

View file

@ -8,7 +8,7 @@ A name reference to a never-defined symbol in a function is implicitly a global
x = 1
def f():
reveal_type(x) # revealed: Unknown | Literal[1]
reveal_type(x) # revealed: Literal[1]
```
## Explicit global in function
@ -18,7 +18,7 @@ x = 1
def f():
global x
reveal_type(x) # revealed: Unknown | Literal[1]
reveal_type(x) # revealed: Literal[1]
```
## Unassignable type in function
@ -201,7 +201,7 @@ x = 42
def f():
global x
reveal_type(x) # revealed: Unknown | Literal[42]
reveal_type(x) # revealed: Literal[42]
x = "56"
reveal_type(x) # revealed: Literal["56"]
```

View file

@ -73,10 +73,10 @@ __spec__ = 42 # error: [invalid-assignment] "Object of type `Literal[42]` is no
```py
import module
reveal_type(module.__file__) # revealed: Unknown | None
reveal_type(module.__file__) # revealed: None
reveal_type(module.__path__) # revealed: list[str]
reveal_type(module.__doc__) # revealed: Unknown
reveal_type(module.__spec__) # revealed: Unknown | ModuleSpec | None
reveal_type(module.__spec__) # revealed: ModuleSpec | None
# error: [unresolved-attribute]
reveal_type(module.__warningregistry__) # revealed: Unknown

View file

@ -810,7 +810,7 @@ fn place_by_id<'db>(
// modified externally, but those changes do not take effect. We therefore issue
// a diagnostic if we see it being modified externally. In type inference, we
// can assign a "narrow" type to it even if it is not *declared*. This means, we
// do not have to call [`widen_type_for_undeclared_public_symbol`].
// do not have to union with `Unknown`.
//
// `TYPE_CHECKING` is a special variable that should only be assigned `False`
// at runtime, but is always considered `True` in type checking.
@ -822,18 +822,37 @@ fn place_by_id<'db>(
)
});
if scope.file(db).is_stub(db) || scope.scope(db).visibility().is_private() {
// We generally trust module-level undeclared places in stubs and do not union
// with `Unknown`. If we don't do this, simple aliases like `IOError = OSError` in
// stubs would result in `IOError` being a union of `OSError` and `Unknown`, which
// leads to all sorts of downstream problems. Similarly, type variables are often
// defined as `_T = TypeVar("_T")`, without being declared.
// Also, if the scope is private, such as a function scope,
// meaning that the place cannot be rewritten from elsewhere, we do not union with `Unknown`.
// Module-level globals can be mutated externally. A `MY_CONSTANT = 1` global might
// be changed to `"some string"` from code outside of the module that we're looking
// at, and so from a gradual-guarantee perspective, it makes sense to infer a type
// of `Literal[1] | Unknown` for global symbols. This allows the code that does the
// mutation to type check correctly, and for code that uses the global, it accurately
// reflects the lack of knowledge about the type.
//
// However, external modifications (or modifications through `global` statements) that
// would require a wider type are relatively rare. From a practical perspective, we can
// therefore achieve a better user experience by trusting the inferred type. Users who
// need the external mutation to work can always annotate the global with the wider
// type. And everyone else benefits from more precise type inference.
let is_module_global = scope.node(db).scope_kind().is_module();
// If the visibility of the scope is private (like for a function scope), we also do
// not union with `Unknown`, because the symbol cannot be modified externally.
let scope_has_private_visibility = scope.scope(db).visibility().is_private();
// We generally trust undeclared places in stubs and do not union with `Unknown`.
let in_stub_file = scope.file(db).is_stub(db);
if is_considered_non_modifiable
|| is_module_global
|| scope_has_private_visibility
|| in_stub_file
{
inferred.into()
} else {
widen_type_for_undeclared_public_symbol(db, inferred, is_considered_non_modifiable)
// Widen the inferred type of undeclared public symbols by unioning with `Unknown`
inferred
.map_type(|ty| UnionType::from_elements(db, [Type::unknown(), ty]))
.into()
}
}
@ -1585,29 +1604,6 @@ pub(crate) enum BoundnessAnalysis {
BasedOnUnboundVisibility,
}
/// Computes a possibly-widened type `Unknown | T_inferred` from the inferred type `T_inferred`
/// of a symbol, unless the type is a known-instance type (e.g. `typing.Any`) or the symbol is
/// considered non-modifiable (e.g. when the symbol is `@Final`). We need this for public uses
/// of symbols that have no declared type.
fn widen_type_for_undeclared_public_symbol<'db>(
db: &'db dyn Db,
inferred: Place<'db>,
is_considered_non_modifiable: bool,
) -> Place<'db> {
// We special-case known-instance types here since symbols like `typing.Any` are typically
// not declared in the stubs (e.g. `Any = object()`), but we still want to treat them as
// such.
let is_known_instance = inferred
.ignore_possibly_unbound()
.is_some_and(|ty| matches!(ty, Type::SpecialForm(_) | Type::KnownInstance(_)));
if is_considered_non_modifiable || is_known_instance {
inferred
} else {
inferred.map_type(|ty| UnionType::from_elements(db, [Type::unknown(), ty]))
}
}
#[cfg(test)]
mod tests {
use super::*;

View file

@ -5,7 +5,7 @@ use crate::place::{ConsideredDefinitions, Place, global_symbol};
use crate::semantic_index::definition::Definition;
use crate::semantic_index::scope::FileScopeId;
use crate::semantic_index::{global_scope, place_table, semantic_index, use_def_map};
use crate::types::{KnownClass, KnownInstanceType, UnionType, check_types};
use crate::types::{KnownClass, KnownInstanceType, check_types};
use ruff_db::diagnostic::Diagnostic;
use ruff_db::files::{File, system_path_to_file};
use ruff_db::system::DbWithWritableSystem as _;
@ -424,7 +424,7 @@ fn dependency_implicit_instance_attribute() -> anyhow::Result<()> {
let file_main = system_path_to_file(&db, "/src/main.py").unwrap();
let attr_ty = global_symbol(&db, file_main, "x").place.expect_type();
assert_eq!(attr_ty.display(&db).to_string(), "Unknown | int | None");
assert_eq!(attr_ty.display(&db).to_string(), "int | None");
// Change the type of `attr` to `str | None`; this should trigger the type of `x` to be re-inferred
db.write_dedented(
@ -439,7 +439,7 @@ fn dependency_implicit_instance_attribute() -> anyhow::Result<()> {
let events = {
db.clear_salsa_events();
let attr_ty = global_symbol(&db, file_main, "x").place.expect_type();
assert_eq!(attr_ty.display(&db).to_string(), "Unknown | str | None");
assert_eq!(attr_ty.display(&db).to_string(), "str | None");
db.take_salsa_events()
};
assert_function_query_was_run(
@ -463,7 +463,7 @@ fn dependency_implicit_instance_attribute() -> anyhow::Result<()> {
let events = {
db.clear_salsa_events();
let attr_ty = global_symbol(&db, file_main, "x").place.expect_type();
assert_eq!(attr_ty.display(&db).to_string(), "Unknown | str | None");
assert_eq!(attr_ty.display(&db).to_string(), "str | None");
db.take_salsa_events()
};
@ -514,7 +514,7 @@ fn dependency_own_instance_member() -> anyhow::Result<()> {
let file_main = system_path_to_file(&db, "/src/main.py").unwrap();
let attr_ty = global_symbol(&db, file_main, "x").place.expect_type();
assert_eq!(attr_ty.display(&db).to_string(), "Unknown | int | None");
assert_eq!(attr_ty.display(&db).to_string(), "int | None");
// Change the type of `attr` to `str | None`; this should trigger the type of `x` to be re-inferred
db.write_dedented(
@ -531,7 +531,7 @@ fn dependency_own_instance_member() -> anyhow::Result<()> {
let events = {
db.clear_salsa_events();
let attr_ty = global_symbol(&db, file_main, "x").place.expect_type();
assert_eq!(attr_ty.display(&db).to_string(), "Unknown | str | None");
assert_eq!(attr_ty.display(&db).to_string(), "str | None");
db.take_salsa_events()
};
assert_function_query_was_run(
@ -557,7 +557,7 @@ fn dependency_own_instance_member() -> anyhow::Result<()> {
let events = {
db.clear_salsa_events();
let attr_ty = global_symbol(&db, file_main, "x").place.expect_type();
assert_eq!(attr_ty.display(&db).to_string(), "Unknown | str | None");
assert_eq!(attr_ty.display(&db).to_string(), "str | None");
db.take_salsa_events()
};
@ -609,7 +609,7 @@ fn dependency_implicit_class_member() -> anyhow::Result<()> {
let file_main = system_path_to_file(&db, "/src/main.py").unwrap();
let attr_ty = global_symbol(&db, file_main, "x").place.expect_type();
assert_eq!(attr_ty.display(&db).to_string(), "Unknown | int");
assert_eq!(attr_ty.display(&db).to_string(), "int");
// Change the type of `class_attr` to `str`; this should trigger the type of `x` to be re-inferred
db.write_dedented(
@ -628,7 +628,7 @@ fn dependency_implicit_class_member() -> anyhow::Result<()> {
let events = {
db.clear_salsa_events();
let attr_ty = global_symbol(&db, file_main, "x").place.expect_type();
assert_eq!(attr_ty.display(&db).to_string(), "Unknown | str");
assert_eq!(attr_ty.display(&db).to_string(), "str");
db.take_salsa_events()
};
assert_function_query_was_run(
@ -656,7 +656,7 @@ fn dependency_implicit_class_member() -> anyhow::Result<()> {
let events = {
db.clear_salsa_events();
let attr_ty = global_symbol(&db, file_main, "x").place.expect_type();
assert_eq!(attr_ty.display(&db).to_string(), "Unknown | str");
assert_eq!(attr_ty.display(&db).to_string(), "str");
db.take_salsa_events()
};
@ -695,10 +695,7 @@ fn call_type_doesnt_rerun_when_only_callee_changed() -> anyhow::Result<()> {
let bar = system_path_to_file(&db, "src/bar.py")?;
let a = global_symbol(&db, bar, "a").place;
assert_eq!(
a.expect_type(),
UnionType::from_elements(&db, [Type::unknown(), KnownClass::Int.to_instance(&db)])
);
assert_eq!(a.expect_type(), KnownClass::Int.to_instance(&db));
let events = db.take_salsa_events();
let module = parsed_module(&db, bar).load(&db);
@ -726,10 +723,7 @@ fn call_type_doesnt_rerun_when_only_callee_changed() -> anyhow::Result<()> {
let a = global_symbol(&db, bar, "a").place;
assert_eq!(
a.expect_type(),
UnionType::from_elements(&db, [Type::unknown(), KnownClass::Int.to_instance(&db)])
);
assert_eq!(a.expect_type(), KnownClass::Int.to_instance(&db));
let events = db.take_salsa_events();
let module = parsed_module(&db, bar).load(&db);