ruff/crates/ty_python_semantic/src/semantic_index/scope.rs
Carl Meyer 5a570c8e6d
[ty] fix deferred name loading in PEP695 generic classes/functions (#19888)
## Summary

For PEP 695 generic functions and classes, there is an extra "type
params scope" (a child of the outer scope, and wrapping the body scope)
in which the type parameters are defined; class bases and function
parameter/return annotations are resolved in that type-params scope.
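
As a rough illustration of that nesting, here is a minimal sketch of the three scopes created for something like `class C[T](Base): ...`. The `Kind` and `Scope` types and both helper functions are simplified, hypothetical stand-ins, not the real `ScopeKind`/`Scope` from this file:

```rust
/// Simplified, hypothetical stand-in for the scope kinds involved.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Module,
    TypeParams,
    Class,
}

/// A scope that only records its parent (by index) and its kind.
struct Scope {
    parent: Option<usize>,
    kind: Kind,
}

/// Scopes for `class C[T](Base): ...` at module level:
/// module -> type-params scope -> class body scope.
fn scopes_for_generic_class() -> Vec<Scope> {
    vec![
        Scope { parent: None, kind: Kind::Module },
        Scope { parent: Some(0), kind: Kind::TypeParams },
        Scope { parent: Some(1), kind: Kind::Class },
    ]
}

/// Walks from the scope `id` up to the root, collecting scope kinds.
fn ancestors(scopes: &[Scope], mut id: usize) -> Vec<Kind> {
    let mut chain = vec![scopes[id].kind];
    while let Some(parent) = scopes[id].parent {
        chain.push(scopes[parent].kind);
        id = parent;
    }
    chain
}
```

Calling `ancestors(&scopes_for_generic_class(), 2)` walks from the class body through the type-params scope to the module, which is the chain a name load from the class body would follow.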

This PR fixes some longstanding bugs in how we resolve name loads from
inside these PEP 695 type parameter scopes, and also defers type
inference of PEP 695 typevar bounds/constraints/default, so we can
handle cycles without panicking.

We were previously treating these type-param scopes as lazy nested
scopes, which is wrong. In fact they are eager nested scopes; the class
`C` here inherits `int`, not `str`, and previously we got that wrong:

```py
Base = int

class C[T](Base): ...

Base = str
```

But certain syntactic positions within type param scopes (typevar
bounds/constraints/defaults) are lazy at runtime, and we should use
deferred name resolution for them. This also means they can have cycles;
in order to handle that without panicking in type inference, we need to
actually defer their type inference until after we have constructed the
`TypeVarInstance`.
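
The distinction can be sketched roughly as follows. `scope_laziness` mirrors the `ScopeKind::laziness` mapping in this file; the `Position` enum and `name_load_laziness` are hypothetical names used only for illustration and are not how the PR actually tracks deferred positions:

```rust
/// Simplified copy of the scope kinds in this file (a sketch, not the real type).
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum ScopeKind {
    Module,
    TypeParams,
    Class,
    Function,
    Lambda,
    Comprehension,
    TypeAlias,
}

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum Laziness {
    Eager,
    Lazy,
}

/// Hypothetical: where inside a scope a name load occurs.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum Position {
    /// A typevar bound, constraint, or default.
    TypeVarBoundConstraintOrDefault,
    /// Anything else (class bases, parameter/return annotations, ...).
    Other,
}

/// Laziness of the scope as a whole: type-params scopes are eager.
fn scope_laziness(kind: ScopeKind) -> Laziness {
    match kind {
        ScopeKind::Module
        | ScopeKind::Class
        | ScopeKind::Comprehension
        | ScopeKind::TypeParams => Laziness::Eager,
        ScopeKind::Function | ScopeKind::Lambda | ScopeKind::TypeAlias => Laziness::Lazy,
    }
}

/// Laziness of one name load: bounds/constraints/defaults are the lazy
/// exception inside an otherwise eager type-params scope.
fn name_load_laziness(kind: ScopeKind, position: Position) -> Laziness {
    match (kind, position) {
        (ScopeKind::TypeParams, Position::TypeVarBoundConstraintOrDefault) => Laziness::Lazy,
        _ => scope_laziness(kind),
    }
}
```

Under this model the `Base` in `class C[T](Base)` is an eager load, while the `Base` in `class C[T: Base]` would be deferred.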

PEP 695 does specify that typevar bounds and constraints cannot be
generic, and that typevar defaults can only reference prior typevars,
not later ones. This somewhat narrows the set of cycles that are valid
from the type-system perspective, although cycles are still possible
(e.g. `class C[T: list[C]]`). And this is a type-system-only restriction;
from the runtime perspective an "invalid" case like `class C[T: T]`
actually works fine.
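
To show why constructing the typevar first makes such cycles harmless, here is a minimal two-phase sketch. The `Arena`, `Ty`, and this simplified `TypeVarInstance` are hypothetical stand-ins built on `std::cell::OnceCell`; they only illustrate the idea of deferring bound inference and are not the salsa-based implementation in ty:

```rust
use std::cell::OnceCell;

/// Tiny hypothetical type model: typevars are referenced by arena index.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Ty {
    TypeVar(usize),
    List(Box<Ty>),
}

/// A typevar whose bound is filled in lazily, after construction.
struct TypeVarInstance {
    name: &'static str,
    bound: OnceCell<Option<Ty>>,
}

struct Arena {
    typevars: Vec<TypeVarInstance>,
}

impl Arena {
    /// Phase 1: create the typevar without looking at its bound.
    fn declare(&mut self, name: &'static str) -> usize {
        self.typevars.push(TypeVarInstance { name, bound: OnceCell::new() });
        self.typevars.len() - 1
    }

    /// Phase 2: infer the bound on first access. The typevar already
    /// exists, so the bound may refer back to it by id.
    fn bound(&self, id: usize, infer: impl FnOnce(&Arena) -> Option<Ty>) -> &Option<Ty> {
        self.typevars[id].bound.get_or_init(|| infer(self))
    }
}

/// Models a self-referential bound such as `T: list[T]`.
fn cyclic_bound_demo() -> (String, Option<Ty>) {
    let mut arena = Arena { typevars: Vec::new() };
    let t = arena.declare("T");
    let bound = arena
        .bound(t, |_| Some(Ty::List(Box::new(Ty::TypeVar(t)))))
        .clone();
    (arena.typevars[t].name.to_string(), bound)
}
```

The bound here only mentions the typevar by id; it never needs the bound of `T` while computing the bound of `T`, which is what keeps the deferred step from recursing.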

I debated whether to implement the PEP 695 restrictions as a way to
avoid some cycles up-front, but I ended up deciding against that; I'd
rather model the runtime name-resolution semantics accurately, and
implement the PEP 695 restrictions as a separate diagnostic on top.
(This PR doesn't yet implement those diagnostics, thus some `# TODO:
error` in the added tests.)

Introducing the possibility of cyclic typevars meant that typevar display
could overflow the stack. For now I've handled this by simply removing
typevar details (bounds/constraints/default) from typevar display. This
impacts display of two kinds of types. If you `reveal_type(T)` on an
unbound `T` you now get just `typing.TypeVar` instead of
`typing.TypeVar("T", ...)` where `...` is the bound/constraints/default.
This matches pyright and mypy; pyrefly uses `type[TypeVar[T]]` which
seems a bit confusing, but does include the name. (We could easily
include the name without cycle issues, if there's a syntax we like for
that.)

It also means that a generic function type like `def f[T:
int](x: T) -> T: ...` now displays as `f[T](x: T) -> T` instead of `f[T:
int](x: T) -> T`. This matches pyright and pyrefly; mypy does include
bounds/constraints/defaults of typevars in function/callable type
display. If we wanted to add this, we would either need to thread a
visitor through all the type display code, or add a `decycle` type
transformation that replaces each recursive recurrence of a type with a
marker.
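
For illustration only, the visitor option might look roughly like the sketch below: a set of typevars currently being displayed is threaded through the display function, and a recursive occurrence is printed by name alone. All names here (`Ty`, `TypeVar`, `display_ty`, `display`) are hypothetical and much simpler than ty's real display code:

```rust
use std::collections::HashSet;

/// Tiny hypothetical type model: typevars are referenced by index.
#[derive(Debug, Clone)]
enum Ty {
    Int,
    List(Box<Ty>),
    TypeVar(usize),
}

struct TypeVar {
    name: &'static str,
    bound: Option<Ty>,
}

/// Displays `ty`, expanding each typevar's bound at most once per path.
fn display_ty(ty: &Ty, typevars: &[TypeVar], visiting: &mut HashSet<usize>) -> String {
    match ty {
        Ty::Int => "int".to_string(),
        Ty::List(inner) => format!("list[{}]", display_ty(inner, typevars, visiting)),
        Ty::TypeVar(id) => {
            let typevar = &typevars[*id];
            // `insert` returns false if the id is already being displayed,
            // i.e. this is a recursive occurrence: print the bare name.
            if !visiting.insert(*id) {
                return typevar.name.to_string();
            }
            let out = match &typevar.bound {
                Some(bound) => {
                    format!("{}: {}", typevar.name, display_ty(bound, typevars, visiting))
                }
                None => typevar.name.to_string(),
            };
            visiting.remove(id);
            out
        }
    }
}

/// Entry point that starts with an empty "currently visiting" set.
fn display(ty: &Ty, typevars: &[TypeVar]) -> String {
    display_ty(ty, typevars, &mut HashSet::new())
}
```

With a typevar `T` bounded by `list[T]`, `display` terminates and yields `T: list[T]` instead of recursing forever; the cost is the extra `visiting` parameter on every display function, which is the threading burden mentioned above.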

## Test Plan

Added mdtests and modified existing tests to improve their correctness.

After this PR, there's only a single remaining py-fuzzer seed in the
0-500 range that panics! (Before this PR, there were 10; the fuzzer
likes to generate cyclic PEP 695 syntax.)

## Ecosystem report

It's all just the changes to `TypeVar` display.
2025-08-13 15:51:59 -07:00


use std::ops::Range;
use ruff_db::{files::File, parsed::ParsedModuleRef};
use ruff_index::newtype_index;
use ruff_python_ast as ast;
use crate::{
Db,
ast_node_ref::AstNodeRef,
node_key::NodeKey,
semantic_index::{
SemanticIndex, reachability_constraints::ScopedReachabilityConstraintId, semantic_index,
},
};
/// A cross-module identifier of a scope that can be used as a salsa query parameter.
#[salsa::tracked(debug, heap_size=ruff_memory_usage::heap_size)]
pub struct ScopeId<'db> {
pub file: File,
pub file_scope_id: FileScopeId,
}
// The Salsa heap is tracked separately.
impl get_size2::GetSize for ScopeId<'_> {}
impl<'db> ScopeId<'db> {
pub(crate) fn is_function_like(self, db: &'db dyn Db) -> bool {
self.node(db).scope_kind().is_function_like()
}
pub(crate) fn is_annotation(self, db: &'db dyn Db) -> bool {
self.node(db).scope_kind().is_annotation()
}
pub(crate) fn node(self, db: &dyn Db) -> &NodeWithScopeKind {
self.scope(db).node()
}
pub(crate) fn scope(self, db: &dyn Db) -> &Scope {
semantic_index(db, self.file(db)).scope(self.file_scope_id(db))
}
#[cfg(test)]
pub(crate) fn name<'ast>(self, db: &'db dyn Db, module: &'ast ParsedModuleRef) -> &'ast str {
match self.node(db) {
NodeWithScopeKind::Module => "<module>",
NodeWithScopeKind::Class(class) | NodeWithScopeKind::ClassTypeParameters(class) => {
class.node(module).name.as_str()
}
NodeWithScopeKind::Function(function)
| NodeWithScopeKind::FunctionTypeParameters(function) => {
function.node(module).name.as_str()
}
NodeWithScopeKind::TypeAlias(type_alias)
| NodeWithScopeKind::TypeAliasTypeParameters(type_alias) => type_alias
.node(module)
.name
.as_name_expr()
.map(|name| name.id.as_str())
.unwrap_or("<type alias>"),
NodeWithScopeKind::Lambda(_) => "<lambda>",
NodeWithScopeKind::ListComprehension(_) => "<listcomp>",
NodeWithScopeKind::SetComprehension(_) => "<setcomp>",
NodeWithScopeKind::DictComprehension(_) => "<dictcomp>",
NodeWithScopeKind::GeneratorExpression(_) => "<generator>",
}
}
}
/// ID that uniquely identifies a scope inside of a module.
#[newtype_index]
#[derive(salsa::Update, get_size2::GetSize)]
pub struct FileScopeId;
impl FileScopeId {
/// Returns the scope id of the module-global scope.
pub fn global() -> Self {
FileScopeId::from_u32(0)
}
pub fn is_global(self) -> bool {
self == FileScopeId::global()
}
pub fn to_scope_id(self, db: &dyn Db, file: File) -> ScopeId<'_> {
let index = semantic_index(db, file);
index.scope_ids_by_scope[self]
}
pub(crate) fn is_generator_function(self, index: &SemanticIndex) -> bool {
index.generator_functions.contains(&self)
}
}
#[derive(Debug, salsa::Update, get_size2::GetSize)]
pub(crate) struct Scope {
/// The parent scope, if any.
parent: Option<FileScopeId>,
/// The node that introduces this scope.
node: NodeWithScopeKind,
/// The range of [`FileScopeId`]s that are descendants of this scope.
descendants: Range<FileScopeId>,
/// The constraint that determines the reachability of this scope.
reachability: ScopedReachabilityConstraintId,
/// Whether this scope is defined inside an `if TYPE_CHECKING:` block.
in_type_checking_block: bool,
}
impl Scope {
pub(super) fn new(
parent: Option<FileScopeId>,
node: NodeWithScopeKind,
descendants: Range<FileScopeId>,
reachability: ScopedReachabilityConstraintId,
in_type_checking_block: bool,
) -> Self {
Scope {
parent,
node,
descendants,
reachability,
in_type_checking_block,
}
}
pub(crate) fn parent(&self) -> Option<FileScopeId> {
self.parent
}
pub(crate) fn node(&self) -> &NodeWithScopeKind {
&self.node
}
pub(crate) fn kind(&self) -> ScopeKind {
self.node().scope_kind()
}
pub(crate) fn visibility(&self) -> ScopeVisibility {
self.kind().visibility()
}
pub(crate) fn descendants(&self) -> Range<FileScopeId> {
self.descendants.clone()
}
pub(super) fn extend_descendants(&mut self, children_end: FileScopeId) {
self.descendants = self.descendants.start..children_end;
}
pub(crate) fn is_eager(&self) -> bool {
self.kind().is_eager()
}
pub(crate) fn reachability(&self) -> ScopedReachabilityConstraintId {
self.reachability
}
pub(crate) fn in_type_checking_block(&self) -> bool {
self.in_type_checking_block
}
}
#[derive(Debug, PartialEq, Eq, Clone, Copy, Hash, get_size2::GetSize)]
pub(crate) enum ScopeVisibility {
/// The scope is private (e.g. function, type alias, comprehension scope).
Private,
/// The scope is public (e.g. module, class scope).
Public,
}
impl ScopeVisibility {
pub(crate) const fn is_public(self) -> bool {
matches!(self, ScopeVisibility::Public)
}
pub(crate) const fn is_private(self) -> bool {
matches!(self, ScopeVisibility::Private)
}
}
#[derive(Debug, PartialEq, Eq, Clone, Copy, Hash, get_size2::GetSize)]
pub(crate) enum ScopeLaziness {
/// The scope is evaluated lazily (e.g. function, type alias scope).
Lazy,
/// The scope is evaluated eagerly (e.g. module, class, comprehension scope).
Eager,
}
impl ScopeLaziness {
pub(crate) const fn is_eager(self) -> bool {
matches!(self, ScopeLaziness::Eager)
}
}
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub(crate) enum ScopeKind {
Module,
TypeParams,
Class,
Function,
Lambda,
Comprehension,
TypeAlias,
}
impl ScopeKind {
pub(crate) const fn is_eager(self) -> bool {
self.laziness().is_eager()
}
pub(crate) const fn laziness(self) -> ScopeLaziness {
match self {
ScopeKind::Module
| ScopeKind::Class
| ScopeKind::Comprehension
| ScopeKind::TypeParams => ScopeLaziness::Eager,
ScopeKind::Function | ScopeKind::Lambda | ScopeKind::TypeAlias => ScopeLaziness::Lazy,
}
}
pub(crate) const fn visibility(self) -> ScopeVisibility {
match self {
ScopeKind::Module | ScopeKind::Class => ScopeVisibility::Public,
ScopeKind::TypeParams
| ScopeKind::TypeAlias
| ScopeKind::Function
| ScopeKind::Lambda
| ScopeKind::Comprehension => ScopeVisibility::Private,
}
}
pub(crate) const fn is_function_like(self) -> bool {
// Type parameter scopes behave like function scopes in terms of name resolution; CPython's
// symbol table also uses the term "function-like" for these scopes.
matches!(
self,
ScopeKind::TypeParams
| ScopeKind::Function
| ScopeKind::Lambda
| ScopeKind::TypeAlias
| ScopeKind::Comprehension
)
}
pub(crate) const fn is_class(self) -> bool {
matches!(self, ScopeKind::Class)
}
pub(crate) const fn is_module(self) -> bool {
matches!(self, ScopeKind::Module)
}
pub(crate) const fn is_annotation(self) -> bool {
matches!(self, ScopeKind::TypeParams | ScopeKind::TypeAlias)
}
pub(crate) const fn is_non_lambda_function(self) -> bool {
matches!(self, ScopeKind::Function)
}
}
/// Reference to a node that introduces a new scope.
#[derive(Copy, Clone, Debug)]
pub(crate) enum NodeWithScopeRef<'a> {
Module,
Class(&'a ast::StmtClassDef),
Function(&'a ast::StmtFunctionDef),
Lambda(&'a ast::ExprLambda),
FunctionTypeParameters(&'a ast::StmtFunctionDef),
ClassTypeParameters(&'a ast::StmtClassDef),
TypeAlias(&'a ast::StmtTypeAlias),
TypeAliasTypeParameters(&'a ast::StmtTypeAlias),
ListComprehension(&'a ast::ExprListComp),
SetComprehension(&'a ast::ExprSetComp),
DictComprehension(&'a ast::ExprDictComp),
GeneratorExpression(&'a ast::ExprGenerator),
}
impl NodeWithScopeRef<'_> {
/// Converts the unowned reference to an owned [`NodeWithScopeKind`].
///
/// Note that the node wrapped by `self` must be a child of `module`.
pub(super) fn to_kind(self, module: &ParsedModuleRef) -> NodeWithScopeKind {
match self {
NodeWithScopeRef::Module => NodeWithScopeKind::Module,
NodeWithScopeRef::Class(class) => {
NodeWithScopeKind::Class(AstNodeRef::new(module, class))
}
NodeWithScopeRef::Function(function) => {
NodeWithScopeKind::Function(AstNodeRef::new(module, function))
}
NodeWithScopeRef::TypeAlias(type_alias) => {
NodeWithScopeKind::TypeAlias(AstNodeRef::new(module, type_alias))
}
NodeWithScopeRef::TypeAliasTypeParameters(type_alias) => {
NodeWithScopeKind::TypeAliasTypeParameters(AstNodeRef::new(module, type_alias))
}
NodeWithScopeRef::Lambda(lambda) => {
NodeWithScopeKind::Lambda(AstNodeRef::new(module, lambda))
}
NodeWithScopeRef::FunctionTypeParameters(function) => {
NodeWithScopeKind::FunctionTypeParameters(AstNodeRef::new(module, function))
}
NodeWithScopeRef::ClassTypeParameters(class) => {
NodeWithScopeKind::ClassTypeParameters(AstNodeRef::new(module, class))
}
NodeWithScopeRef::ListComprehension(comprehension) => {
NodeWithScopeKind::ListComprehension(AstNodeRef::new(module, comprehension))
}
NodeWithScopeRef::SetComprehension(comprehension) => {
NodeWithScopeKind::SetComprehension(AstNodeRef::new(module, comprehension))
}
NodeWithScopeRef::DictComprehension(comprehension) => {
NodeWithScopeKind::DictComprehension(AstNodeRef::new(module, comprehension))
}
NodeWithScopeRef::GeneratorExpression(generator) => {
NodeWithScopeKind::GeneratorExpression(AstNodeRef::new(module, generator))
}
}
}
pub(crate) fn node_key(self) -> NodeWithScopeKey {
match self {
NodeWithScopeRef::Module => NodeWithScopeKey::Module,
NodeWithScopeRef::Class(class) => NodeWithScopeKey::Class(NodeKey::from_node(class)),
NodeWithScopeRef::Function(function) => {
NodeWithScopeKey::Function(NodeKey::from_node(function))
}
NodeWithScopeRef::Lambda(lambda) => {
NodeWithScopeKey::Lambda(NodeKey::from_node(lambda))
}
NodeWithScopeRef::FunctionTypeParameters(function) => {
NodeWithScopeKey::FunctionTypeParameters(NodeKey::from_node(function))
}
NodeWithScopeRef::ClassTypeParameters(class) => {
NodeWithScopeKey::ClassTypeParameters(NodeKey::from_node(class))
}
NodeWithScopeRef::TypeAlias(type_alias) => {
NodeWithScopeKey::TypeAlias(NodeKey::from_node(type_alias))
}
NodeWithScopeRef::TypeAliasTypeParameters(type_alias) => {
NodeWithScopeKey::TypeAliasTypeParameters(NodeKey::from_node(type_alias))
}
NodeWithScopeRef::ListComprehension(comprehension) => {
NodeWithScopeKey::ListComprehension(NodeKey::from_node(comprehension))
}
NodeWithScopeRef::SetComprehension(comprehension) => {
NodeWithScopeKey::SetComprehension(NodeKey::from_node(comprehension))
}
NodeWithScopeRef::DictComprehension(comprehension) => {
NodeWithScopeKey::DictComprehension(NodeKey::from_node(comprehension))
}
NodeWithScopeRef::GeneratorExpression(generator) => {
NodeWithScopeKey::GeneratorExpression(NodeKey::from_node(generator))
}
}
}
}
/// Node that introduces a new scope.
#[derive(Clone, Debug, salsa::Update, get_size2::GetSize)]
pub(crate) enum NodeWithScopeKind {
Module,
Class(AstNodeRef<ast::StmtClassDef>),
ClassTypeParameters(AstNodeRef<ast::StmtClassDef>),
Function(AstNodeRef<ast::StmtFunctionDef>),
FunctionTypeParameters(AstNodeRef<ast::StmtFunctionDef>),
TypeAliasTypeParameters(AstNodeRef<ast::StmtTypeAlias>),
TypeAlias(AstNodeRef<ast::StmtTypeAlias>),
Lambda(AstNodeRef<ast::ExprLambda>),
ListComprehension(AstNodeRef<ast::ExprListComp>),
SetComprehension(AstNodeRef<ast::ExprSetComp>),
DictComprehension(AstNodeRef<ast::ExprDictComp>),
GeneratorExpression(AstNodeRef<ast::ExprGenerator>),
}
impl NodeWithScopeKind {
pub(crate) const fn scope_kind(&self) -> ScopeKind {
match self {
Self::Module => ScopeKind::Module,
Self::Class(_) => ScopeKind::Class,
Self::Function(_) => ScopeKind::Function,
Self::Lambda(_) => ScopeKind::Lambda,
Self::FunctionTypeParameters(_)
| Self::ClassTypeParameters(_)
| Self::TypeAliasTypeParameters(_) => ScopeKind::TypeParams,
Self::TypeAlias(_) => ScopeKind::TypeAlias,
Self::ListComprehension(_)
| Self::SetComprehension(_)
| Self::DictComprehension(_)
| Self::GeneratorExpression(_) => ScopeKind::Comprehension,
}
}
pub(crate) fn expect_class<'ast>(
&self,
module: &'ast ParsedModuleRef,
) -> &'ast ast::StmtClassDef {
match self {
Self::Class(class) => class.node(module),
_ => panic!("expected class"),
}
}
pub(crate) fn as_class<'ast>(
&self,
module: &'ast ParsedModuleRef,
) -> Option<&'ast ast::StmtClassDef> {
match self {
Self::Class(class) => Some(class.node(module)),
_ => None,
}
}
pub(crate) fn expect_function<'ast>(
&self,
module: &'ast ParsedModuleRef,
) -> &'ast ast::StmtFunctionDef {
self.as_function(module).expect("expected function")
}
pub(crate) fn expect_type_alias<'ast>(
&self,
module: &'ast ParsedModuleRef,
) -> &'ast ast::StmtTypeAlias {
match self {
Self::TypeAlias(type_alias) => type_alias.node(module),
_ => panic!("expected type alias"),
}
}
pub(crate) fn as_function<'ast>(
&self,
module: &'ast ParsedModuleRef,
) -> Option<&'ast ast::StmtFunctionDef> {
match self {
Self::Function(function) => Some(function.node(module)),
_ => None,
}
}
}
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash, get_size2::GetSize)]
pub(crate) enum NodeWithScopeKey {
Module,
Class(NodeKey),
ClassTypeParameters(NodeKey),
Function(NodeKey),
FunctionTypeParameters(NodeKey),
TypeAlias(NodeKey),
TypeAliasTypeParameters(NodeKey),
Lambda(NodeKey),
ListComprehension(NodeKey),
SetComprehension(NodeKey),
DictComprehension(NodeKey),
GeneratorExpression(NodeKey),
}