[ty] Reachability constraints (#18621)

## Summary



* Completely removes the concept of visibility constraints. Reachability
constraints are now used to model the static visibility of bindings and
declarations. Reachability constraints are *much* easier to reason about
and work with, since they are applied at the beginning of a branch, and
not applied retroactively. Removing the duplication between visibility
and reachability constraints also leads to major code simplifications
[^1]. For an overview of how the new constraint system works, see the
updated doc comment in `reachability_constraints.rs`.
* Fixes a [control-flow modeling bug
(panic)](https://github.com/astral-sh/ty/issues/365) involving `break`
statements in loops
* Fixes a [bug](https://github.com/astral-sh/ty/issues/624) where
`elif` branches would have wrong reachability constraints
* Fixes a [bug](https://github.com/astral-sh/ty/issues/648) where code
after infinite loops would not be considered unreachable
* Fixes a panic on the `pywin32` ecosystem project, which we should be
able to move to `good.txt` once this has been merged.
* Removes some false positives in unreachable code because we infer
`Never` more often, due to the fact that reachability constraints now
apply retroactively to *all* active bindings, not just to bindings
inside a branch.
  * As one example, this removes the `division-by-zero` diagnostic from
  https://github.com/astral-sh/ty/issues/443 because we now infer `Never`
  for the divisor.
* Supersedes https://github.com/astral-sh/ruff/pull/18392 and includes
similar test changes
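
To make the "applied at the beginning of a branch" idea concrete, here is a minimal, self-contained sketch of how an `if`/`elif`/`else` chain could be modeled with three-valued constraints. It is a toy illustration only, not the data structure used in `reachability_constraints.rs`: the names `Truthiness`, `Constraint`, and `branch_reachability` are hypothetical and chosen for this example. Each branch gets its constraint when it is entered, built from the negation of all earlier tests plus its own test.

```rust
/// Result of evaluating a condition that may not be statically known.
/// (Hypothetical type for this sketch.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Truthiness {
    AlwaysTrue,
    AlwaysFalse,
    Ambiguous,
}

impl Truthiness {
    /// Three-valued negation: `Ambiguous` stays `Ambiguous`.
    fn negate(self) -> Truthiness {
        match self {
            Truthiness::AlwaysTrue => Truthiness::AlwaysFalse,
            Truthiness::AlwaysFalse => Truthiness::AlwaysTrue,
            Truthiness::Ambiguous => Truthiness::Ambiguous,
        }
    }

    /// Three-valued conjunction: a definite `AlwaysFalse` on either side wins.
    fn and(self, other: Truthiness) -> Truthiness {
        match (self, other) {
            (Truthiness::AlwaysFalse, _) | (_, Truthiness::AlwaysFalse) => Truthiness::AlwaysFalse,
            (Truthiness::AlwaysTrue, Truthiness::AlwaysTrue) => Truthiness::AlwaysTrue,
            _ => Truthiness::Ambiguous,
        }
    }
}

/// A toy constraint tree. `Atom(i)` refers to the i-th test expression.
#[derive(Clone, Debug)]
enum Constraint {
    AlwaysTrue,
    Atom(usize),
    Not(Box<Constraint>),
    And(Box<Constraint>, Box<Constraint>),
}

impl Constraint {
    fn and(self, other: Constraint) -> Constraint {
        Constraint::And(Box::new(self), Box::new(other))
    }

    fn negated(self) -> Constraint {
        Constraint::Not(Box::new(self))
    }

    /// Evaluate the constraint given the (possibly ambiguous) truthiness of each test.
    fn evaluate(&self, predicates: &[Truthiness]) -> Truthiness {
        match self {
            Constraint::AlwaysTrue => Truthiness::AlwaysTrue,
            Constraint::Atom(i) => predicates[*i],
            Constraint::Not(inner) => inner.evaluate(predicates).negate(),
            Constraint::And(l, r) => l.evaluate(predicates).and(r.evaluate(predicates)),
        }
    }
}

/// Reachability of every branch of an `if`/`elif`/.../`else` chain.
/// `predicates[i]` is the truthiness of the i-th test; the last entry of the
/// result is the `else` branch (or the fall-through path if there is none).
fn branch_reachability(predicates: &[Truthiness]) -> Vec<Truthiness> {
    // Constraint describing "none of the previous branches was taken".
    let mut no_branch_taken = Constraint::AlwaysTrue;
    let mut result = Vec::new();
    for i in 0..predicates.len() {
        // Applied at the *start* of the branch: earlier tests failed, this one passed.
        let branch = no_branch_taken.clone().and(Constraint::Atom(i));
        result.push(branch.evaluate(predicates));
        // Later branches additionally require this test to have failed.
        no_branch_taken = no_branch_taken.and(Constraint::Atom(i).negated());
    }
    result.push(no_branch_taken.evaluate(predicates));
    result
}
```

For example, with tests `[AlwaysTrue, Ambiguous]` (think `if True: ... elif cond: ... else: ...`), the first branch evaluates to `AlwaysTrue` and both the `elif` and `else` branches evaluate to `AlwaysFalse`, so anything bound only inside them would be statically unreachable in this toy model.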


closes https://github.com/astral-sh/ty/issues/365
closes https://github.com/astral-sh/ty/issues/624
closes https://github.com/astral-sh/ty/issues/642
closes https://github.com/astral-sh/ty/issues/648

## Benchmarks

Benchmarks on black, pandas, and sympy showed that this is neither a
performance improvement nor a regression.

## Test Plan

Regression tests for:
- [x] https://github.com/astral-sh/ty/issues/365
- [x] https://github.com/astral-sh/ty/issues/624
- [x] https://github.com/astral-sh/ty/issues/642
- [x] https://github.com/astral-sh/ty/issues/648

[^1]: I'm afraid this is something that @carljm advocated for since the
beginning, and I'm not sure anymore why we have never seriously tried
this before. So I suggest we do *not* attempt to do a historical deep
dive to find out exactly why this ever became so complicated, and just
enjoy the fact that we eventually arrived here.

---------

Co-authored-by: Carl Meyer <carl@astral.sh>
David Peter 2025-06-17 09:24:28 +02:00 committed by GitHub
parent c22f809049
commit 3a77768f79
18 changed files with 683 additions and 806 deletions


@ -40,12 +40,12 @@ use crate::semantic_index::predicate::{
StarImportPlaceholderPredicate,
};
use crate::semantic_index::re_exports::exported_names;
use crate::semantic_index::reachability_constraints::{
ReachabilityConstraintsBuilder, ScopedReachabilityConstraintId,
};
use crate::semantic_index::use_def::{
EagerSnapshotKey, FlowSnapshot, ScopedEagerSnapshotId, UseDefMapBuilder,
};
use crate::semantic_index::visibility_constraints::{
ScopedVisibilityConstraintId, VisibilityConstraintsBuilder,
};
use crate::unpack::{Unpack, UnpackKind, UnpackPosition, UnpackValue};
use crate::{Db, Program};
@ -157,7 +157,7 @@ impl<'db, 'ast> SemanticIndexBuilder<'db, 'ast> {
builder.push_scope_with_parent(
NodeWithScopeRef::Module,
None,
ScopedVisibilityConstraintId::ALWAYS_TRUE,
ScopedReachabilityConstraintId::ALWAYS_TRUE,
);
builder
@ -237,7 +237,7 @@ impl<'db, 'ast> SemanticIndexBuilder<'db, 'ast> {
&mut self,
node: NodeWithScopeRef,
parent: Option<FileScopeId>,
reachability: ScopedVisibilityConstraintId,
reachability: ScopedReachabilityConstraintId,
) {
let children_start = self.scopes.next_index() + 1;
@ -354,9 +354,9 @@ impl<'db, 'ast> SemanticIndexBuilder<'db, 'ast> {
&self.use_def_maps[scope_id]
}
fn current_visibility_constraints_mut(&mut self) -> &mut VisibilityConstraintsBuilder {
fn current_reachability_constraints_mut(&mut self) -> &mut ReachabilityConstraintsBuilder {
let scope_id = self.current_scope();
&mut self.use_def_maps[scope_id].visibility_constraints
&mut self.use_def_maps[scope_id].reachability_constraints
}
fn current_ast_ids(&mut self) -> &mut AstIdsBuilder {
@ -576,55 +576,15 @@ impl<'db, 'ast> SemanticIndexBuilder<'db, 'ast> {
id
}
/// Records a previously added visibility constraint by applying it to all live bindings
/// and declarations.
fn record_visibility_constraint_id(&mut self, constraint: ScopedVisibilityConstraintId) {
self.current_use_def_map_mut()
.record_visibility_constraint(constraint);
}
/// Negates the given visibility constraint and then adds it to all live bindings and declarations.
fn record_negated_visibility_constraint(
&mut self,
constraint: ScopedVisibilityConstraintId,
) -> ScopedVisibilityConstraintId {
let id = self
.current_visibility_constraints_mut()
.add_not_constraint(constraint);
self.record_visibility_constraint_id(id);
id
}
/// Records a visibility constraint by applying it to all live bindings and declarations.
fn record_visibility_constraint(
&mut self,
predicate: Predicate<'db>,
) -> ScopedVisibilityConstraintId {
let predicate_id = self.current_use_def_map_mut().add_predicate(predicate);
let id = self
.current_visibility_constraints_mut()
.add_atom(predicate_id);
self.record_visibility_constraint_id(id);
id
}
/// Records that all remaining statements in the current block are unreachable, and therefore
/// not visible.
/// Records that all remaining statements in the current block are unreachable.
fn mark_unreachable(&mut self) {
self.current_use_def_map_mut().mark_unreachable();
}
/// Records a visibility constraint that always evaluates to "ambiguous".
fn record_ambiguous_visibility(&mut self) {
/// Records a reachability constraint that always evaluates to "ambiguous".
fn record_ambiguous_reachability(&mut self) {
self.current_use_def_map_mut()
.record_visibility_constraint(ScopedVisibilityConstraintId::AMBIGUOUS);
}
/// Simplifies (resets) visibility constraints on all live bindings and declarations that did
/// not see any new definitions since the given snapshot.
fn simplify_visibility_constraints(&mut self, snapshot: FlowSnapshot) {
self.current_use_def_map_mut()
.simplify_visibility_constraints(snapshot);
.record_reachability_constraint(ScopedReachabilityConstraintId::AMBIGUOUS);
}
/// Record a constraint that affects the reachability of the current position in the semantic
@ -634,7 +594,7 @@ impl<'db, 'ast> SemanticIndexBuilder<'db, 'ast> {
fn record_reachability_constraint(
&mut self,
predicate: Predicate<'db>,
) -> ScopedVisibilityConstraintId {
) -> ScopedReachabilityConstraintId {
let predicate_id = self.add_predicate(predicate);
self.record_reachability_constraint_id(predicate_id)
}
@ -643,22 +603,22 @@ impl<'db, 'ast> SemanticIndexBuilder<'db, 'ast> {
fn record_reachability_constraint_id(
&mut self,
predicate_id: ScopedPredicateId,
) -> ScopedVisibilityConstraintId {
let visibility_constraint = self
.current_visibility_constraints_mut()
) -> ScopedReachabilityConstraintId {
let reachability_constraint = self
.current_reachability_constraints_mut()
.add_atom(predicate_id);
self.current_use_def_map_mut()
.record_reachability_constraint(visibility_constraint);
visibility_constraint
.record_reachability_constraint(reachability_constraint);
reachability_constraint
}
/// Record the negation of a given reachability/visibility constraint.
/// Record the negation of a given reachability constraint.
fn record_negated_reachability_constraint(
&mut self,
reachability_constraint: ScopedVisibilityConstraintId,
reachability_constraint: ScopedReachabilityConstraintId,
) {
let negated_constraint = self
.current_visibility_constraints_mut()
.current_reachability_constraints_mut()
.add_not_constraint(reachability_constraint);
self.current_use_def_map_mut()
.record_reachability_constraint(negated_constraint);
@ -1139,12 +1099,10 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
let pre_return_state = matches!(last_stmt, ast::Stmt::Return(_))
.then(|| builder.flow_snapshot());
builder.visit_stmt(last_stmt);
let scope_start_visibility =
builder.current_use_def_map().scope_start_visibility;
let reachability = builder.current_use_def_map().reachability;
if let Some(pre_return_state) = pre_return_state {
builder.flow_restore(pre_return_state);
builder.current_use_def_map_mut().scope_start_visibility =
scope_start_visibility;
builder.current_use_def_map_mut().reachability = reachability;
}
}
@ -1297,11 +1255,11 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
continue;
};
// In order to understand the visibility of definitions created by a `*` import,
// we need to know the visibility of the global-scope definitions in the
// In order to understand the reachability of definitions created by a `*` import,
// we need to know the reachability of the global-scope definitions in the
// `referenced_module` the symbols imported from. Much like predicates for `if`
// statements can only have their visibility constraints resolved at type-inference
// time, the visibility of these global-scope definitions in the external module
// statements can only have their reachability constraints resolved at type-inference
// time, the reachability of these global-scope definitions in the external module
// cannot be resolved at this point. As such, we essentially model each definition
// stemming from a `from exporter *` import as something like:
//
@ -1328,7 +1286,7 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
self.current_use_def_map().single_place_snapshot(symbol_id);
self.push_additional_definition(symbol_id, node_ref);
self.current_use_def_map_mut()
.record_and_negate_star_import_visibility_constraint(
.record_and_negate_star_import_reachability_constraint(
star_import,
symbol_id,
pre_definition,
@ -1387,7 +1345,7 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
// reachability constraint on the `msg` expression.
//
// The other important part is the `<halt>`. This lets us skip the usual merging of
// flow states and simplification of visibility constraints, since there is no way
// flow states and simplification of reachability constraints, since there is no way
// of getting out of that `msg` branch. We simply restore to the post-test state.
self.visit_expr(test);
@ -1399,12 +1357,10 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
self.record_narrowing_constraint(negated_predicate);
self.record_reachability_constraint(negated_predicate);
self.visit_expr(msg);
self.record_visibility_constraint(negated_predicate);
self.flow_restore(post_test);
}
self.record_narrowing_constraint(predicate);
self.record_visibility_constraint(predicate);
self.record_reachability_constraint(predicate);
}
@ -1479,13 +1435,10 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
self.visit_expr(&node.test);
let mut no_branch_taken = self.flow_snapshot();
let mut last_predicate = self.record_expression_narrowing_constraint(&node.test);
let mut reachability_constraint =
let mut last_reachability_constraint =
self.record_reachability_constraint(last_predicate);
self.visit_body(&node.body);
let visibility_constraint_id = self.record_visibility_constraint(last_predicate);
let mut vis_constraints = vec![visibility_constraint_id];
let mut post_clauses: Vec<FlowSnapshot> = vec![];
let elif_else_clauses = node
.elif_else_clauses
@ -1509,38 +1462,27 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
// we can only take an elif/else branch if none of the previous ones were
// taken
self.flow_restore(no_branch_taken.clone());
self.record_negated_narrowing_constraint(last_predicate);
self.record_negated_reachability_constraint(reachability_constraint);
let elif_predicate = if let Some(elif_test) = clause_test {
self.record_negated_narrowing_constraint(last_predicate);
self.record_negated_reachability_constraint(last_reachability_constraint);
if let Some(elif_test) = clause_test {
self.visit_expr(elif_test);
// A test expression is evaluated whether the branch is taken or not
no_branch_taken = self.flow_snapshot();
reachability_constraint =
last_predicate = self.record_expression_narrowing_constraint(elif_test);
last_reachability_constraint =
self.record_reachability_constraint(last_predicate);
let predicate = self.record_expression_narrowing_constraint(elif_test);
Some(predicate)
} else {
None
};
}
self.visit_body(clause_body);
for id in &vis_constraints {
self.record_negated_visibility_constraint(*id);
}
if let Some(elif_predicate) = elif_predicate {
last_predicate = elif_predicate;
let id = self.record_visibility_constraint(elif_predicate);
vis_constraints.push(id);
}
}
for post_clause_state in post_clauses {
self.flow_merge(post_clause_state);
}
self.simplify_visibility_constraints(no_branch_taken);
}
ast::Stmt::While(ast::StmtWhile {
test,
@ -1555,57 +1497,49 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
let predicate = self.record_expression_narrowing_constraint(test);
self.record_reachability_constraint(predicate);
// We need multiple copies of the visibility constraint for the while condition,
// We need multiple copies of the reachability constraint for the while condition,
// since we need to model situations where the first evaluation of the condition
// returns True, but a later evaluation returns False.
let first_predicate_id = self.current_use_def_map_mut().add_predicate(predicate);
let later_predicate_id = self.current_use_def_map_mut().add_predicate(predicate);
let first_vis_constraint_id = self
.current_visibility_constraints_mut()
.add_atom(first_predicate_id);
let later_vis_constraint_id = self
.current_visibility_constraints_mut()
.current_reachability_constraints_mut()
.add_atom(later_predicate_id);
// If the body is executed, we know that we've evaluated the condition at least
// once, and that the first evaluation was True. We might not have evaluated the
// condition more than once, so we can't assume that later evaluations were True.
// So the body's full reachability constraint is `first`.
self.record_reachability_constraint_id(first_predicate_id);
let outer_loop = self.push_loop();
self.visit_body(body);
let this_loop = self.pop_loop(outer_loop);
// If the body is executed, we know that we've evaluated the condition at least
// once, and that the first evaluation was True. We might not have evaluated the
// condition more than once, so we can't assume that later evaluations were True.
// So the body's full visibility constraint is `first`.
let body_vis_constraint_id = first_vis_constraint_id;
self.record_visibility_constraint_id(body_vis_constraint_id);
// We execute the `else` once the condition evaluates to false. This could happen
// without ever executing the body, if the condition is false the first time it's
// tested. So the starting flow state of the `else` clause is the union of:
// - the pre-loop state with a visibility constraint that the first evaluation of
// - the pre-loop state with a reachability constraint that the first evaluation of
// the while condition was false,
// - the post-body state (which already has a visibility constraint that the
// first evaluation was true) with a visibility constraint that a _later_
// - the post-body state (which already has a reachability constraint that the
// first evaluation was true) with a reachability constraint that a _later_
// evaluation of the while condition was false.
// To model this correctly, we need two copies of the while condition constraint,
// since the first and later evaluations might produce different results.
let post_body = self.flow_snapshot();
self.flow_restore(pre_loop.clone());
self.record_negated_visibility_constraint(first_vis_constraint_id);
self.flow_restore(pre_loop);
self.flow_merge(post_body);
self.record_negated_narrowing_constraint(predicate);
self.record_negated_reachability_constraint(later_vis_constraint_id);
self.visit_body(orelse);
self.record_negated_visibility_constraint(later_vis_constraint_id);
// Breaking out of a while loop bypasses the `else` clause, so merge in the break
// states after visiting `else`.
for break_state in this_loop.break_states {
let snapshot = self.flow_snapshot();
self.flow_restore(break_state);
self.record_visibility_constraint_id(body_vis_constraint_id);
self.flow_merge(snapshot);
}
self.simplify_visibility_constraints(pre_loop);
}
ast::Stmt::With(ast::StmtWith {
items,
@ -1652,7 +1586,7 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
let iter_expr = self.add_standalone_expression(iter);
self.visit_expr(iter);
self.record_ambiguous_visibility();
self.record_ambiguous_reachability();
let pre_loop = self.flow_snapshot();
@ -1709,10 +1643,15 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
&case.pattern,
case.guard.as_deref(),
);
let vis_constraint_id = self.record_reachability_constraint(match_predicate);
let reachability_constraint =
self.record_reachability_constraint(match_predicate);
let match_success_guard_failure = case.guard.as_ref().map(|guard| {
let guard_expr = self.add_standalone_expression(guard);
// We could also add the guard expression as a reachability constraint, but
// it seems unlikely that both the case predicate as well as the guard are
// statically known conditions, so we currently don't model that.
self.record_ambiguous_reachability();
self.visit_expr(guard);
let post_guard_eval = self.flow_snapshot();
let predicate = Predicate {
@ -1726,8 +1665,6 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
match_success_guard_failure
});
self.record_visibility_constraint_id(vis_constraint_id);
self.visit_body(&case.body);
post_case_snapshots.push(self.flow_snapshot());
@ -1738,6 +1675,7 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
// snapshots into.
self.flow_restore(no_case_matched.clone());
self.record_negated_narrowing_constraint(match_predicate);
self.record_negated_reachability_constraint(reachability_constraint);
if let Some(match_success_guard_failure) = match_success_guard_failure {
self.flow_merge(match_success_guard_failure);
} else {
@ -1748,15 +1686,12 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
debug_assert!(case.guard.is_none());
}
self.record_negated_visibility_constraint(vis_constraint_id);
no_case_matched = self.flow_snapshot();
}
for post_clause_state in post_case_snapshots {
self.flow_merge(post_clause_state);
}
self.simplify_visibility_constraints(no_case_matched);
}
ast::Stmt::Try(ast::StmtTry {
body,
@ -1767,7 +1702,7 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
range: _,
node_index: _,
}) => {
self.record_ambiguous_visibility();
self.record_ambiguous_reachability();
// Save the state prior to visiting any of the `try` block.
//
@ -2136,16 +2071,13 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
let predicate = self.record_expression_narrowing_constraint(test);
let reachability_constraint = self.record_reachability_constraint(predicate);
self.visit_expr(body);
let visibility_constraint = self.record_visibility_constraint(predicate);
let post_body = self.flow_snapshot();
self.flow_restore(pre_if.clone());
self.flow_restore(pre_if);
self.record_negated_narrowing_constraint(predicate);
self.record_negated_reachability_constraint(reachability_constraint);
self.visit_expr(orelse);
self.record_negated_visibility_constraint(visibility_constraint);
self.flow_merge(post_body);
self.simplify_visibility_constraints(pre_if);
}
ast::Expr::ListComp(
list_comprehension @ ast::ExprListComp {
@ -2203,18 +2135,17 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
node_index: _,
op,
}) => {
let pre_op = self.flow_snapshot();
let mut snapshots = vec![];
let mut visibility_constraints = vec![];
let mut reachability_constraints = vec![];
for (index, value) in values.iter().enumerate() {
self.visit_expr(value);
for vid in &visibility_constraints {
self.record_visibility_constraint_id(*vid);
for id in &reachability_constraints {
self.current_use_def_map_mut()
.record_reachability_constraint(*id); // TODO: nicer API
}
self.visit_expr(value);
// For the last value, we don't need to model control flow. There is no short-circuiting
// anymore.
if index < values.len() - 1 {
@ -2223,37 +2154,32 @@ impl<'ast> Visitor<'ast> for SemanticIndexBuilder<'_, 'ast> {
ast::BoolOp::And => self.add_predicate(predicate),
ast::BoolOp::Or => self.add_negated_predicate(predicate),
};
let visibility_constraint = self
.current_visibility_constraints_mut()
let reachability_constraint = self
.current_reachability_constraints_mut()
.add_atom(predicate_id);
let after_expr = self.flow_snapshot();
// We first model the short-circuiting behavior. We take the short-circuit
// path here if all of the previous short-circuit paths were not taken, so
// we record all previously existing visibility constraints, and negate the
// we record all previously existing reachability constraints, and negate the
// one for the current expression.
for vid in &visibility_constraints {
self.record_visibility_constraint_id(*vid);
}
self.record_negated_visibility_constraint(visibility_constraint);
self.record_negated_reachability_constraint(reachability_constraint);
snapshots.push(self.flow_snapshot());
// Then we model the non-short-circuiting behavior. Here, we need to delay
// the application of the visibility constraint until after the expression
// the application of the reachability constraint until after the expression
// has been evaluated, so we only push it onto the stack here.
self.flow_restore(after_expr);
self.record_narrowing_constraint_id(predicate_id);
self.record_reachability_constraint_id(predicate_id);
visibility_constraints.push(visibility_constraint);
reachability_constraints.push(reachability_constraint);
}
}
for snapshot in snapshots {
self.flow_merge(snapshot);
}
self.simplify_visibility_constraints(pre_op);
}
ast::Expr::StringLiteral(_) => {
// Track reachability of string literals, as they could be a stringified annotation