Add BindingKind variants to represent deleted bindings (#5071)

## Summary

Our current mechanism for handling deletions (e.g., `del x`) is to
remove the symbol from the scope's `bindings` table. This "does the
right thing" in that a later reference to the deleted symbol correctly
resolves as unbound. But it causes a variety of problems, mostly because
it makes certain bindings and references unreachable after the fact.

Consider:

```python
x = 1
print(x)
del x
```

If we analyze this code _after_ running the semantic model over the AST,
we'll have no way of knowing that `x` was ever introduced in the scope,
much less that it was bound to a value, read, and then deleted --
because we effectively erased `x` from the model entirely when we hit
the deletion.
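
To make the problem concrete, here is a minimal sketch of a removal-based table. The function name `build_by_removal` and the `(op, name)` statement encoding are hypothetical, chosen only for illustration; this is not how the semantic model is actually implemented.

```python
# Toy model: a scope's `bindings` table maps each name to the kind of its binding.
# All names here are hypothetical; this is not Ruff's implementation.


def build_by_removal(statements):
    """Process (op, name) pairs, removing the table entry on `del`."""
    bindings = {}
    for op, name in statements:
        if op == "assign":
            bindings[name] = "Assignment"
        elif op == "del":
            bindings.pop(name, None)
        # A "read" leaves the table untouched in this toy model.
    return bindings


# Mirrors `x = 1; print(x); del x`.
program = [("assign", "x"), ("read", "x"), ("del", "x")]
table = build_by_removal(program)
print(table)  # {}
```

Once the whole program has been processed, the table is empty, so any analysis that runs afterwards cannot tell this program apart from one that never mentioned `x`.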

In practice, this will make it impossible for us to support local symbol
renames. It also means that certain rules that we want to move out of
the model-building phase and into the "check dead scopes" phase wouldn't
work today, since we'll have lost important information about the source
code.

This PR introduces two new `BindingKind` variants to model deletions:

- `BindingKind::Deletion`, which represents `x` after the `del` in `x = 1; del x`.
- `BindingKind::UnboundException`, which represents:

```python
try:
    1 / 0
except Exception as e:
    pass
```

In the latter case, `e` gets unbound after the exception handler
(assuming it's triggered), so we want to handle it similarly to a
deletion.
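
The runtime behavior that both variants model can be checked directly in Python. The sketch below uses two made-up function names; it shows only that reading the name afterwards fails the same way in both cases.

```python
def read_after_del() -> str:
    """Read a local after an explicit `del`."""
    x = 1
    del x
    try:
        return str(x)
    except NameError as err:
        # UnboundLocalError is a subclass of NameError.
        return type(err).__name__


def read_after_handler() -> str:
    """Read the `as` target after its exception handler has finished."""
    try:
        1 / 0
    except Exception as e:
        pass
    try:
        return repr(e)
    except NameError as err:
        return type(err).__name__


print(read_after_del())      # UnboundLocalError
print(read_after_handler())  # UnboundLocalError
```

In both functions the name is local to the function but no longer bound to a value, which is why the two cases can share the same "unbound" treatment.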

The main challenge here is auditing all of our existing `Binding` and
`Scope` usages to understand whether they need to accommodate deletions
or otherwise behave differently. If you look one commit back on this
branch, you'll see that the code is littered with `NOTE(charlie)`
comments that describe the reasoning behind changing (or not) each of
those call sites. I've also augmented our test suite in preparation for
this change over a few prior PRs.
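
As an illustration of what one such call site has to decide, here is a toy name resolver, sketched in Python under simplifying assumptions. `Kind`, `Scope`, and `resolve` are hypothetical stand-ins rather than Ruff's types, and the table tracks only the most recent binding per name.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional


class Kind(Enum):
    ASSIGNMENT = auto()
    ANNOTATION = auto()
    DELETION = auto()
    UNBOUND_EXCEPTION = auto()


@dataclass
class Scope:
    # Maps each name to the kind of its most recent binding in this scope.
    bindings: dict = field(default_factory=dict)


def resolve(scopes, name) -> Optional[Kind]:
    """Walk scopes from innermost to outermost; return the resolved kind, or None."""
    for scope in reversed(scopes):
        kind = scope.bindings.get(name)
        if kind is None:
            continue
        if kind is Kind.ANNOTATION:
            # A bare annotation (`name: str`) binds no value; keep looking outward.
            continue
        if kind in (Kind.DELETION, Kind.UNBOUND_EXCEPTION):
            # The name was unbound in this scope; stop and report it as unresolved.
            break
        return kind
    return None


module = Scope({"x": Kind.ASSIGNMENT, "y": Kind.ASSIGNMENT})
function = Scope({"x": Kind.DELETION})

print(resolve([module, function], "x"))  # None
print(resolve([module, function], "y"))  # Kind.ASSIGNMENT
print("x" in function.bindings)          # True
```

The deleted `x` is still present in the function's table, so later passes can see that it existed, yet a read of `x` does not resolve to the module-level assignment.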

### Alternatives

As an alternative, I considered introducing a flag to `BindingFlags`,
like `BindingFlags::UNBOUND`, and setting that at the appropriate time.

The flag turned out to be a much more difficult change, because we tend
to match on `BindingKind` all over the place (e.g., we have a bunch of
code blocks that only run when a `BindingKind` is
`BindingKind::Importation`). Since those sites already skip any kind
they don't match, introducing these new `BindingKind` variants requires
only a few changes at the client sites. A flag would've had to be
checked explicitly at each of them, which is a much wider-reaching
change.
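
The trade-off can be sketched with a toy call site. Everything below is hypothetical (`Kind`, `Binding`, the `unbound` field, and `live_imports` are illustrative names, not Ruff's API); it models `import os; del os` under each design.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Kind(Enum):
    ASSIGNMENT = auto()
    IMPORTATION = auto()
    DELETION = auto()


@dataclass
class Binding:
    kind: Kind
    unbound: bool = False  # only used by the hypothetical flag-based design


def live_imports(table):
    """An existing call site that only cares about imports, unaware of deletions."""
    return sorted(name for name, b in table.items() if b.kind is Kind.IMPORTATION)


# `import os; del os` under each design:
with_variant = {"os": Binding(Kind.DELETION)}
with_flag = {"os": Binding(Kind.IMPORTATION, unbound=True)}

print(live_imports(with_variant))  # []
print(live_imports(with_flag))     # ['os']
```

With a dedicated variant, the unchanged call site skips the deleted import on its own. With a flag, the same call site keeps treating `os` as a live import until someone updates it to check `unbound`.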
Charlie Marsh 2023-06-14 09:27:24 -04:00 committed by GitHub
parent bf5fbf8971
commit aa41ffcfde
11 changed files with 169 additions and 92 deletions


@@ -41,11 +41,20 @@ impl<'a> Binding<'a> {
     }
 
     /// Return `true` if this [`Binding`] represents an explicit re-export
-    /// (e.g., `import FastAPI as FastAPI`).
+    /// (e.g., `FastAPI` in `from fastapi import FastAPI as FastAPI`).
     pub const fn is_explicit_export(&self) -> bool {
         self.flags.contains(BindingFlags::EXPLICIT_EXPORT)
     }
 
+    /// Return `true` if this [`Binding`] represents an unbound variable
+    /// (e.g., `x` in `x = 1; del x`).
+    pub const fn is_unbound(&self) -> bool {
+        matches!(
+            self.kind,
+            BindingKind::Annotation | BindingKind::Deletion | BindingKind::UnboundException
+        )
+    }
+
     /// Return `true` if this binding redefines the given binding.
     pub fn redefines(&self, existing: &'a Binding) -> bool {
         match &self.kind {
@@ -83,10 +92,10 @@ impl<'a> Binding<'a> {
                     _ => {}
                 }
             }
-            BindingKind::Annotation => {
-                return false;
-            }
-            BindingKind::FutureImportation => {
+            BindingKind::Deletion
+            | BindingKind::Annotation
+            | BindingKind::FutureImportation
+            | BindingKind::Builtin => {
                 return false;
             }
             _ => {}
@@ -95,7 +104,6 @@ impl<'a> Binding<'a> {
             existing.kind,
             BindingKind::ClassDefinition
                 | BindingKind::FunctionDefinition
-                | BindingKind::Builtin
                 | BindingKind::Importation(..)
                 | BindingKind::FromImportation(..)
                 | BindingKind::SubmoduleImportation(..)
@@ -367,6 +375,24 @@ pub enum BindingKind<'a> {
     /// import foo.bar
     /// ```
     SubmoduleImportation(SubmoduleImportation<'a>),
+
+    /// A binding for a deletion, like `x` in:
+    /// ```python
+    /// del x
+    /// ```
+    Deletion,
+
+    /// A binding to unbind the local variable, like `x` in:
+    /// ```python
+    /// try:
+    ///     ...
+    /// except Exception as x:
+    ///     ...
+    /// ```
+    ///
+    /// After the `except` block, `x` is unbound, despite the lack
+    /// of an explicit `del` statement.
+    UnboundException,
 }
 
 bitflags! {


@@ -191,8 +191,9 @@ impl<'a> SemanticModel<'a> {
             .map_or(false, |binding| binding.kind.is_builtin())
     }
 
-    /// Return `true` if `member` is unbound.
-    pub fn is_unbound(&self, member: &str) -> bool {
+    /// Return `true` if `member` is an "available" symbol, i.e., a symbol that has not been bound
+    /// in the current scope, or in any containing scope.
+    pub fn is_available(&self, member: &str) -> bool {
         self.find_binding(member)
             .map_or(true, |binding| binding.kind.is_builtin())
     }
@@ -203,7 +204,7 @@ impl<'a> SemanticModel<'a> {
         // should prefer it over local resolutions.
         if self.in_forward_reference() {
             if let Some(binding_id) = self.scopes.global().get(symbol) {
-                if !self.bindings[binding_id].kind.is_annotation() {
+                if !self.bindings[binding_id].is_unbound() {
                     // Mark the binding as used.
                     let context = self.execution_context();
                     let reference_id = self.references.push(ScopeId::global(), range, context);
@@ -254,17 +255,29 @@ impl<'a> SemanticModel<'a> {
                     self.bindings[binding_id].references.push(reference_id);
                 }
 
-                // But if it's a type annotation, don't treat it as resolved. For example, given:
-                //
-                // ```python
-                // name: str
-                // print(name)
-                // ```
-                //
-                // The `name` in `print(name)` should be treated as unresolved, but the `name` in
-                // `name: str` should be treated as used.
-                if self.bindings[binding_id].kind.is_annotation() {
-                    continue;
+                match self.bindings[binding_id].kind {
+                    // If it's a type annotation, don't treat it as resolved. For example, given:
+                    //
+                    // ```python
+                    // name: str
+                    // print(name)
+                    // ```
+                    //
+                    // The `name` in `print(name)` should be treated as unresolved, but the `name` in
+                    // `name: str` should be treated as used.
+                    BindingKind::Annotation => continue,
+                    // If it's a deletion, don't treat it as resolved, since the name is now
+                    // unbound. For example, given:
+                    //
+                    // ```python
+                    // x = 1
+                    // del x
+                    // print(x)
+                    // ```
+                    //
+                    // The `x` in `print(x)` should be treated as unresolved.
+                    BindingKind::Deletion | BindingKind::UnboundException => break,
+                    _ => {}
                 }
 
                 return ResolvedRead::Resolved(binding_id);
@@ -618,9 +631,11 @@ impl<'a> SemanticModel<'a> {
     pub fn set_globals(&mut self, globals: Globals<'a>) {
         // If any global bindings don't already exist in the global scope, add them.
         for (name, range) in globals.iter() {
-            if self.global_scope().get(name).map_or(true, |binding_id| {
-                self.bindings[binding_id].kind.is_annotation()
-            }) {
+            if self
+                .global_scope()
+                .get(name)
+                .map_or(true, |binding_id| self.bindings[binding_id].is_unbound())
+            {
                 let id = self.bindings.push(Binding {
                     kind: BindingKind::Assignment,
                     range: *range,


@@ -38,9 +38,6 @@ pub struct Scope<'a> {
     /// In this case, the binding created by `x = 2` shadows the binding created by `x = 1`.
     shadowed_bindings: HashMap<BindingId, BindingId, BuildNoHashHasher<BindingId>>,
 
-    /// A list of all names that have been deleted in this scope.
-    deleted_symbols: Vec<&'a str>,
-
     /// Index into the globals arena, if the scope contains any globally-declared symbols.
     globals_id: Option<GlobalsId>,
 
@@ -56,7 +53,6 @@ impl<'a> Scope<'a> {
             star_imports: Vec::default(),
             bindings: FxHashMap::default(),
             shadowed_bindings: IntMap::default(),
-            deleted_symbols: Vec::default(),
             globals_id: None,
             flags: ScopeFlags::empty(),
         }
@@ -69,7 +65,6 @@ impl<'a> Scope<'a> {
             star_imports: Vec::default(),
             bindings: FxHashMap::default(),
             shadowed_bindings: IntMap::default(),
-            deleted_symbols: Vec::default(),
             globals_id: None,
             flags: ScopeFlags::empty(),
         }
@@ -92,7 +87,6 @@ impl<'a> Scope<'a> {
 
     /// Removes the binding with the given name.
     pub fn delete(&mut self, name: &'a str) -> Option<BindingId> {
-        self.deleted_symbols.push(name);
         self.bindings.remove(name)
     }
 
@@ -101,14 +95,6 @@ impl<'a> Scope<'a> {
         self.bindings.contains_key(name)
     }
 
-    /// Returns `true` if the scope declares a symbol with the given name.
-    ///
-    /// Unlike [`Scope::has`], the name may no longer be bound to a value (e.g., it could be
-    /// deleted).
-    pub fn declares(&self, name: &str) -> bool {
-        self.has(name) || self.deleted_symbols.contains(&name)
-    }
-
     /// Returns the ids of all bindings defined in this scope.
     pub fn binding_ids(&self) -> impl Iterator<Item = BindingId> + '_ {
         self.bindings.values().copied()