[ty] AST garbage collection (#18482)

## Summary

Garbage collect ASTs once we are done checking a given file. Queries
with a cross-file dependency on the AST will reparse the file on demand.
This reduces ty's peak memory usage by ~20-30%.

The primary change of this PR is adding a `node_index` field to every
AST node, which the parser initializes with a placeholder
(`AtomicNodeIndex::dummy()`). `ParsedModule` can use this field to create
a flat index of AST nodes any time the file is parsed (or reparsed). This
allows `AstNodeRef` to simply index into the current instance of the
`ParsedModule`, instead of storing a pointer directly.
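
To illustrate the idea of index-based node references, here is a minimal sketch. All types (`Node`, `ParsedModule`, `NodeRef`) and the toy `parse` function are simplified, hypothetical stand-ins and not ty's actual definitions. The sketch only shows why a stored index remains usable after the AST is dropped and the file is reparsed, whereas a stored pointer would not.

```rust
// Hypothetical, simplified stand-ins; these are NOT ty's real types.
#[derive(Debug, PartialEq)]
struct Node {
    name: String,
}

/// One parse of a file: a flat, source-ordered list of its nodes.
struct ParsedModule {
    nodes: Vec<Node>,
}

/// Stores only a position in the flat index, never a pointer into a parse.
#[derive(Clone, Copy)]
struct NodeRef {
    index: u32,
}

impl NodeRef {
    /// Resolve the reference against whichever parse is currently alive.
    fn node<'a>(&self, module: &'a ParsedModule) -> &'a Node {
        &module.nodes[self.index as usize]
    }
}

/// Toy "parser": every whitespace-separated token becomes one node.
fn parse(source: &str) -> ParsedModule {
    ParsedModule {
        nodes: source
            .split_whitespace()
            .map(|name| Node { name: name.to_string() })
            .collect(),
    }
}

fn main() {
    let source = "x y z";
    let y = NodeRef { index: 1 };

    let first = parse(source);
    assert_eq!(y.node(&first).name, "y");

    // The AST is "garbage collected" once checking is done...
    drop(first);

    // ...and a later query reparses on demand. The same index still
    // resolves to the same node, because parsing is deterministic.
    let second = parse(source);
    assert_eq!(y.node(&second).name, "y");
}
```

The sketch relies on the parse being deterministic for unchanged source, so that a given index always denotes the same node.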

The indices are assigned, somewhat hackily (using an atomic integer), by
the `parsed_module` query rather than by the parser directly. Assigning
the indices in source order in the (recursive) parser turns out to be
difficult, and collecting the nodes during semantic indexing is
impossible because `SemanticIndex` does not hold onto a specific
`ParsedModuleRef`, which the pointers in the flat AST are tied to. As a
result, we have to do an extra AST traversal to assign the indices and
collect the nodes into a flat index, but the small performance impact
(~3% on cold runs) seems worth it for the memory savings.
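
The extra traversal can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `NodeIndex`, `Node`, and `assign_indices` are hypothetical names, and the sketch assumes the atomic integer is there so that an index can be written through a shared reference to an already-built tree.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical stand-in for the atomic index carried by each node.
struct NodeIndex(AtomicU32);

impl NodeIndex {
    /// Placeholder value used until the traversal assigns a real index.
    const DUMMY: u32 = u32::MAX;

    fn dummy() -> Self {
        NodeIndex(AtomicU32::new(Self::DUMMY))
    }

    fn set(&self, value: u32) {
        self.0.store(value, Ordering::Relaxed);
    }

    fn get(&self) -> u32 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Hypothetical, simplified AST node.
struct Node {
    name: &'static str,
    node_index: NodeIndex,
    children: Vec<Node>,
}

/// What a parser would do in this sketch: build nodes with a dummy index.
fn node(name: &'static str, children: Vec<Node>) -> Node {
    Node { name, node_index: NodeIndex::dummy(), children }
}

/// Pre-order walk over a shared reference: give each node the next index
/// and record it in the flat list, so that `flat[i]` has index `i`.
fn assign_indices<'a>(node: &'a Node, flat: &mut Vec<&'a Node>) {
    node.node_index.set(flat.len() as u32);
    flat.push(node);
    for child in &node.children {
        assign_indices(child, flat);
    }
}

fn main() {
    let tree = node(
        "module",
        vec![
            node("assign", vec![node("x", vec![]), node("1", vec![])]),
            node("expr", vec![]),
        ],
    );

    let mut flat = Vec::new();
    assign_indices(&tree, &mut flat);

    for n in &flat {
        println!("{} -> {}", n.node_index.get(), n.name);
    }
}
```

Because the index lives in an atomic, the walk needs only `&Node`, which is what makes assigning indices after parsing possible without rebuilding or mutably borrowing the tree.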

Part of https://github.com/astral-sh/ty/issues/214.
Ibraheem Ahmed 2025-06-13 08:40:11 -04:00 committed by GitHub
parent 76d9009a6e
commit c9dff5c7d5
824 changed files with 25243 additions and 804 deletions


@@ -3,7 +3,7 @@
 use bstr::ByteSlice;
 use std::fmt;
-use ruff_python_ast::{self as ast, AnyStringFlags, Expr, StringFlags};
+use ruff_python_ast::{self as ast, AnyStringFlags, AtomicNodeIndex, Expr, StringFlags};
 use ruff_text_size::{Ranged, TextRange, TextSize};
 use crate::{
@@ -287,6 +287,7 @@ impl StringParser {
 return Ok(ast::InterpolatedStringLiteralElement {
 value: self.source,
 range: self.range,
+node_index: AtomicNodeIndex::dummy(),
 });
 };
@@ -364,6 +365,7 @@ impl StringParser {
 Ok(ast::InterpolatedStringLiteralElement {
 value: value.into_boxed_str(),
 range: self.range,
+node_index: AtomicNodeIndex::dummy(),
 })
 }
@@ -385,6 +387,7 @@ impl StringParser {
 value: self.source.into_boxed_bytes(),
 range: self.range,
 flags: self.flags.into(),
+node_index: AtomicNodeIndex::dummy(),
 }));
 }
@@ -394,6 +397,7 @@ impl StringParser {
 value: self.source.into_boxed_bytes(),
 range: self.range,
 flags: self.flags.into(),
+node_index: AtomicNodeIndex::dummy(),
 }));
 };
@@ -431,6 +435,7 @@ impl StringParser {
 value: value.into_boxed_slice(),
 range: self.range,
 flags: self.flags.into(),
+node_index: AtomicNodeIndex::dummy(),
 }))
 }
@@ -441,6 +446,7 @@ impl StringParser {
 value: self.source,
 range: self.range,
 flags: self.flags.into(),
+node_index: AtomicNodeIndex::dummy(),
 }));
 }
@@ -450,6 +456,7 @@ impl StringParser {
 value: self.source,
 range: self.range,
 flags: self.flags.into(),
+node_index: AtomicNodeIndex::dummy(),
 }));
 };
@@ -487,6 +494,7 @@ impl StringParser {
 value: value.into_boxed_str(),
 range: self.range,
 flags: self.flags.into(),
+node_index: AtomicNodeIndex::dummy(),
 }))
 }