[ty] AST garbage collection (#18482)

## Summary

Garbage collect ASTs once we are done checking a given file. Queries
with a cross-file dependency on the AST will reparse the file on demand.
This reduces ty's peak memory usage by ~20-30%.

The primary change of this PR is adding a `node_index` field to every
AST node, assigned each time the file is parsed. `ParsedModule` can use
this to create a flat index of AST nodes any time the file is parsed (or
reparsed). This allows `AstNodeRef` to simply index into the current
instance of the `ParsedModule` instead of storing a pointer directly.
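
To illustrate why an index survives garbage collection where a pointer would not, here is a minimal sketch. The types are simplified, hypothetical stand-ins that only borrow the names `ParsedModule` and `AstNodeRef`; the `Node` enum, the whitespace "parser", and the `node` accessor are assumptions for illustration, not ty's actual API.

```rust
// Hypothetical, simplified stand-ins; not ty's real types.
#[derive(Debug, PartialEq)]
pub enum Node {
    Name(String),
    Int(i64),
}

/// A parsed file whose nodes are collected into a flat, source-ordered index.
pub struct ParsedModule {
    nodes: Vec<Node>,
}

impl ParsedModule {
    /// Stand-in for parsing: the same source always yields the same node order,
    /// so an index means the same thing after a reparse.
    pub fn parse(source: &str) -> Self {
        let nodes = source
            .split_whitespace()
            .map(|tok| match tok.parse::<i64>() {
                Ok(n) => Node::Int(n),
                Err(_) => Node::Name(tok.to_string()),
            })
            .collect();
        ParsedModule { nodes }
    }
}

/// Stores only an index, so it does not keep any particular AST alive.
#[derive(Clone, Copy)]
pub struct AstNodeRef {
    index: u32,
}

impl AstNodeRef {
    pub fn new(index: u32) -> Self {
        AstNodeRef { index }
    }

    /// Resolve against whichever `ParsedModule` instance is current.
    pub fn node<'a>(&self, module: &'a ParsedModule) -> &'a Node {
        &module.nodes[self.index as usize]
    }
}
```

In this sketch, a reference created against one parse can be resolved against a later reparse of the same source, because it carries no pointer into the dropped AST.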

The indices are assigned, somewhat hackily (using an atomic integer), by
the `parsed_module` query rather than by the parser directly. Assigning
the indices in source order in the (recursive) parser turns out to be
difficult, and collecting the nodes during semantic indexing is
impossible because `SemanticIndex` does not hold onto a specific
`ParsedModuleRef`, which the pointers in the flat AST are tied to. This
means we need an extra AST traversal to assign the indices and collect
the nodes into a flat index, but the small performance impact (~3% on
cold runs) seems worth the memory savings.
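
The extra traversal can be sketched roughly as follows. This assumes the index is stored in an atomic integer so it can be written through a shared reference after parsing; the `Expr` struct, the `assign_indices` function, and the `leaf` helper are hypothetical, and only the `AtomicNodeIndex` name comes from the snapshots in this commit.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical stand-in for the `node_index` field. Interior mutability lets
/// the index be set on an already-built, shared AST.
#[derive(Debug, Default)]
pub struct AtomicNodeIndex(AtomicU32);

impl AtomicNodeIndex {
    pub fn set(&self, value: u32) {
        self.0.store(value, Ordering::Relaxed);
    }

    pub fn get(&self) -> u32 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Simplified AST node (an assumption for illustration).
#[derive(Debug, Default)]
pub struct Expr {
    pub node_index: AtomicNodeIndex,
    pub label: &'static str,
    pub children: Vec<Expr>,
}

/// Pre-order walk: give each node the next index and record it in the flat list,
/// so that `flat[i]` is the node whose index is `i`.
pub fn assign_indices<'a>(node: &'a Expr, flat: &mut Vec<&'a Expr>) {
    node.node_index.set(flat.len() as u32);
    flat.push(node);
    for child in &node.children {
        assign_indices(child, flat);
    }
}

/// Convenience constructor for a childless node.
pub fn leaf(label: &'static str) -> Expr {
    Expr { label, ..Default::default() }
}
```

Because the walk visits nodes in the same order on every parse of the same source, the indices it hands out are stable across reparses in this sketch.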

Part of https://github.com/astral-sh/ty/issues/214.
Ibraheem Ahmed 2025-06-13 08:40:11 -04:00 committed by GitHub
parent 76d9009a6e
commit c9dff5c7d5
824 changed files with 25243 additions and 804 deletions


@ -1,23 +1,26 @@
---
source: crates/ruff_python_parser/tests/fixtures.rs
input_file: crates/ruff_python_parser/resources/valid/expressions/await.py
snapshot_kind: text
---
## AST
```
Module(
ModModule {
node_index: AtomicNodeIndex(..),
range: 0..211,
body: [
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 0..7,
value: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 0..7,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 6..7,
id: Name("x"),
ctx: Load,
@ -29,15 +32,19 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 8..19,
value: BinOp(
ExprBinOp {
node_index: AtomicNodeIndex(..),
range: 8..19,
left: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 8..15,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 14..15,
id: Name("x"),
ctx: Load,
@ -48,6 +55,7 @@ Module(
op: Add,
right: NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 18..19,
value: Int(
1,
@ -60,17 +68,21 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 20..33,
value: BoolOp(
ExprBoolOp {
node_index: AtomicNodeIndex(..),
range: 20..33,
op: And,
values: [
Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 20..27,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 26..27,
id: Name("a"),
ctx: Load,
@ -80,6 +92,7 @@ Module(
),
Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 32..33,
id: Name("b"),
ctx: Load,
@ -92,15 +105,19 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 34..43,
value: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 34..43,
value: Call(
ExprCall {
node_index: AtomicNodeIndex(..),
range: 40..43,
func: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 40..41,
id: Name("f"),
ctx: Load,
@ -108,6 +125,7 @@ Module(
),
arguments: Arguments {
range: 41..43,
node_index: AtomicNodeIndex(..),
args: [],
keywords: [],
},
@ -119,16 +137,20 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 44..56,
value: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 44..56,
value: List(
ExprList {
node_index: AtomicNodeIndex(..),
range: 50..56,
elts: [
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 51..52,
value: Int(
1,
@ -137,6 +159,7 @@ Module(
),
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 54..55,
value: Int(
2,
@ -153,16 +176,20 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 57..69,
value: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 57..69,
value: Set(
ExprSet {
node_index: AtomicNodeIndex(..),
range: 63..69,
elts: [
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 64..65,
value: Int(
3,
@ -171,6 +198,7 @@ Module(
),
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 67..68,
value: Int(
4,
@ -186,18 +214,22 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 70..82,
value: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 70..82,
value: Dict(
ExprDict {
node_index: AtomicNodeIndex(..),
range: 76..82,
items: [
DictItem {
key: Some(
Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 77..78,
id: Name("i"),
ctx: Load,
@ -206,6 +238,7 @@ Module(
),
value: NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 80..81,
value: Int(
5,
@ -222,16 +255,20 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 83..93,
value: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 83..93,
elts: [
Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 83..90,
value: NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 89..90,
value: Int(
7,
@ -242,6 +279,7 @@ Module(
),
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 92..93,
value: Int(
8,
@ -257,16 +295,20 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 94..107,
value: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 94..107,
value: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 100..107,
elts: [
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 101..102,
value: Int(
9,
@ -275,6 +317,7 @@ Module(
),
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 104..106,
value: Int(
10,
@ -292,15 +335,19 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 108..120,
value: Compare(
ExprCompare {
node_index: AtomicNodeIndex(..),
range: 108..120,
left: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 108..115,
value: NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 114..115,
value: Int(
1,
@ -315,6 +362,7 @@ Module(
comparators: [
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 119..120,
value: Int(
1,
@ -328,21 +376,26 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 121..146,
value: If(
ExprIf {
node_index: AtomicNodeIndex(..),
range: 121..146,
test: BooleanLiteral(
ExprBooleanLiteral {
node_index: AtomicNodeIndex(..),
range: 132..136,
value: true,
},
),
body: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 121..128,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 127..128,
id: Name("x"),
ctx: Load,
@ -352,6 +405,7 @@ Module(
),
orelse: NoneLiteral(
ExprNoneLiteral {
node_index: AtomicNodeIndex(..),
range: 142..146,
},
),
@ -361,19 +415,24 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 147..158,
value: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 147..158,
value: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 153..158,
elts: [
Starred(
ExprStarred {
node_index: AtomicNodeIndex(..),
range: 154..156,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 155..156,
id: Name("x"),
ctx: Load,
@ -393,25 +452,34 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 159..178,
value: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 159..178,
value: Lambda(
ExprLambda {
node_index: AtomicNodeIndex(..),
range: 166..177,
parameters: Some(
Parameters {
range: 173..174,
node_index: AtomicNodeIndex(
0,
),
posonlyargs: [],
args: [
ParameterWithDefault {
range: 173..174,
node_index: AtomicNodeIndex(..),
parameter: Parameter {
range: 173..174,
node_index: AtomicNodeIndex(..),
name: Identifier {
id: Name("x"),
range: 173..174,
node_index: AtomicNodeIndex(..),
},
annotation: None,
},
@ -425,6 +493,7 @@ Module(
),
body: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 176..177,
id: Name("x"),
ctx: Load,
@ -438,15 +507,19 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 179..192,
value: BinOp(
ExprBinOp {
node_index: AtomicNodeIndex(..),
range: 179..192,
left: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 179..186,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 185..186,
id: Name("x"),
ctx: Load,
@ -457,10 +530,12 @@ Module(
op: Pow,
right: UnaryOp(
ExprUnaryOp {
node_index: AtomicNodeIndex(..),
range: 190..192,
op: USub,
operand: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 191..192,
id: Name("x"),
ctx: Load,
@ -474,15 +549,19 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 193..211,
value: BinOp(
ExprBinOp {
node_index: AtomicNodeIndex(..),
range: 193..211,
left: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 193..200,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 199..200,
id: Name("x"),
ctx: Load,
@ -493,9 +572,11 @@ Module(
op: Pow,
right: Await(
ExprAwait {
node_index: AtomicNodeIndex(..),
range: 204..211,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 210..211,
id: Name("y"),
ctx: Load,