[ty] AST garbage collection (#18482)

## Summary

Garbage collect ASTs once we are done checking a given file. Queries
with a cross-file dependency on the AST will reparse the file on demand.
This reduces ty's peak memory usage by ~20-30%.

The primary change of this PR is adding a `node_index` field to every
AST node. `ParsedModule` uses this field to build a flat index of AST
nodes each time the file is parsed (or reparsed). This allows
`AstNodeRef` to simply index into the current instance of the
`ParsedModule`, instead of storing a pointer directly.
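
To illustrate the idea of an index-based node reference, here is a minimal sketch. It is not the actual ty implementation: `Node`, `Parsed`, and `NodeRef` are hypothetical stand-ins for the AST node types, `ParsedModule`, and `AstNodeRef`, and the "parser" just splits on whitespace. The point is only that a reference holding an index, rather than a pointer, stays usable after the original parse result is dropped and the file is reparsed.

```rust
/// Hypothetical stand-in for an AST node; real nodes are far richer.
#[derive(Debug, PartialEq)]
pub struct Node {
    /// Position of this node in the flat index of its parse result.
    pub node_index: u32,
    pub text: String,
}

/// Hypothetical stand-in for `ParsedModule`: owns a flat index of nodes.
pub struct Parsed {
    nodes: Vec<Node>,
}

impl Parsed {
    /// Toy "parser": one node per whitespace-separated token, indexed in
    /// source order. Parsing the same source twice yields the same indices.
    pub fn parse(source: &str) -> Parsed {
        let nodes = source
            .split_whitespace()
            .enumerate()
            .map(|(i, token)| Node {
                node_index: i as u32,
                text: token.to_string(),
            })
            .collect();
        Parsed { nodes }
    }

    pub fn nodes(&self) -> &[Node] {
        &self.nodes
    }
}

/// Hypothetical stand-in for `AstNodeRef`: stores an index, not a pointer,
/// so it does not borrow from (or keep alive) any particular parse result.
#[derive(Clone, Copy, Debug)]
pub struct NodeRef {
    index: u32,
}

impl NodeRef {
    pub fn new(node: &Node) -> NodeRef {
        NodeRef { index: node.node_index }
    }

    /// Resolve the reference against whichever parse result is current.
    pub fn node<'a>(&self, parsed: &'a Parsed) -> &'a Node {
        &parsed.nodes[self.index as usize]
    }
}
```

With this shape, a `NodeRef` created from one `Parsed` value can be resolved against a later `Parsed` value for the same source, which is what makes dropping the first one safe.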

The indices are assigned, somewhat hackily (using an atomic integer), by
the `parsed_module` query rather than by the parser directly. Assigning
the indices in source order in the (recursive) parser turns out to be
difficult, and collecting the nodes during semantic indexing is
impossible because `SemanticIndex` does not hold onto a specific
`ParsedModuleRef`, which the pointers in the flat AST are tied to. As a
result, we need an extra AST traversal to assign the indices and collect
the nodes into a flat index, but the small performance impact (~3% on
cold runs) seems worth the memory savings.
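
The extra traversal described above can be sketched roughly as follows. This is an assumption-laden illustration, not the code from this PR: `Node` and `assign_indices` are hypothetical names, and the tree is a toy. It shows one plausible reason an atomic integer helps: the index can be written through a shared `&Node` reference after parsing, while the same pass collects the nodes into a flat, source-ordered list.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical toy AST node with an atomically assignable index.
pub struct Node {
    /// Starts as a placeholder and is filled in by `assign_indices`.
    pub index: AtomicU32,
    pub name: &'static str,
    pub children: Vec<Node>,
}

impl Node {
    pub fn new(name: &'static str, children: Vec<Node>) -> Node {
        Node {
            index: AtomicU32::new(u32::MAX), // placeholder until assigned
            name,
            children,
        }
    }
}

/// Pre-order traversal: assigns each node the next free index and pushes
/// it onto `flat`, so that `flat[i]` is the node whose index is `i`.
/// Only shared references are needed because the index is atomic.
pub fn assign_indices<'a>(node: &'a Node, flat: &mut Vec<&'a Node>) {
    node.index.store(flat.len() as u32, Ordering::Relaxed);
    flat.push(node);
    for child in &node.children {
        assign_indices(child, flat);
    }
}
```

Running the same traversal on a reparsed tree of the same source produces the same indices, so an index recorded earlier still identifies the corresponding node.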

Part of https://github.com/astral-sh/ty/issues/214.
Ibraheem Ahmed 2025-06-13 08:40:11 -04:00 committed by GitHub
parent 76d9009a6e
commit c9dff5c7d5
824 changed files with 25243 additions and 804 deletions

@ -1,26 +1,30 @@
---
source: crates/ruff_python_parser/tests/fixtures.rs
input_file: crates/ruff_python_parser/resources/valid/expressions/subscript.py
snapshot_kind: text
---
## AST
```
Module(
ModModule {
node_index: AtomicNodeIndex(..),
range: 0..266,
body: [
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 0..10,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 0..10,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 0..7,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 0..4,
id: Name("data"),
ctx: Load,
@ -28,6 +32,7 @@ Module(
),
slice: NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 5..6,
value: Int(
0,
@ -39,6 +44,7 @@ Module(
),
slice: NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 8..9,
value: Int(
0,
@ -52,12 +58,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 11..21,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 11..21,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 11..15,
id: Name("data"),
ctx: Load,
@ -65,10 +74,12 @@ Module(
),
slice: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 16..20,
elts: [
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 16..17,
value: Int(
0,
@ -77,6 +88,7 @@ Module(
),
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 19..20,
value: Int(
1,
@ -95,12 +107,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 22..31,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 22..31,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 22..26,
id: Name("data"),
ctx: Load,
@ -108,14 +123,17 @@ Module(
),
slice: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 27..30,
elts: [
Slice(
ExprSlice {
node_index: AtomicNodeIndex(..),
range: 27..29,
lower: Some(
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 27..28,
value: Int(
0,
@ -139,12 +157,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 32..43,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 32..43,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 32..36,
id: Name("data"),
ctx: Load,
@ -152,14 +173,17 @@ Module(
),
slice: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 37..42,
elts: [
Slice(
ExprSlice {
node_index: AtomicNodeIndex(..),
range: 37..39,
lower: Some(
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 37..38,
value: Int(
0,
@ -173,6 +197,7 @@ Module(
),
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 41..42,
value: Int(
1,
@ -191,12 +216,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 44..56,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 44..56,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 44..48,
id: Name("data"),
ctx: Load,
@ -204,14 +232,17 @@ Module(
),
slice: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 49..55,
elts: [
Slice(
ExprSlice {
node_index: AtomicNodeIndex(..),
range: 49..52,
lower: Some(
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 49..50,
value: Int(
0,
@ -222,6 +253,7 @@ Module(
upper: Some(
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 51..52,
value: Int(
1,
@ -234,6 +266,7 @@ Module(
),
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 54..55,
value: Int(
2,
@ -252,12 +285,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 57..80,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 57..80,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 57..61,
id: Name("data"),
ctx: Load,
@ -265,14 +301,17 @@ Module(
),
slice: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 62..79,
elts: [
Slice(
ExprSlice {
node_index: AtomicNodeIndex(..),
range: 62..67,
lower: Some(
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 62..63,
value: Int(
0,
@ -283,6 +322,7 @@ Module(
upper: Some(
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 64..65,
value: Int(
1,
@ -293,6 +333,7 @@ Module(
step: Some(
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 66..67,
value: Int(
2,
@ -304,6 +345,7 @@ Module(
),
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 69..70,
value: Int(
3,
@ -312,10 +354,12 @@ Module(
),
Slice(
ExprSlice {
node_index: AtomicNodeIndex(..),
range: 72..79,
lower: Some(
Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 72..73,
id: Name("a"),
ctx: Load,
@ -325,9 +369,11 @@ Module(
upper: Some(
BinOp(
ExprBinOp {
node_index: AtomicNodeIndex(..),
range: 74..79,
left: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 74..75,
id: Name("b"),
ctx: Load,
@ -336,6 +382,7 @@ Module(
op: Add,
right: NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 78..79,
value: Int(
1,
@ -360,12 +407,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 81..93,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 81..93,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 81..85,
id: Name("data"),
ctx: Load,
@ -373,9 +423,11 @@ Module(
),
slice: Named(
ExprNamed {
node_index: AtomicNodeIndex(..),
range: 86..92,
target: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 86..87,
id: Name("a"),
ctx: Store,
@ -383,6 +435,7 @@ Module(
),
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 91..92,
id: Name("b"),
ctx: Load,
@ -397,12 +450,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 94..106,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 94..106,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 94..98,
id: Name("data"),
ctx: Load,
@ -410,10 +466,12 @@ Module(
),
slice: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 99..105,
elts: [
Slice(
ExprSlice {
node_index: AtomicNodeIndex(..),
range: 99..100,
lower: None,
upper: None,
@ -422,11 +480,13 @@ Module(
),
Slice(
ExprSlice {
node_index: AtomicNodeIndex(..),
range: 102..105,
lower: None,
upper: Some(
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 103..105,
value: Int(
11,
@ -449,12 +509,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 107..120,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 107..120,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 107..111,
id: Name("data"),
ctx: Load,
@ -462,10 +525,12 @@ Module(
),
slice: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 112..119,
elts: [
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 112..113,
value: Int(
1,
@ -474,6 +539,7 @@ Module(
),
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 115..116,
value: Int(
2,
@ -482,6 +548,7 @@ Module(
),
NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 118..119,
value: Int(
3,
@ -500,12 +567,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 121..132,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 121..132,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 121..125,
id: Name("data"),
ctx: Load,
@ -513,10 +583,12 @@ Module(
),
slice: UnaryOp(
ExprUnaryOp {
node_index: AtomicNodeIndex(..),
range: 126..131,
op: Invert,
operand: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 127..131,
id: Name("flag"),
ctx: Load,
@ -531,12 +603,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 133..148,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 133..148,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 133..137,
id: Name("data"),
ctx: Load,
@ -544,13 +619,16 @@ Module(
),
slice: Slice(
ExprSlice {
node_index: AtomicNodeIndex(..),
range: 138..147,
lower: Some(
Named(
ExprNamed {
node_index: AtomicNodeIndex(..),
range: 139..145,
target: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 139..140,
id: Name("a"),
ctx: Store,
@ -558,6 +636,7 @@ Module(
),
value: NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 144..145,
value: Int(
0,
@ -578,12 +657,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 149..165,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 149..165,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 149..153,
id: Name("data"),
ctx: Load,
@ -591,13 +673,16 @@ Module(
),
slice: Slice(
ExprSlice {
node_index: AtomicNodeIndex(..),
range: 154..164,
lower: Some(
Named(
ExprNamed {
node_index: AtomicNodeIndex(..),
range: 155..161,
target: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 155..156,
id: Name("a"),
ctx: Store,
@ -605,6 +690,7 @@ Module(
),
value: NumberLiteral(
ExprNumberLiteral {
node_index: AtomicNodeIndex(..),
range: 160..161,
value: Int(
0,
@ -617,6 +703,7 @@ Module(
upper: Some(
Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 163..164,
id: Name("y"),
ctx: Load,
@ -633,12 +720,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 226..234,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 226..234,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 226..230,
id: Name("data"),
ctx: Load,
@ -646,13 +736,16 @@ Module(
),
slice: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 231..233,
elts: [
Starred(
ExprStarred {
node_index: AtomicNodeIndex(..),
range: 231..233,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 232..233,
id: Name("x"),
ctx: Load,
@ -673,12 +766,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 235..249,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 235..249,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 235..239,
id: Name("data"),
ctx: Load,
@ -686,18 +782,22 @@ Module(
),
slice: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 240..248,
elts: [
Starred(
ExprStarred {
node_index: AtomicNodeIndex(..),
range: 240..248,
value: BoolOp(
ExprBoolOp {
node_index: AtomicNodeIndex(..),
range: 241..248,
op: And,
values: [
Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 241..242,
id: Name("x"),
ctx: Load,
@ -705,6 +805,7 @@ Module(
),
Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 247..248,
id: Name("y"),
ctx: Load,
@ -728,12 +829,15 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 250..265,
value: Subscript(
ExprSubscript {
node_index: AtomicNodeIndex(..),
range: 250..265,
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 250..254,
id: Name("data"),
ctx: Load,
@ -741,16 +845,20 @@ Module(
),
slice: Tuple(
ExprTuple {
node_index: AtomicNodeIndex(..),
range: 255..264,
elts: [
Starred(
ExprStarred {
node_index: AtomicNodeIndex(..),
range: 255..264,
value: Named(
ExprNamed {
node_index: AtomicNodeIndex(..),
range: 257..263,
target: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 257..258,
id: Name("x"),
ctx: Store,
@ -758,6 +866,7 @@ Module(
),
value: Name(
ExprName {
node_index: AtomicNodeIndex(..),
range: 262..263,
id: Name("y"),
ctx: Load,