[ty] AST garbage collection (#18482)

## Summary

Garbage collect ASTs once we are done checking a given file. Queries
with a cross-file dependency on the AST will reparse the file on demand.
This reduces ty's peak memory usage by ~20-30%.

The primary change of this PR is adding a `node_index` field to every
AST node, assigned at parse time. `ParsedModule` can use this to
create a flat index of AST nodes any time the file is parsed (or
reparsed). This allows `AstNodeRef` to simply index into the current
instance of the `ParsedModule`, instead of storing a pointer directly.

The indices are assigned, somewhat hackily (using an atomic integer), by
the `parsed_module` query instead of by the parser directly. Assigning
the indices in source order in the (recursive) parser turns out to be
difficult, and collecting the nodes during semantic indexing is
impossible because `SemanticIndex` does not hold onto a specific
`ParsedModuleRef`, which the pointers in the flat AST are tied to. This
means that we have to do an extra AST traversal to assign the indices and
collect the nodes into a flat index, but the small performance impact
(~3% on cold runs) seems worth it for the memory savings.
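
As a rough illustration of the idea described above (not the actual ty/ruff code), the sketch below uses a toy tree. `Node`, `FlatIndex`, `build_index`, `NodeRef`, and `parse` are all hypothetical names invented for this example. It assumes indices are assigned in pre-order by a separate traversal, and that the field is atomic so it can be written through a shared reference.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Toy AST node (hypothetical); stands in for the real statement/expression nodes.
struct Node {
    /// Atomic so the index can be written through a shared `&Node`.
    node_index: AtomicU32,
    name: &'static str,
    children: Vec<Node>,
}

impl Node {
    fn new(name: &'static str, children: Vec<Node>) -> Node {
        // `u32::MAX` is a placeholder until the indexing pass runs.
        Node { node_index: AtomicU32::new(u32::MAX), name, children }
    }
}

/// Flat, source-ordered list of nodes borrowed from one parsed tree.
struct FlatIndex<'a> {
    nodes: Vec<&'a Node>,
}

/// The extra traversal: assign each node its pre-order position and collect it.
fn build_index(root: &Node) -> FlatIndex<'_> {
    fn visit<'a>(node: &'a Node, nodes: &mut Vec<&'a Node>) {
        node.node_index.store(nodes.len() as u32, Ordering::Relaxed);
        nodes.push(node);
        for child in &node.children {
            visit(child, nodes);
        }
    }
    let mut nodes = Vec::new();
    visit(root, &mut nodes);
    FlatIndex { nodes }
}

/// Index-based reference: holds no pointer, so it outlives any one parse.
#[derive(Clone, Copy)]
struct NodeRef(u32);

impl NodeRef {
    fn of(node: &Node) -> NodeRef {
        NodeRef(node.node_index.load(Ordering::Relaxed))
    }

    /// Look the node up in whichever parse is currently alive.
    fn resolve<'a>(self, index: &FlatIndex<'a>) -> &'a Node {
        index.nodes[self.0 as usize]
    }
}

/// Stand-in for parsing: returns a fresh tree for the same "source" each call.
fn parse() -> Node {
    Node::new("module", vec![
        Node::new("expr_a", vec![Node::new("string_a", vec![])]),
        Node::new("expr_b", vec![]),
    ])
}
```

Because a `NodeRef` holds only an integer, the first tree can be dropped and the file reparsed. Resolving the same `NodeRef` against the new index then yields the equivalent node, provided both parses visit nodes in the same order.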

Part of https://github.com/astral-sh/ty/issues/214.
Ibraheem Ahmed 2025-06-13 08:40:11 -04:00 committed by GitHub
parent 76d9009a6e
commit c9dff5c7d5
824 changed files with 25243 additions and 804 deletions

@@ -1,25 +1,28 @@
---
source: crates/ruff_python_parser/tests/fixtures.rs
input_file: crates/ruff_python_parser/resources/valid/expressions/string.py
snapshot_kind: text
---
## AST
```
Module(
ModModule {
node_index: AtomicNodeIndex(..),
range: 0..163,
body: [
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 0..13,
value: StringLiteral(
ExprStringLiteral {
node_index: AtomicNodeIndex(..),
range: 0..13,
value: StringLiteralValue {
inner: Single(
StringLiteral {
range: 0..13,
node_index: AtomicNodeIndex(..),
value: "Hello World",
flags: StringLiteralFlags {
quote_style: Single,
@@ -35,14 +38,17 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 14..20,
value: StringLiteral(
ExprStringLiteral {
node_index: AtomicNodeIndex(..),
range: 14..20,
value: StringLiteralValue {
inner: Single(
StringLiteral {
range: 14..20,
node_index: AtomicNodeIndex(..),
value: "😎",
flags: StringLiteralFlags {
quote_style: Double,
@@ -58,9 +64,11 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 21..32,
value: StringLiteral(
ExprStringLiteral {
node_index: AtomicNodeIndex(..),
range: 21..32,
value: StringLiteralValue {
inner: Concatenated(
@@ -68,6 +76,7 @@ Module(
strings: [
StringLiteral {
range: 21..26,
node_index: AtomicNodeIndex(..),
value: "Foo",
flags: StringLiteralFlags {
quote_style: Single,
@@ -77,6 +86,7 @@ Module(
},
StringLiteral {
range: 27..32,
node_index: AtomicNodeIndex(..),
value: "Bar",
flags: StringLiteralFlags {
quote_style: Single,
@@ -95,9 +105,11 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 33..60,
value: StringLiteral(
ExprStringLiteral {
node_index: AtomicNodeIndex(..),
range: 39..58,
value: StringLiteralValue {
inner: Concatenated(
@@ -105,6 +117,7 @@ Module(
strings: [
StringLiteral {
range: 39..42,
node_index: AtomicNodeIndex(..),
value: "A",
flags: StringLiteralFlags {
quote_style: Single,
@@ -114,6 +127,7 @@ Module(
},
StringLiteral {
range: 47..50,
node_index: AtomicNodeIndex(..),
value: "B",
flags: StringLiteralFlags {
quote_style: Single,
@@ -123,6 +137,7 @@ Module(
},
StringLiteral {
range: 55..58,
node_index: AtomicNodeIndex(..),
value: "C",
flags: StringLiteralFlags {
quote_style: Single,
@@ -141,14 +156,17 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 61..79,
value: StringLiteral(
ExprStringLiteral {
node_index: AtomicNodeIndex(..),
range: 61..79,
value: StringLiteralValue {
inner: Single(
StringLiteral {
range: 61..79,
node_index: AtomicNodeIndex(..),
value: "Olá, Mundo!",
flags: StringLiteralFlags {
quote_style: Single,
@@ -164,14 +182,17 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 80..91,
value: StringLiteral(
ExprStringLiteral {
node_index: AtomicNodeIndex(..),
range: 80..91,
value: StringLiteralValue {
inner: Single(
StringLiteral {
range: 80..91,
node_index: AtomicNodeIndex(..),
value: "ABCDE",
flags: StringLiteralFlags {
quote_style: Double,
@@ -187,9 +208,11 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 92..121,
value: StringLiteral(
ExprStringLiteral {
node_index: AtomicNodeIndex(..),
range: 98..119,
value: StringLiteralValue {
inner: Concatenated(
@@ -197,6 +220,7 @@ Module(
strings: [
StringLiteral {
range: 98..106,
node_index: AtomicNodeIndex(..),
value: "aB",
flags: StringLiteralFlags {
quote_style: Single,
@@ -206,6 +230,7 @@ Module(
},
StringLiteral {
range: 111..119,
node_index: AtomicNodeIndex(..),
value: "cD",
flags: StringLiteralFlags {
quote_style: Single,
@@ -224,14 +249,17 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 122..136,
value: BytesLiteral(
ExprBytesLiteral {
node_index: AtomicNodeIndex(..),
range: 122..136,
value: BytesLiteralValue {
inner: Single(
BytesLiteral {
range: 122..136,
node_index: AtomicNodeIndex(..),
value: [
104,
101,
@@ -259,15 +287,18 @@ Module(
),
Expr(
StmtExpr {
node_index: AtomicNodeIndex(..),
range: 137..161,
value: BytesLiteral(
ExprBytesLiteral {
node_index: AtomicNodeIndex(..),
range: 137..161,
value: BytesLiteralValue {
inner: Concatenated(
[
BytesLiteral {
range: 137..145,
node_index: AtomicNodeIndex(..),
value: [
98,
121,
@@ -283,6 +314,7 @@ Module(
},
BytesLiteral {
range: 146..161,
node_index: AtomicNodeIndex(..),
value: [
99,
111,