[ty] AST garbage collection (#18482)

## Summary

Garbage collect ASTs once we are done checking a given file. Queries with a cross-file dependency on the AST will reparse the file on demand. This reduces ty's peak memory usage by ~20-30%.

The primary change of this PR is adding a `node_index` field to every AST node, which is assigned by the parser. `ParsedModule` can use this to create a flat index of AST nodes any time the file is parsed (or reparsed). This allows `AstNodeRef` to simply index into the current instance of the `ParsedModule`, instead of storing a pointer directly.

The indices are somewhat hackily (using an atomic integer) assigned by the `parsed_module` query instead of by the parser directly. Assigning the indices in source order in the (recursive) parser turns out to be difficult, and collecting the nodes during semantic indexing is impossible, as `SemanticIndex` does not hold onto a specific `ParsedModuleRef`, which the pointers in the flat AST are tied to. This means that we have to do an extra AST traversal to assign and collect the nodes into a flat index, but the small performance impact (~3% on cold runs) seems worth it for the memory savings.

Part of https://github.com/astral-sh/ty/issues/214.
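The idea above can be illustrated with a small, self-contained sketch. It is not the code from this PR: `Node`, `NodeRef`, `parse_toy`, and `build_index` are hypothetical stand-ins, and a plain `AtomicU32` stands in for `AtomicNodeIndex`. It only shows why an index-based handle survives dropping and reparsing the AST, and why an atomic lets a post-parse traversal assign indices through a shared reference.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Placeholder value, analogous in spirit to a "dummy" index (assumption).
const DUMMY: u32 = u32::MAX;

/// Toy AST node (hypothetical; the real AST has many node types).
struct Node {
    name: &'static str,
    children: Vec<Node>,
    /// Atomic so the indexing pass can write it through `&Node`.
    node_index: AtomicU32,
}

impl Node {
    fn new(name: &'static str, children: Vec<Node>) -> Self {
        Node { name, children, node_index: AtomicU32::new(DUMMY) }
    }
}

/// Stand-in for the parser: always produces the same tree for the same "file".
fn parse_toy() -> Node {
    Node::new(
        "module",
        vec![
            Node::new("assign", vec![Node::new("name", vec![]), Node::new("number", vec![])]),
            Node::new("return", vec![]),
        ],
    )
}

/// Extra traversal: assign indices in source (pre-order) order and collect a flat index.
fn build_index(root: &Node) -> Vec<&Node> {
    fn visit<'a>(node: &'a Node, out: &mut Vec<&'a Node>) {
        node.node_index.store(out.len() as u32, Ordering::Relaxed);
        out.push(node);
        for child in &node.children {
            visit(child, out);
        }
    }
    let mut out = Vec::new();
    visit(root, &mut out);
    out
}

/// Handle that stores only an index, never a pointer into a particular parse.
#[derive(Clone, Copy)]
struct NodeRef {
    index: u32,
}

impl NodeRef {
    /// Resolve against whichever flat index is current.
    fn resolve<'a>(self, flat: &[&'a Node]) -> &'a Node {
        flat[self.index as usize]
    }
}

fn main() {
    let first = parse_toy();
    let flat = build_index(&first);
    let handle = NodeRef { index: flat[3].node_index.load(Ordering::Relaxed) };
    assert_eq!(handle.resolve(&flat).name, "number");

    // "Garbage collect" the AST: the handle stays valid because it is just an index.
    drop(flat);
    drop(first);

    // Reparse on demand and resolve the same handle against the new instance.
    let second = parse_toy();
    let flat = build_index(&second);
    assert_eq!(handle.resolve(&flat).name, "number");
}
```

The sketch relies on the parse being deterministic, so that the same source yields the same pre-order numbering on every reparse; a handle created against one parse then resolves to the corresponding node in any later one.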
2025-08-17 17:10:53 +00:00 · 2025-06-13 08:40:11 -04:00
commit c9dff5c7d5
parent 76d9009a6e
824 changed files with 25243 additions and 804 deletions
--- a/crates/ruff_python_parser/src/parser/mod.rs
+++ b/crates/ruff_python_parser/src/parser/mod.rs
@@ -2,7 +2,7 @@ use std::cmp::Ordering;

 use bitflags::bitflags;

-use ruff_python_ast::{Mod, ModExpression, ModModule};
+use ruff_python_ast::{AtomicNodeIndex, Mod, ModExpression, ModModule};
 use ruff_text_size::{Ranged, TextRange, TextSize};

 use crate::error::UnsupportedSyntaxError;
@@ -132,6 +132,7 @@ impl<'src> Parser<'src> {
        ModExpression {
            body: Box::new(parsed_expr.expr),
            range: self.node_range(start),
+            node_index: AtomicNodeIndex::dummy(),
        }
    }

@@ -149,6 +150,7 @@ impl<'src> Parser<'src> {
        ModModule {
            body,
            range: TextRange::new(self.start_offset, self.current_token_range().end()),
+            node_index: AtomicNodeIndex::dummy(),
        }
    }