gh-112529: Make the GC scheduling thread-safe (#114880)

The GC keeps track of the number of allocations (less deallocations) since the last GC. This buffers the count in thread-local state and uses atomic operations to modify the per-interpreter count. The thread-local buffering avoids contention on shared state. A consequence is that the GC scheduling is not as precise, so "test_sneaky_frame_object" is skipped because it requires that the GC be run exactly after allocating a frame object.
2025-11-20 02:50:14 +00:00 · 2024-02-16 11:22:27 -05:00 · 2024-02-16 11:22:27 -05:00 · b24c9161a6
commit b24c9161a6
parent f92857a930
7 changed files with 71 additions and 16 deletions
--- a/Modules/gcmodule.c
+++ b/Modules/gcmodule.c
@ -201,6 +201,16 @@ gc_get_count_impl(PyObject *module)
 /*[clinic end generated code: output=354012e67b16398f input=a392794a08251751]*/
 {
    GCState *gcstate = get_gc_state();
+
+#ifdef Py_GIL_DISABLED
+    _PyThreadStateImpl *tstate = (_PyThreadStateImpl *)_PyThreadState_GET();
+    struct _gc_thread_state *gc = &tstate->gc;
+
+    // Flush the local allocation count to the global count
+    _Py_atomic_add_int(&gcstate->generations[0].count, (int)gc->alloc_count);
+    gc->alloc_count = 0;
+#endif
+
    return Py_BuildValue("(iii)",
                         gcstate->generations[0].count,
                         gcstate->generations[1].count,