GH-133136: Revise QSBR to reduce excess memory held (gh-135473)

The free threading build uses QSBR to delay the freeing of dictionary
keys and list arrays when the objects are accessed by multiple threads
in order to allow concurrent reads to proceed without holding the object
lock. The requests are processed in batches to reduce execution
overhead, but for large memory blocks this can lead to excess memory
usage.

Take into account the size of the memory block when deciding when to
process QSBR requests.
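
A minimal sketch of a size-aware batching decision, not the code from this change. The struct, the function, and both limits (`sketch_deferred`, `sketch_defer_free`, `SKETCH_COUNT_LIMIT`, `SKETCH_MEM_LIMIT`) are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thresholds, chosen only for illustration. */
#define SKETCH_COUNT_LIMIT 10
#define SKETCH_MEM_LIMIT   ((size_t)1024 * 1024)

/* Hypothetical per-thread bookkeeping for deferred frees. */
struct sketch_deferred {
    size_t count;   /* frees deferred since the last flush */
    size_t memory;  /* bytes deferred since the last flush */
};

/* Record a deferred free of `size` bytes.  Returns true when the batch
   should be flushed: either enough requests have piled up, or the
   pending bytes exceed the memory limit. */
static bool
sketch_defer_free(struct sketch_deferred *d, size_t size)
{
    assert(d != NULL);
    d->count++;
    d->memory += size;
    if (d->count >= SKETCH_COUNT_LIMIT || d->memory > SKETCH_MEM_LIMIT) {
        d->count = 0;
        d->memory = 0;
        return true;
    }
    return false;
}
```

With a count-only rule, ten pending requests trigger a flush regardless of size. Adding the byte counter means a single large block can trigger one on its own.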

Also track the amount of memory being held by QSBR for mimalloc pages.  Advance the write sequence if this memory exceeds a limit.  Advancing the sequence allows the held pages to be freed sooner.
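
One way to picture the page accounting is the single-threaded sketch below. It is an assumption-laden illustration: the types, `sketch_hold_page()`, the increment of 2, and the 64 KiB limit are all hypothetical, and real code would need atomic operations on the shared sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants, for illustration only. */
#define SKETCH_INCR            2
#define SKETCH_PAGE_MEM_LIMIT  ((size_t)64 * 1024)

struct sketch_shared {
    uint64_t wr_seq;            /* global write sequence */
};

struct sketch_thread {
    struct sketch_shared *shared;
    size_t held_page_memory;    /* bytes in pages waiting on QSBR */
};

/* Goal for a deferred free: one step past the current write sequence,
   without advancing it. */
static uint64_t
sketch_next(const struct sketch_shared *s)
{
    return s->wr_seq + SKETCH_INCR;
}

/* Advance the write sequence and return the new value. */
static uint64_t
sketch_advance(struct sketch_shared *s)
{
    s->wr_seq += SKETCH_INCR;
    return s->wr_seq;
}

/* Account for a page handed to QSBR and return its goal sequence.
   Once the held bytes exceed the limit, advance the write sequence
   instead of only computing the next value. */
static uint64_t
sketch_hold_page(struct sketch_thread *t, size_t page_size)
{
    assert(t != NULL && t->shared != NULL);
    t->held_page_memory += page_size;
    if (t->held_page_memory > SKETCH_PAGE_MEM_LIMIT) {
        t->held_page_memory = 0;
        return sketch_advance(t->shared);
    }
    return sketch_next(t->shared);
}
```

Both paths return the same goal value. The difference is that advancing eagerly lets the goal be met as soon as the other threads pass through a quiescent state, rather than waiting for a later advance.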

Process the held QSBR items from the "eval breaker", rather than from `_PyMem_FreeDelayed()`.  This gives a higher chance that the global read sequence has advanced enough so that items can be freed.
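
The idea of draining the queue at a safe point can be sketched as follows. This is a simplified model, not the interpreter's code: `sketch_item`, `sketch_thread`, and `sketch_safe_point()` are hypothetical, the queue is a plain array, and sequence wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* A deferred free: the pointer and the sequence it must wait for. */
struct sketch_item {
    void *ptr;
    uint64_t goal;
};

/* Hypothetical per-thread queue; items are in FIFO order, so their
   goals are non-decreasing. */
struct sketch_thread {
    bool should_process;        /* set when a flush was requested */
    struct sketch_item *items;
    size_t head;                /* index of the oldest pending item */
    size_t count;               /* total number of items */
};

/* Called at a safe point.  Frees every leading item whose goal has
   been reached by the read sequence and returns how many were freed. */
static size_t
sketch_safe_point(struct sketch_thread *t, uint64_t rd_seq)
{
    size_t freed = 0;
    assert(t != NULL);
    if (!t->should_process) {
        return 0;
    }
    while (t->head < t->count && t->items[t->head].goal <= rd_seq) {
        free(t->items[t->head].ptr);
        t->items[t->head].ptr = NULL;
        t->head++;
        freed++;
    }
    /* Keep the flag set only while work remains. */
    t->should_process = (t->head < t->count);
    return freed;
}
```

Because the check runs some time after the free was requested, the read sequence has had time to move forward, so more of the queue tends to be eligible when it is processed.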

Co-authored-by: Sam Gross <colesbury@gmail.com>
Neil Schemenauer 2025-06-25 00:06:32 -07:00 committed by GitHub
parent 18d32fb646
commit 113de8545f
9 changed files with 129 additions and 27 deletions

@@ -1387,6 +1387,10 @@ _Py_HandlePending(PyThreadState *tstate)
         _Py_unset_eval_breaker_bit(tstate, _PY_EVAL_EXPLICIT_MERGE_BIT);
         _Py_brc_merge_refcounts(tstate);
     }
+    /* Process deferred memory frees held by QSBR */
+    if (_Py_qsbr_should_process(((_PyThreadStateImpl *)tstate)->qsbr)) {
+        _PyMem_ProcessDelayed(tstate);
+    }
 #endif
     /* GC scheduled to run */

@@ -41,10 +41,6 @@
 // Starting size of the array of qsbr thread states
 #define MIN_ARRAY_SIZE 8
-// For _Py_qsbr_deferred_advance(): the number of deferrals before advancing
-// the write sequence.
-#define QSBR_DEFERRED_LIMIT 10
 // Allocate a QSBR thread state from the freelist
 static struct _qsbr_thread_state *
 qsbr_allocate(struct _qsbr_shared *shared)
@@ -117,13 +113,9 @@ _Py_qsbr_advance(struct _qsbr_shared *shared)
 }
 uint64_t
-_Py_qsbr_deferred_advance(struct _qsbr_thread_state *qsbr)
+_Py_qsbr_shared_next(struct _qsbr_shared *shared)
 {
-    if (++qsbr->deferrals < QSBR_DEFERRED_LIMIT) {
-        return _Py_qsbr_shared_current(qsbr->shared) + QSBR_INCR;
-    }
-    qsbr->deferrals = 0;
-    return _Py_qsbr_advance(qsbr->shared);
+    return _Py_qsbr_shared_current(shared) + QSBR_INCR;
 }
 static uint64_t