gh-114271: Fix race in Thread.join() (#114839)

There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:

1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
   out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
   out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
   However, C holds it, so the assertion fails.
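
The interleaving above can be replayed deterministically in a single thread. The following is a minimal sketch, not the actual `threading` internals: a plain `threading.Lock` stands in for A's `_tstate_lock`, the steps of B and C are executed in the problematic order by hand, and `simulate_interleaving()` is a hypothetical helper name.

```python
import threading


def simulate_interleaving():
    """Replay steps 1-7 by hand; return what B's check in step 7 would see."""
    # Stand-in for A's `_tstate_lock` (assumption: a plain Lock is close enough
    # to illustrate the ordering problem).
    tstate_lock = threading.Lock()

    # Step 1: A starts and holds its lock while running.
    tstate_lock.acquire()

    # Steps 2-3: B and C would both block on the lock here.

    # Step 4: A finishes and releases the lock.
    tstate_lock.release()

    # Step 5: B acquires the lock and releases it, then is "swapped out"
    # before it performs its unlocked check.
    assert tstate_lock.acquire(blocking=False)
    tstate_lock.release()

    # Step 6: C acquires the lock and is "swapped out" before releasing it.
    assert tstate_lock.acquire(blocking=False)

    # Step 7: B resumes and checks that the lock is not held.
    lock_is_free = not tstate_lock.locked()

    # Clean up on behalf of C.
    tstate_lock.release()
    return lock_is_free


if __name__ == "__main__":
    # B expects the lock to be free, but C still holds it.
    print("lock free when B checks:", simulate_interleaving())
```

The returned value is `False`, which corresponds to the failing assertion in step 7.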

The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run `repro_join_race.py` from the linked repo.

There are two main parts to this PR:

1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
   The event is set by the runtime prior to the thread being cleared (in the same
   place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
   event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
   non-daemon threads to exit. To do so, an `is_daemon` field was added to
   `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
   now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
   `_tstate_lock`s.
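
The two parts can be illustrated in pure Python. This is only a sketch of the idea, under stated assumptions: the real change lives in the C runtime, `threading.Event` stands in for the event attached to `PyThreadState`, and `ThreadRecord`, `wait_for_threads()` and `demo()` are hypothetical names that do not exist in CPython.

```python
import threading
import time


class ThreadRecord:
    """Hypothetical stand-in for the per-thread state described above."""

    def __init__(self, target, is_daemon=False):
        self.is_daemon = is_daemon      # recorded when the thread is created
        self.done = threading.Event()   # set once on exit, never reset
        self._thread = threading.Thread(
            target=self._run, args=(target,), daemon=is_daemon
        )

    def _run(self, target):
        try:
            target()
        finally:
            # Signal every waiter at once; there is no lock to hand around.
            self.done.set()

    def start(self):
        self._thread.start()

    def join(self, timeout=None):
        # Any number of joiners can wait concurrently on the same event.
        return self.done.wait(timeout)


def wait_for_threads(records):
    """Wait for every non-daemon record to finish; daemon ones are skipped."""
    for record in records:
        if not record.is_daemon:
            record.done.wait()


def demo():
    release_daemon = threading.Event()
    worker = ThreadRecord(lambda: time.sleep(0.05))
    daemon = ThreadRecord(release_daemon.wait, is_daemon=True)
    worker.start()
    daemon.start()

    # Several threads join the same worker, as B and C join A above.
    results = []
    joiners = [
        threading.Thread(target=lambda: results.append(worker.join()))
        for _ in range(3)
    ]
    for joiner in joiners:
        joiner.start()
    for joiner in joiners:
        joiner.join()

    # Returns once the worker is done even though the daemon is still blocked.
    wait_for_threads([worker, daemon])
    daemon_still_running = not daemon.done.is_set()

    release_daemon.set()  # let the daemon thread exit cleanly
    return results, daemon_still_running


if __name__ == "__main__":
    print(demo())
```

Because the event is set once and only observed afterwards, no joiner ever holds state that another joiner could see mid-transition, which is the window the lock-based scheme left open.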

[^1]: 441affc9e7/Lib/threading.py (L1201)
[^2]: 441affc9e7/Lib/threading.py (L1115)
[^3]: 8194653279

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
mpage 2024-03-16 05:56:30 -07:00 committed by GitHub
parent 86bc40dd41
commit 33da0e844c
12 changed files with 771 additions and 643 deletions

@@ -304,30 +304,6 @@ PyEvent_WaitTimed(PyEvent *evt, PyTime_t timeout_ns)
     }
 }
 
-_PyEventRc *
-_PyEventRc_New(void)
-{
-    _PyEventRc *erc = (_PyEventRc *)PyMem_RawCalloc(1, sizeof(_PyEventRc));
-    if (erc != NULL) {
-        erc->refcount = 1;
-    }
-    return erc;
-}
-
-void
-_PyEventRc_Incref(_PyEventRc *erc)
-{
-    _Py_atomic_add_ssize(&erc->refcount, 1);
-}
-
-void
-_PyEventRc_Decref(_PyEventRc *erc)
-{
-    if (_Py_atomic_add_ssize(&erc->refcount, -1) == 1) {
-        PyMem_RawFree(erc);
-    }
-}
-
 static int
 unlock_once(_PyOnceFlag *o, int res)
 {

@@ -1032,20 +1032,7 @@ _PyInterpreterState_SetRunningMain(PyInterpreterState *interp)
 void
 _PyInterpreterState_SetNotRunningMain(PyInterpreterState *interp)
 {
-    PyThreadState *tstate = interp->threads.main;
-    assert(tstate == current_fast_get());
-
-    if (tstate->on_delete != NULL) {
-        // The threading module was imported for the first time in this
-        // thread, so it was set as threading._main_thread.  (See gh-75698.)
-        // The thread has finished running the Python program so we mark
-        // the thread object as finished.
-        assert(tstate->_whence != _PyThreadState_WHENCE_THREADING);
-        tstate->on_delete(tstate->on_delete_data);
-        tstate->on_delete = NULL;
-        tstate->on_delete_data = NULL;
-    }
-
+    assert(interp->threads.main == current_fast_get());
     interp->threads.main = NULL;
 }
@@ -1570,16 +1557,6 @@ PyThreadState_Clear(PyThreadState *tstate)
 
     Py_CLEAR(tstate->context);
 
-    if (tstate->on_delete != NULL) {
-        // For the "main" thread of each interpreter, this is meant
-        // to be done in _PyInterpreterState_SetNotRunningMain().
-        // That leaves threads created by the threading module,
-        // and any threads killed by forking.
-        // However, we also accommodate "main" threads that still
-        // don't call _PyInterpreterState_SetNotRunningMain() yet.
-        tstate->on_delete(tstate->on_delete_data);
-    }
-
 #ifdef Py_GIL_DISABLED
     // Each thread should clear own freelists in free-threading builds.
     struct _Py_object_freelists *freelists = _Py_object_freelists_GET();