Commit graph

388 commits

Author SHA1 Message Date
Miss Islington (bot)
49da170709
[3.12] gh-116510: Fix a Crash Due to Shared Immortal Interned Strings (gh-125205)
Fix a crash caused by immortal interned strings being shared between
sub-interpreters that use basic single-phase init. In that case, the string
can be used by an interpreter that outlives the interpreter that created and
interned it. For interpreters that share obmalloc state, also share the
interned dict with the main interpreter.

This is an un-revert of gh-124646 that then addresses the Py_TRACE_REFS
failures identified by gh-124785 (i.e. backporting gh-125709 too).

(cherry picked from commit f2cb399470, AKA gh-124865)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2024-12-03 10:26:25 -07:00
Serhiy Storchaka
f1e7424802
[3.12] gh-109746: Make _thread.start_new_thread delete state of new thread on its startup failure (GH-109761) (GH-127173)
If Python fails to start newly created thread
due to failure of underlying PyThread_start_new_thread() call,
its state should be removed from interpreter' thread states list
to avoid its double cleanup.

(cherry picked from commit ca3ea9ad05)

Co-authored-by: Radislav Chugunov <52372310+chgnrdv@users.noreply.github.com>
2024-11-22 19:56:39 +00:00
Sam Gross
738cf21609
[3.12] gh-119585: Fix crash involving PyGILState_Release() and PyThreadState_Clear() (GH-119753) (#119861)
Make sure that `gilstate_counter` is not zero in when calling
`PyThreadState_Clear()`. A destructor called from `PyThreadState_Clear()` may
call back into `PyGILState_Ensure()` and `PyGILState_Release()`. If
`gilstate_counter` is zero, it will try to create a new thread state before
the current active thread state is destroyed, leading to an assertion failure
or crash.
(cherry picked from commit bcc1be39cb)
2024-05-31 15:42:09 +00:00
Eric Snow
1e1a30f9f4
[3.12] gh-110310: Add a Per-Interpreter XID Registry for Heap Types (gh-110311) (gh-110714)
We do the following:

* add a per-interpreter XID registry (PyInterpreterState.xidregistry)
* put heap types there (keep static types in _PyRuntimeState.xidregistry)
* clear the registries during interpreter/runtime finalization
* avoid duplicate entries in the registry (when _PyCrossInterpreterData_RegisterClass() is called more than once for a type)
* use Py_TYPE() instead of PyObject_Type() in _PyCrossInterpreterData_Lookup()

The per-interpreter registry helps preserve isolation between interpreters.  This is important when heap types are registered, which is something we haven't been doing yet but I will likely do soon.

(cherry-picked from commit 80dc39e1dc)
2023-11-28 02:36:29 +00:00
Eric Snow
0122b4d7c9
[3.12] gh-105716: Support Background Threads in Subinterpreters Consistently (gh-109921) (gh-110707)
The existence of background threads running on a subinterpreter was preventing interpreters from getting properly destroyed, as well as impacting the ability to run the interpreter again. It also affected how we wait for non-daemon threads to finish.

We add PyInterpreterState.threads.main, with some internal C-API functions.

(cherry-picked from commit 1dd9dee45d)
2023-11-27 19:01:05 -07:00
Eric Snow
82ae5a609d
[3.12] gh-109793: Allow Switching Interpreters During Finalization (gh-109794) (gh-110705)
Essentially, we should check the thread ID rather than the thread state pointer.
2023-11-28 00:58:02 +00:00
Eric Snow
313554457e
[3.12] gh-109853: Fix sys.path[0] For Subinterpreters (gh-109994) (gh-110701)
This change makes sure sys.path[0] is set properly for subinterpreters.  Before, it wasn't getting set at all.

This change does not address the broader concerns from gh-109853.

(cherry-picked from commit a040a32ea2)
2023-11-27 22:21:12 +00:00
Eric Snow
592a849fdf
[3.12] gh-76785: Use Pending Calls When Releasing Cross-Interpreter Data (gh-109556) (gh-112288)
This fixes some crashes in the _xxinterpchannels module, due to a race between interpreters.
(cherry picked from commit fd7e08a6f3)
2023-11-27 14:49:48 -07:00
Miss Islington (bot)
2465fe0014
[3.12] GH-110455: Guard assert(tstate->thread_id > 0) with GH-ifndef HAVE_PTHREAD_STUBS (GH-110487) (GH-110491)
GH-110455: Guard `assert(tstate->thread_id > 0)` with `GH-ifndef HAVE_PTHREAD_STUBS` (GH-110487)
(cherry picked from commit 5fd8821cf8)

Co-authored-by: Brett Cannon <brett@python.org>
2023-10-06 23:48:48 +00:00
Victor Stinner
4936fa9541
[3.12] gh-108987: Fix _thread.start_new_thread() race condition (#109135) (#110342)
* gh-108987: Fix _thread.start_new_thread() race condition (#109135)

Fix _thread.start_new_thread() race condition. If a thread is created
during Python finalization, the newly spawned thread now exits
immediately instead of trying to access freed memory and lead to a
crash.

thread_run() calls PyEval_AcquireThread() which checks if the thread
must exit. The problem was that tstate was dereferenced earlier in
_PyThreadState_Bind() which leads to a crash most of the time.

Move _PyThreadState_CheckConsistency() from thread_run() to
_PyThreadState_Bind().

(cherry picked from commit 517cd82ea7)

* gh-109795: `_thread.start_new_thread`: allocate thread bootstate using raw memory allocator (#109808)

(cherry picked from commit 1b8f2366b3)

---------

Co-authored-by: Radislav Chugunov <52372310+chgnrdv@users.noreply.github.com>
2023-10-04 11:20:31 +00:00
Victor Stinner
30748d36b3
[3.12] gh-104690: thread_run() checks for tstate dangling pointer (#109056) (#109133)
gh-104690: thread_run() checks for tstate dangling pointer (#109056)

thread_run() of _threadmodule.c now calls
_PyThreadState_CheckConsistency() to check if tstate is a dangling
pointer when Python is built in debug mode.

Rename ceval_gil.c is_tstate_valid() to
_PyThreadState_CheckConsistency() to reuse it in _threadmodule.c.

(cherry picked from commit f63d37877a)
2023-10-02 16:55:06 +02:00
Eric Snow
aa9707dda9
[3.12] gh-107080: Fix Py_TRACE_REFS Crashes Under Isolated Subinterpreters (#107751)
* Unrevert "[3.12] gh-107080: Fix Py_TRACE_REFS Crashes Under Isolated Subinterpreters (gh-107567) (#107599)".

This reverts commit 6e4eec7606 (gh-107648).

* Initialize each interpreter's refchain properly.

* Skip test_basic_multiple_interpreters_deleted_no_reset on tracerefs builds.
2023-08-16 12:03:05 +02:00
Eric Snow
da151fdc7a
[3.12] gh-105699: Use a _Py_hashtable_t for the PyModuleDef Cache (gh-106974) (gh-107412)
gh-105699: Use a _Py_hashtable_t for the PyModuleDef Cache (gh-106974)

This fixes a crasher due to a race condition, triggered infrequently when two isolated (own GIL) subinterpreters simultaneously initialize their sys or builtins modules.  The crash happened due the combination of the "detached" thread state we were using and the "last holder" logic we use for the GIL.  It turns out it's tricky to use the same thread state for different threads.  Who could have guessed?

We solve the problem by eliminating the one object we were still sharing between interpreters.  We replace it with a low-level hashtable, using the "raw" allocator to avoid tying it to the main interpreter.

We also remove the accommodations for "detached" thread states, which were a dubious idea to start with.

(cherry picked from commit 8ba4df91ae)
2023-07-28 23:16:12 +00:00
Victor Stinner
0d4a76654f
[3.12] GH-103082: Rename PY_MONITORING_EVENTS to _PY_MONITORING_EVENTS (#107069) (#107075)
GH-103082: Rename PY_MONITORING_EVENTS to _PY_MONITORING_EVENTS (#107069)

Rename private C API constants:

* Rename PY_MONITORING_UNGROUPED_EVENTS to _PY_MONITORING_UNGROUPED_EVENTS
* Rename PY_MONITORING_EVENTS to _PY_MONITORING_EVENTS

(cherry picked from commit 0927a2b25c)
2023-07-22 22:20:38 +00:00
Eric Snow
33d3069c45
[3.12] gh-104812: Run Pending Calls in any Thread (gh-104813) (gh-105752)
For a while now, pending calls only run in the main thread (in the main interpreter).  This PR changes things to allow any thread run a pending call, unless the pending call was explicitly added for the main thread to run.
(cherry picked from commit 757b402)
2023-06-14 00:50:08 +00:00
Miss Islington (bot)
77c03a3b72
[3.12] gh-100227: Lock Around Modification of the Global Allocators State (gh-105516) (gh-105532)
The risk of a race with this state is relatively low, but we play it safe anyway. We do avoid using the lock in performance-sensitive cases where the risk of a race is very, very low.
(cherry picked from commit 68dfa49627)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2023-06-08 22:35:53 +00:00
Eric Snow
b08ea9a561
[3.12] gh-100227: Lock Around Adding Global Audit Hooks (gh-105515) (gh-105525)
The risk of a race with this state is relatively low, but we play it safe anyway.
(cherry picked from commit e822a676f1)
2023-06-08 21:05:47 +00:00
Miss Islington (bot)
2ad2bd8b14
[3.12] gh-100227: Lock Around Use of the Global "atexit" State (gh-105514) (gh-105517)
The risk of a race with this state is relatively low, but we play it safe anyway.
(cherry picked from commit 7799c8e678)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2023-06-08 19:27:44 +00:00
Miss Islington (bot)
d2be5c73ed
[3.12] gh-104341: Call _PyEval_ReleaseLock() with NULL When Finalizing the Current Thread (gh-105109) (gh-105209)
This avoids the problematic race in drop_gil() by skipping the FORCE_SWITCHING code there for finalizing threads.

(The idea for this approach came out of discussions with @markshannon.)
(cherry picked from commit 3698fda)

Co-authored-by: Eric Snow ericsnowcurrently@gmail.com
2023-06-01 22:50:28 +00:00
Eric Snow
26baa747c2
gh-104341: Adjust tstate_must_exit() to Respect Interpreter Finalization (gh-104437)
With the move to a per-interpreter GIL, this check slipped through the cracks.
2023-05-15 13:59:26 -06:00
Eric Snow
5c9ee498c6
gh-99113: A Per-Interpreter GIL! (gh-104210)
This is the culmination of PEP 684 (and of my 8-year long multi-core Python project)!

Each subinterpreter may now be created with its own GIL (via Py_NewInterpreterFromConfig()).  If not so configured then the interpreter will share with the main interpreter--the status quo since subinterpreters were added decades ago.  The main interpreter always has its own GIL and subinterpreters from Py_NewInterpreter() will always share with the main interpreter.
2023-05-08 13:15:09 -06:00
Eric Snow
92d8bfffbf
gh-99113: Make Sure the GIL is Acquired at the Right Places (gh-104208)
This is a pre-requisite for a per-interpreter GIL.  Without it this change isn't strictly necessary.  However, there is no real downside otherwise.
2023-05-06 15:59:30 -06:00
Eric Snow
55671fe047
gh-99113: Share the GIL via PyInterpreterState.ceval.gil (gh-104203)
In preparation for a per-interpreter GIL, we add PyInterpreterState.ceval.gil, set it to the shared GIL for each interpreter, and use that rather than using _PyRuntime.ceval.gil directly.  Note that _PyRuntime.ceval.gil is still the actual GIL.
2023-05-05 13:23:00 -06:00
Victor Stinner
45398ad512
gh-103323: Remove PyRuntimeState_GetThreadState() (#104171)
This function no longer makes sense, since its runtime parameter is
no longer used. Use directly _PyThreadState_GET() and
_PyInterpreterState_GET() instead.
2023-05-04 16:21:01 +02:00
Mark Shannon
738c226786
GH-103082: Code cleanup in instrumentation code (#103474) 2023-04-29 04:51:55 +00:00
Eric Snow
d8627999d8
gh-100227: Add a Granular Lock for _PyRuntime.imports.extensions.dict (gh-103460)
The lock is unnecessary as long as there's a GIL, but completely
necessary with a per-interpreter GIL.
2023-04-24 21:09:35 -06:00
Eric Snow
df3173d28e
gh-101659: Isolate "obmalloc" State to Each Interpreter (gh-101660)
This is strictly about moving the "obmalloc" runtime state from
`_PyRuntimeState` to `PyInterpreterState`.  Doing so improves isolation
between interpreters, specifically most of the memory (incl. objects)
allocated for each interpreter's use.  This is important for a
per-interpreter GIL, but such isolation is valuable even without it.

FWIW, a per-interpreter obmalloc is the proverbial
canary-in-the-coalmine when it comes to the isolation of objects between
interpreters.  Any object that leaks (unintentionally) to another
interpreter is highly likely to cause a crash (on debug builds at
least).  That's a useful thing to know, relative to interpreter
isolation.
2023-04-24 17:23:57 -06:00
Eric Snow
f8abfa3314
gh-103323: Get the "Current" Thread State from a Thread-Local Variable (gh-103324)
We replace _PyRuntime.tstate_current with a thread-local variable. As part of this change, we add a _Py_thread_local macro in pyport.h (only for the core runtime) to smooth out the compiler differences. The main motivation here is in support of a per-interpreter GIL, but this change also provides some performance improvement opportunities.

Note that we do not provide a fallback to the thread-local, either falling back to the old tstate_current or to thread-specific storage (PyThread_tss_*()). If that proves problematic then we can circle back. I consider it unlikely, but will run the buildbots to double-check.

Also note that this does not change any of the code related to the GILState API, where it uses a thread state stored in thread-specific storage. I suspect we can combine that with _Py_tss_tstate (from here). However, that can be addressed separately and is not urgent (nor critical).

(While this change was mostly done independently, I did take some inspiration from earlier (~2020) work by @markshannon (main...markshannon:threadstate_in_tls) and @vstinner (#23976).)
2023-04-24 11:17:02 -06:00
Mark Shannon
411b169281
GH-103082: Implementation of PEP 669: Low Impact Monitoring for CPython (GH-103083)
* The majority of the monitoring code is in instrumentation.c

* The new instrumentation bytecodes are in bytecodes.c

* legacy_tracing.c adapts the new API to the old sys.setrace and sys.setprofile APIs
2023-04-12 12:04:55 +01:00
Irit Katriel
78b763f630
gh-103176: sys._current_exceptions() returns mapping to exception instances instead of exc_info tuples (#103177) 2023-04-11 09:38:37 +01:00
Eric Snow
52e9b389a8
gh-100227: Use an Array for _PyRuntime's Set of Locks During Init (gh-103315)
This cleans things up a bit and simplifies adding new granular global locks.
2023-04-06 12:00:49 -06:00
Eric Snow
dcd6f226d6
gh-100227: Make the Global PyModuleDef Cache Safe for Isolated Interpreters (gh-103084)
Sharing mutable (or non-immortal) objects between interpreters is generally not safe.  We can work around that but not easily. 
 There are two restrictions that are critical for objects that break interpreter isolation.

The first is that the object's state be guarded by a global lock.  For now the GIL meets this requirement, but a granular global lock is needed once we have a per-interpreter GIL.

The second restriction is that the object (and, for a container, its items) be deallocated/resized only when the interpreter in which it was allocated is the current one.  This is because every interpreter has (or will have, see gh-101660) its own object allocator.  Deallocating an object with a different allocator can cause crashes.

The dict for the cache of module defs is completely internal, which simplifies what we have to do to meet those requirements.  To do so, we do the following:

* add a mechanism for re-using a temporary thread state tied to the main interpreter in an arbitrary thread
   * add _PyRuntime.imports.extensions.main_tstate` 
   * add _PyThreadState_InitDetached() and _PyThreadState_ClearDetached() (pystate.c)
   * add _PyThreadState_BindDetached() and _PyThreadState_UnbindDetached() (pystate.c)
* make sure the cache dict (_PyRuntime.imports.extensions.dict) and its items are all owned by the main interpreter)
* add a placeholder using for a granular global lock

Note that the cache is only used for legacy extension modules and not for multi-phase init modules.

https://github.com/python/cpython/issues/100227
2023-03-29 17:15:43 -06:00
Eric Snow
89e67ada69
gh-100227: Revert gh-102925 "gh-100227: Make the Global Interned Dict Safe for Isolated Interpreters" (gh-103063)
This reverts commit 87be8d9.

This approach to keeping the interned strings safe is turning out to be too complex for my taste (due to obmalloc isolation). For now I'm going with the simpler solution, making the dict per-interpreter. We can revisit that later if we want a sharing solution.
2023-03-27 16:53:05 -06:00
Eric Snow
87be8d9522
gh-100227: Make the Global Interned Dict Safe for Isolated Interpreters (gh-102925)
This is effectively two changes.  The first (the bulk of the change) is where we add _Py_AddToGlobalDict() (and _PyRuntime.cached_objects.main_tstate, etc.).  The second (much smaller) change is where we update PyUnicode_InternInPlace() to use _Py_AddToGlobalDict() instead of calling PyDict_SetDefault() directly.

Basically, _Py_AddToGlobalDict() is a wrapper around PyDict_SetDefault() that should be used whenever we need to add a value to a runtime-global dict object (in the few cases where we are leaving the container global rather than moving it to PyInterpreterState, e.g. the interned strings dict).  _Py_AddToGlobalDict() does all the necessary work to make sure the target global dict is shared safely between isolated interpreters.  This is especially important as we move the obmalloc state to each interpreter (gh-101660), as well as, potentially, the GIL (PEP 684).

https://github.com/python/cpython/issues/100227
2023-03-22 18:30:04 -06:00
Eric Snow
743687434c
gh-102304: Move the Total Refcount to PyInterpreterState (gh-102545)
Moving it valuable with a per-interpreter GIL.  However, it is also useful without one, since it allows us to identify refleaks within a single interpreter or where references are escaping an interpreter.  This becomes more important as we move the obmalloc state to PyInterpreterState.

https://github.com/python/cpython/issues/102304
2023-03-21 11:46:09 -06:00
Eric Snow
5c75b7a91c
gh-102304: Fix Non-Debug Builds (gh-102846)
Some debug-only code slipped in with gh-102543.

https://github.com/python/cpython/issues/102304
2023-03-20 11:28:13 -06:00
Eric Snow
ad77d16a62
gh-102304: Move _Py_RefTotal to _PyRuntimeState (gh-102543)
The essentially eliminates the global variable, with the associated benefits. This is also a precursor to isolating this bit of state to PyInterpreterState.

Folks that currently read _Py_RefTotal directly would have to start using _Py_GetGlobalRefTotal() instead.

https://github.com/python/cpython/issues/102304
2023-03-20 10:03:04 -06:00
Eric Snow
cdb21ba74d
gh-102660: Handle m_copy Specially for the sys and builtins Modules (gh-102661)
It doesn't make sense to use multi-phase init for these modules. Using a per-interpreter "m_copy" (instead of PyModuleDef.m_base.m_copy) makes this work okay. (This came up while working on gh-101660.)

Note that we might instead end up disallowing re-load for sys/builtins since they are so special.

https://github.com/python/cpython/issues/102660
2023-03-14 14:01:35 -06:00
Eric Snow
f300a1fa4c
gh-100227: Move the dtoa State to PyInterpreterState (gh-102331)
https://github.com/python/cpython/issues/100227
2023-02-28 13:14:40 -07:00
Kumar Aditya
5f11478ce7
GH-102126: fix deadlock at shutdown when clearing thread states (#102222) 2023-02-25 12:21:36 +05:30
Eric Snow
3dea4ba6c1
gh-101758: Fix the wasm Buildbots (gh-101943)
They were broken by gh-101920.

https://github.com/python/cpython/issues/101758
2023-02-15 17:54:05 -07:00
Eric Snow
b2fc549278
gh-101758: Clean Up Uses of Import State (gh-101919)
This change is almost entirely moving code around and hiding import state behind internal API.  We introduce no changes to behavior, nor to non-internal API.  (Since there was already going to be a lot of churn, I took this as an opportunity to re-organize import.c into topically-grouped sections of code.)  The motivation is to simplify a number of upcoming changes.

Specific changes:

* move existing import-related code to import.c, wherever possible
* add internal API for interacting with import state (both global and per-interpreter)
* use only API outside of import.c (to limit churn there when changing the location, etc.)
* consolidate the import-related state of PyInterpreterState into a single struct field (this changes layout slightly)
* add macros for import state in import.c (to simplify changing the location)
* group code in import.c into sections
*remove _PyState_AddModule()

https://github.com/python/cpython/issues/101758
2023-02-15 15:32:31 -07:00
Mark Shannon
feec49c407
GH-101578: Normalize the current exception (GH-101607)
* Make sure that the current exception is always normalized.

* Remove redundant type and traceback fields for the current exception.

* Add new API functions: PyErr_GetRaisedException, PyErr_SetRaisedException

* Add new API functions: PyException_GetArgs, PyException_SetArgs
2023-02-08 09:31:12 +00:00
Eric Snow
132b3f8302
gh-59956: Partial Fix for GILState API Compatibility with Subinterpreters (gh-101431)
The GILState API (PEP 311) implementation from 2003 made the assumption that only one thread state would ever be used for any given OS thread, explicitly disregarding the case of subinterpreters.  However, PyThreadState_Swap() still facilitated switching between subinterpreters, meaning the "current" thread state (holding the GIL), and the GILState thread state could end up out of sync, causing problems (including crashes).

This change addresses the issue by keeping the two in sync in PyThreadState_Swap().  I verified the fix against gh-99040.

Note that the other GILState-subinterpreter incompatibility (with autoInterpreterState) is not resolved here.

https://github.com/python/cpython/issues/59956
2023-02-06 14:39:25 -07:00
Eric Snow
e11fc032a7
gh-59956: Clarify Runtime State Status Expectations (gh-101308)
A PyThreadState can be in one of many states in its lifecycle, represented by some status value.  Those statuses haven't been particularly clear, so we're addressing that here.  Specifically:

* made the distinct lifecycle statuses clear on PyThreadState
* identified expectations of how various lifecycle-related functions relate to status
* noted the various places where those expectations don't match the actual behavior

At some point we'll need to address the mismatches.

(This change also includes some cleanup.)

https://github.com/python/cpython/issues/59956
2023-01-30 12:07:48 -07:00
Виталий Дмитриев
37f15a5efa
Fix typos in pystate.c file (#101348) 2023-01-26 15:04:11 -08:00
Eric Snow
7b20a0f55a
gh-59956: Allow the "Trashcan" Mechanism to Work Without a Thread State (gh-101209)
We've factored out a struct from the two PyThreadState fields. This accomplishes two things:

* make it clear that the trashcan-related code doesn't need any other parts of PyThreadState
* allows us to use the trashcan mechanism even when there isn't a "current" thread state

We still expect the caller to hold the GIL.

https://github.com/python/cpython/issues/59956
2023-01-23 08:30:20 -07:00
Nikita Sobolev
8be6992620
gh-101181: Fix unused-variable warning in pystate.c (#101188)
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
2023-01-20 23:31:30 +05:30
Eric Snow
f30c94024f
gh-59956: Fix Function Groupings in pystate.c (gh-101172)
This is a follow-up to gh-101161.  The objective is to make it easier to read Python/pystate.c by grouping the functions there in a consistent way.  This exclusively involves moving code around and adding various kinds of comments.

https://github.com/python/cpython/issues/59956
2023-01-19 17:23:53 -07:00
Eric Snow
6036c3e856
gh-59956: Clarify GILState-related Code (gh-101161)
The objective of this change is to help make the GILState-related code easier to understand.  This mostly involves moving code around and some semantically equivalent refactors.  However, there are a also a small number of slight changes in structure and behavior:

* tstate_current is moved out of _PyRuntimeState.gilstate
* autoTSSkey is moved out of _PyRuntimeState.gilstate
* autoTSSkey is initialized earlier
* autoTSSkey is re-initialized (after fork) earlier

https://github.com/python/cpython/issues/59956
2023-01-19 16:04:14 -07:00