Fix a crash caused by immortal interned strings being shared between
sub-interpreters that use basic single-phase init. In that case, the string
can be used by an interpreter that outlives the interpreter that created and
interned it. For interpreters that share obmalloc state, also share the
interned dict with the main interpreter.
This is an un-revert of gh-124646 that then addresses the Py_TRACE_REFS
failures identified by gh-124785 (i.e. backporting gh-125709 too).
(cherry picked from commit f2cb399470, AKA gh-124865)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.
* Allow interned strings to be mortal, and fix related issues (GH-120520)
* Add an InternalDocs file describing how interning should work and how to use it.
* Add internal functions to *explicitly* request what kind of interning is done:
- `_PyUnicode_InternMortal`
- `_PyUnicode_InternImmortal`
- `_PyUnicode_InternStatic`
* Switch uses of `PyUnicode_InternInPlace` to those.
* Disallow using `_Py_SetImmortal` on strings directly.
You should use `_PyUnicode_InternImmortal` instead:
- Strings should be interned before immortalization, otherwise you're possibly
interning a immortalizing copy.
- `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
`SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
backports, as they are now part of public API and version-specific ABI.
* Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.
Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
- `_Py_ID`
- `_Py_STR` (including the empty string)
- one-character latin-1 singletons
Now, when you intern a singleton, that exact singleton will be interned.
* Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).
* Intern `_Py_STR` singletons at startup.
* Beef up the tests. Cover internal details (marked with `@cpython_only`).
* Add lots of assertions
* Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364)
* Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs
* Document immortality in some functions that take `const char *`
This is PyUnicode_InternFromString;
PyDict_SetItemString, PyObject_SetAttrString;
PyObject_DelAttrString; PyUnicode_InternFromString;
and the PyModule_Add convenience functions.
Always point out a non-immortalizing alternative.
* Don't immortalize user-provided attr names in _ctypes
* Immortalize names in code objects to avoid crash (GH-121903)
* Intern latin-1 one-byte strings at startup (GH-122303)
There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
gh-107080: Fix Py_TRACE_REFS Crashes Under Isolated Subinterpreters (gh-107567)
The linked list of objects was a global variable, which broke isolation between interpreters, causing crashes. To solve this, we've moved the linked list to each interpreter.
(cherry picked from commit 58ef741867)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
The _xxsubinterpreters module was meant to only use public API. Some internal C-API usage snuck in over the last few years (e.g. gh-28969). This fixes that.
(cherry picked from commit e6373c0d8b)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
gh-105340: include hidden fast-locals in locals() (GH-105715)
* gh-105340: include hidden fast-locals in locals()
(cherry picked from commit 104d7b760f)
Co-authored-by: Carl Meyer <carl@oddbird.net>
This implements PEP 695, Type Parameter Syntax. It adds support for:
- Generic functions (def func[T](): ...)
- Generic classes (class X[T](): ...)
- Type aliases (type X = ...)
- New scoping when the new syntax is used within a class body
- Compiler and interpreter changes to support the new syntax and scoping rules
Co-authored-by: Marc Mueller <30130371+cdce8p@users.noreply.github.com>
Co-authored-by: Eric Traut <eric@traut.com>
Co-authored-by: Larry Hastings <larry@hastings.org>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
This change has two small parts:
1. a follow-up to gh-103940 with one case I missed
2. adding a missing return that I noticed while working on related code
This is strictly about moving the "obmalloc" runtime state from
`_PyRuntimeState` to `PyInterpreterState`. Doing so improves isolation
between interpreters, specifically most of the memory (incl. objects)
allocated for each interpreter's use. This is important for a
per-interpreter GIL, but such isolation is valuable even without it.
FWIW, a per-interpreter obmalloc is the proverbial
canary-in-the-coalmine when it comes to the isolation of objects between
interpreters. Any object that leaks (unintentionally) to another
interpreter is highly likely to cause a crash (on debug builds at
least). That's a useful thing to know, relative to interpreter
isolation.
This is the implementation of PEP683
Motivation:
The PR introduces the ability to immortalize instances in CPython which bypasses reference counting. Tagging objects as immortal allows up to skip certain operations when we know that the object will be around for the entire execution of the runtime.
Note that this by itself will bring a performance regression to the runtime due to the extra reference count checks. However, this brings the ability of having truly immutable objects that are useful in other contexts such as immutable data sharing between sub-interpreters.
* The majority of the monitoring code is in instrumentation.c
* The new instrumentation bytecodes are in bytecodes.c
* legacy_tracing.c adapts the new API to the old sys.setrace and sys.setprofile APIs
Moving it valuable with a per-interpreter GIL. However, it is also useful without one, since it allows us to identify refleaks within a single interpreter or where references are escaping an interpreter. This becomes more important as we move the obmalloc state to PyInterpreterState.
https://github.com/python/cpython/issues/102304
The essentially eliminates the global variable, with the associated benefits. This is also a precursor to isolating this bit of state to PyInterpreterState.
Folks that currently read _Py_RefTotal directly would have to start using _Py_GetGlobalRefTotal() instead.
https://github.com/python/cpython/issues/102304
When __getattr__ is defined, python with try to find an attribute using _PyObject_GenericGetAttrWithDict
find nothing is reasonable so we don't need an exception, it will hurt performance.
* Make sure that the current exception is always normalized.
* Remove redundant type and traceback fields for the current exception.
* Add new API functions: PyErr_GetRaisedException, PyErr_SetRaisedException
* Add new API functions: PyException_GetArgs, PyException_SetArgs
We've factored out a struct from the two PyThreadState fields. This accomplishes two things:
* make it clear that the trashcan-related code doesn't need any other parts of PyThreadState
* allows us to use the trashcan mechanism even when there isn't a "current" thread state
We still expect the caller to hold the GIL.
https://github.com/python/cpython/issues/59956
* code_sizeof() now uses an unsigned type (size_t) to compute the result.
* Fix _PyObject_ComputedDictPointer(): cast _PyObject_VAR_SIZE() to
Py_ssize_t, rather than long: it's a different type on 64-bit Windows.
* Clarify that _PyObject_VAR_SIZE() uses an unsigned type (size_t).
Work on test coverage for `PyObject_Print` made it clear that some lines can't get executed.
Simplify the function by excluding the checks for non-string types.
Also eliminate creating a temporary bytes object.
* Make sure that tp_dictoffset is correct with Py_TPFLAGS_MANAGED_DICT is set.
* Avoid traversing managed dict twice when subclassing class with Py_TPFLAGS_MANAGED_DICT set.
This is the first of several precursors to storing tp_subclasses (and tp_weaklist) on the interpreter state for static builtin types.
We do the following:
* add `_PyStaticType_InitBuiltin()`
* add `_Py_TPFLAGS_STATIC_BUILTIN`
* set it on all static builtin types in `_PyStaticType_InitBuiltin()`
* shuffle some code around to be able to use _PyStaticType_InitBuiltin()
* rename `_PyStructSequence_InitType()` to `_PyStructSequence_InitBuiltinWithFlags()`
* add `_PyStructSequence_InitBuiltin()`.
Move the follow functions and type from frameobject.h to pyframe.h,
so the standard <Python.h> provide frame getter functions:
* PyFrame_Check()
* PyFrame_GetBack()
* PyFrame_GetBuiltins()
* PyFrame_GetGenerator()
* PyFrame_GetGlobals()
* PyFrame_GetLasti()
* PyFrame_GetLocals()
* PyFrame_Type
Remove #include "frameobject.h" from many C files. It's no longer
needed.
Python now always use the ``%zu`` and ``%zd`` printf formats to
format a size_t or Py_ssize_t number. Building Python 3.12 requires a
C11 compiler, so these printf formats are now always supported.
* PyObject_Print() and _PyObject_Dump() now use the printf %zd format
to display an object reference count.
* Update PY_FORMAT_SIZE_T comment.
* Remove outdated notes about the %zd format in PyBytes_FromFormat()
and PyUnicode_FromFormat() documentations.
* configure no longer checks for the %zd format and no longer defines
PY_FORMAT_SIZE_T macro in pyconfig.h.
* pymacconfig.h no longer undefines PY_FORMAT_SIZE_T: macOS 10.4 is
no longer supported. Python 3.12 now requires macOS 10.6 (Snow
Leopard) or newer.
Convert the following macros to static inline functions:
* _Py_AS_GC()
* _PyGCHead_FINALIZED(), _PyGCHead_SET_FINALIZED()
* _PyGCHead_NEXT(), _PyGCHead_SET_NEXT()
* _PyGCHead_PREV(), _PyGCHead_SET_PREV()
* _PyGC_FINALIZED(), _PyGC_SET_FINALIZED()
* _PyObject_GC_IS_TRACKED()
* _PyObject_GC_MAY_BE_TRACKED()
Add a macro wrapping the _PyObject_GC_IS_TRACKED() function to cast
the argument to PyObject*.
Currently, calling Py_EnterRecursiveCall() and
Py_LeaveRecursiveCall() may use a function call or a static inline
function call, depending if the internal pycore_ceval.h header file
is included or not. Use a different name for the static inline
function to ensure that the static inline function is always used in
Python internals for best performance. Similar approach than
PyThreadState_GET() (function call) and _PyThreadState_GET() (static
inline function).
* Rename _Py_EnterRecursiveCall() to _Py_EnterRecursiveCallTstate()
* Rename _Py_LeaveRecursiveCall() to _Py_LeaveRecursiveCallTstate()
* pycore_ceval.h: Rename Py_EnterRecursiveCall() to
_Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() and
_Py_LeaveRecursiveCall()