cpython/Python
Petr Viktorin 49f6beb56a
[3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065)
This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.
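
As a rough Python-level illustration of what interning provides (a minimal sketch, not part of this change; the helper `build_name` and the sample string are made up for the example): `sys.intern` returns one canonical object for equal strings. With mortal interning, that canonical object can be freed once nothing references it any more.

```python
import sys


def build_name(parts):
    # Hypothetical helper: builds the string at runtime so the result is
    # not a compile-time constant.
    return "_".join(parts)


first = build_name(["interned", "example"])
second = build_name(["interned", "example"])

# sys.intern() returns the canonical object for a given string value, so
# interning two equal strings yields the very same object.
canonical_first = sys.intern(first)
canonical_second = sys.intern(second)

print(canonical_first is canonical_second)  # prints True
```

Whether and when the interned object is actually freed is an implementation detail that this snippet does not try to observe.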

* Allow interned strings to be mortal, and fix related issues (GH-120520)

  * Add an InternalDocs file describing how interning should work and how to use it.

  * Add internal functions to *explicitly* request what kind of interning is done:
    - `_PyUnicode_InternMortal`
    - `_PyUnicode_InternImmortal`
    - `_PyUnicode_InternStatic`

  * Switch uses of `PyUnicode_InternInPlace` to those.

  * Disallow using `_Py_SetImmortal` on strings directly.
    You should use `_PyUnicode_InternImmortal` instead:
    - Strings should be interned before immortalization; otherwise you may
      end up immortalizing a copy rather than the interned string.
    - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
      `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
      backports, as they are now part of public API and version-specific ABI.

  * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

  * Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
    - `_Py_ID`
    - `_Py_STR` (including the empty string)
    - one-character latin-1 singletons

    Now, when you intern a singleton, that exact singleton will be interned.

  * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

  * Intern `_Py_STR` singletons at startup.

  * Beef up the tests. Cover internal details (marked with `@cpython_only`).

  * Add lots of assertions

* Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364)

  * Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

  * Document immortality in some functions that take `const char *`

    These are PyUnicode_InternFromString,
    PyDict_SetItemString, PyObject_SetAttrString,
    PyObject_DelAttrString,
    and the PyModule_Add convenience functions.

    Always point out a non-immortalizing alternative.

  * Don't immortalize user-provided attr names in _ctypes

* Immortalize names in code objects to avoid crash (GH-121903)

* Intern latin-1 one-byte strings at startup (GH-122303)

There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2024-09-27 13:28:48 -07:00
clinic [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
deepfreeze
frozen_modules
_warnings.c
adaptive.md
asdl.c
asm_trampoline.S
assemble.c
ast.c [3.12] GH-112215: Backport C recursion changes (GH-115083) 2024-02-13 10:45:59 +01:00
ast_opt.c [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
ast_unparse.c [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
bltinmodule.c [3.12] gh-121153: Fix some errors with use of _PyLong_CompactValue() (GH-121154) 2024-07-17 07:58:25 +00:00
bootstrap_hash.c
bytecodes.c [3.12] gh-123083: Fix a potential use-after-free in ``STORE_ATTR_WITH… (#123237) 2024-08-23 01:37:40 +09:00
ceval.c [3.12] gh-112716: Fix SystemError when __builtins__ is not a dict (GH-112770) (GH-113103) 2023-12-14 12:54:25 +00:00
ceval_gil.c [3.12] gh-108987: Fix _thread.start_new_thread() race condition (#109135) (#110342) 2023-10-04 11:20:31 +00:00
ceval_macros.h
codecs.c [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
compile.c [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
condvar.h
context.c [3.12] gh-120811: Fix reference leak upon _PyContext_Exit failure (GH-120812) (#120844) 2024-06-22 16:44:31 +05:30
dtoa.c [3.12] gh-91565: Replace bugs.python.org links with Devguide/GitHub ones (GH-91568) (GH-117890) 2024-04-15 12:59:34 +00:00
dup2.c
dynamic_annotations.c
dynload_hpux.c
dynload_shlib.c
dynload_stub.c
dynload_win.c
emscripten_signal.c
errors.c [3.12] gh-124188: Fix PyErr_ProgramTextObject() (GH-124189) (GH-124426) 2024-09-24 08:53:54 +00:00
fileutils.c gh-111856: Fix os.fstat on windows with FAT32 and exFAT filesystem (GH-112038) 2023-11-13 16:25:01 +00:00
flowgraph.c [3.12] gh-113297: Fix segfault in compiler for with statement with 19 context managers (#113327) (#113404) 2023-12-23 13:29:11 +00:00
formatter_unicode.c
frame.c [3.12] gh-119897: Revert buggy optimization which was removed in 3.13 (#120467) 2024-06-18 10:45:40 +01:00
frozen.c [3.12] gh-106560: Fix redundant declarations in Python/frozen.c (#112612) (#112651) 2023-12-03 11:54:59 +00:00
frozenmain.c
future.c
generated_cases.c.h [3.12] gh-123083: Fix a potential use-after-free in ``STORE_ATTR_WITH… (#123237) 2024-08-23 01:37:40 +09:00
getargs.c [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
getcompiler.c
getcopyright.c
getopt.c
getplatform.c
getversion.c
hamt.c
hashtable.c [3.12] gh-106931: Intern Statically Allocated Strings Globally (gh-107272) (gh-110713) 2023-11-27 23:51:12 +00:00
import.c [3.12] gh-114685: Fix incorrect use of PyBUF_READ in import.c (GH-114686) (GH-114700) 2024-01-29 10:09:51 +00:00
importdl.c
importdl.h
initconfig.c [3.12] gh-90300: Remove reference to PYTHON_FROZEN_MODULES in Python CLI help (GH-117035) 2024-03-19 20:05:08 +00:00
instrumentation.c [3.12] gh-109371: Fix monitoring with instruction events set (gh-109385) (#109542) 2023-09-18 17:40:51 +02:00
intrinsics.c
legacy_tracing.c [3.12] gh-122029: Log call events in sys.setprofile when it's a method with c function (GH-122072) (GH-122206) 2024-07-23 22:44:43 +00:00
makeopcodetargets.py
marshal.c [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
modsupport.c
mysnprintf.c
mystrtoul.c
opcode_metadata.h
opcode_targets.h
pathconfig.c
perf_trampoline.c [3.12] gh-113343: Fix error check on mmap(2) (GH-113342) (#113374) 2023-12-21 19:44:15 +00:00
preconfig.c
pyarena.c
pyctype.c
pyfpe.c
pyhash.c
pylifecycle.c [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
pymath.c
pystate.c [3.12] gh-119585: Fix crash involving PyGILState_Release() and PyThreadState_Clear() (GH-119753) (#119861) 2024-05-31 15:42:09 +00:00
pystrcmp.c
pystrhex.c
pystrtod.c
Python-ast.c [3.12] GH-112215: Backport C recursion changes (GH-115083) 2024-02-13 10:45:59 +01:00
Python-tokenize.c [3.12] gh-120343: Fix column offsets of multiline tokens in tokenize (GH-120391) (#120428) 2024-06-12 19:10:35 +00:00
pythonrun.c [3.12] gh-113358: Fix rendering tracebacks with exceptions with a broken __getattr__ (GH-113359) (#114173) 2024-01-21 17:12:17 +00:00
pytime.c
README
specialize.c [3.12] Check for valid tp_version_tag in specializer (gh-89811) (gh-114216) 2024-01-20 04:45:33 +08:00
stdlib_module_names.h [3.12] gh-123892: Add "_wmi" to sys.stdlib_module_names (GH-123893) (#123897) 2024-09-10 10:11:56 +00:00
structmember.c [3.12] gh-115011: Improve support of __index__() in setters of members with unsigned integer type (GH-115029) (GH-115294) 2024-02-11 11:56:17 +00:00
suggestions.c
symtable.c [3.12] gh-119666: fix multiple class-scope comprehensions referencing __class__ (GH-120295) (#120300) 2024-06-10 00:37:15 -04:00
sysmodule.c [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
thread.c
thread_nt.h
thread_pthread.h [3.12] gh-112536: Add TSAN build on Github Actions (GH-116872) 2024-03-18 09:52:54 +00:00
thread_pthread_stubs.h
traceback.c [3.12] gh-109181: Fix refleak in tb_get_lineno() (#111948) 2023-11-10 14:07:45 +01:00
tracemalloc.c [3.12] gh-121390: tracemalloc: Fix tracebacks memory leak (GH-121391) (#121393) 2024-07-05 06:59:06 +00:00

Miscellaneous source files for the main Python shared library