cpython/Tools/build
Petr Viktorin 49f6beb56a
[3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065)
This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.
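
For orientation, here is a minimal sketch of what interning looks like from Python code, assuming CPython and the documented `sys.intern` function. The `build` helper and the sample string are hypothetical and exist only for illustration; the sketch does not show the C-level changes in this commit.

```python
import sys


def build(parts):
    # Join at run time so the result is a freshly created string object
    # rather than a constant folded by the compiler.
    return "".join(parts)


a = build(["gh", "-", "113993", "-demo"])
b = build(["gh", "-", "113993", "-demo"])

# sys.intern returns the canonical object for a given string value.
ia = sys.intern(a)
ib = sys.intern(b)

# Both calls yield the same object, so comparisons can use identity.
same_object = ia is ib
print(same_object)
```

With mortal interning, the canonical object is expected to stay alive only as long as something references it, so code that relies on the identity should keep the value returned by `sys.intern`.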

* Allow interned strings to be mortal, and fix related issues (GH-120520)

  * Add an InternalDocs file describing how interning should work and how to use it.

  * Add internal functions to *explicitly* request what kind of interning is done:
    - `_PyUnicode_InternMortal`
    - `_PyUnicode_InternImmortal`
    - `_PyUnicode_InternStatic`

  * Switch uses of `PyUnicode_InternInPlace` to those.

  * Disallow using `_Py_SetImmortal` on strings directly.
    You should use `_PyUnicode_InternImmortal` instead:
    - Strings should be interned before immortalization, otherwise you're possibly
      interning an immortalized copy.
    - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
      `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
      backports, as they are now part of public API and version-specific ABI.

  * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

  * Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
    - `_Py_ID`
    - `_Py_STR` (including the empty string)
    - one-character latin-1 singletons

    Now, when you intern a singleton, that exact singleton will be interned.

  * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

  * Intern `_Py_STR` singletons at startup.

  * Beef up the tests. Cover internal details (marked with `@cpython_only`).

  * Add lots of assertions.

* Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364)

  * Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

  * Document immortality in some functions that take `const char *`

  These are PyUnicode_InternFromString,
  PyDict_SetItemString, PyObject_SetAttrString,
  PyObject_DelAttrString,
  and the PyModule_Add convenience functions.

  Always point out a non-immortalizing alternative.

  * Don't immortalize user-provided attr names in _ctypes

* Immortalize names in code objects to avoid crash (GH-121903)

* Intern latin-1 one-byte strings at startup (GH-122303)

There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)
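
The `_only_immortal` argument mentioned in the list above can be probed from Python. The following is a hedged sketch, not the refleak machinery itself: `interned_counts` is a hypothetical helper, and it assumes that `sys.getunicodeinternedsize` may be missing on older interpreters and that the private keyword may be rejected on builds without this backport.

```python
import sys


def interned_counts():
    """Return (total, immortal_only) interned-string counts, or None.

    Hypothetical helper for illustration. immortal_only is None when the
    running interpreter does not accept the private keyword argument.
    """
    getter = getattr(sys, "getunicodeinternedsize", None)
    if getter is None:
        # Function not available on this interpreter.
        return None
    total = getter()
    try:
        # Private argument described in this commit; may not exist everywhere.
        immortal_only = getter(_only_immortal=True)
    except TypeError:
        immortal_only = None
    return total, immortal_only


counts = interned_counts()
print(counts)
```

Because the argument is private, the sketch treats its absence as a normal case rather than an error.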

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2024-09-27 13:28:48 -07:00
..
| File | Last commit | Date |
|---|---|---|
| check_extension_modules.py | [3.12] gh-123892: Add "_wmi" to sys.stdlib_module_names (GH-123893) (#123897) | 2024-09-10 10:11:56 +00:00 |
| deepfreeze.py | [3.12] gh-106931: Intern Statically Allocated Strings Globally (gh-107272) (gh-110713) | 2023-11-27 23:51:12 +00:00 |
| freeze_modules.py | [3.12] gh-106560: Fix redundant declarations in Python/frozen.c (#112612) (#112651) | 2023-12-03 11:54:59 +00:00 |
| generate_global_objects.py | [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) | 2024-09-27 13:28:48 -07:00 |
| generate_levenshtein_examples.py | gh-99016: Make build scripts compatible with Python 3.8 (GH-99017) | 2022-11-02 20:30:09 +02:00 |
| generate_opcode_h.py | gh-103963: fix 'make regen-opcode' in out-of-tree builds (#104177) | 2023-05-04 17:45:56 +00:00 |
| generate_re_casefix.py | [3.12] Fix syntax in generate_re_casefix.py (GH-122699) (#122722) | 2024-08-06 06:42:27 +00:00 |
| generate_sbom.py | [3.12] gh-123458: Skip SBOM generation if no git repository is detected (GH-123507) (#123615) | 2024-09-03 01:21:40 +02:00 |
| generate_sre_constants.py | | |
| generate_stdlib_module_names.py | gh-98040: Remove just the imp module (#98573) | 2023-04-28 16:17:58 -07:00 |
| generate_token.py | gh-102856: Initial implementation of PEP 701 (#102855) | 2023-04-19 11:18:16 -05:00 |
| parse_html5_entities.py | | |
| regen-configure.sh | [3.12] gh-112088: Run autoreconf in GHA check_generated_files (GH-112090) (#112159) | 2023-11-16 15:55:40 +01:00 |
| smelly.py | | |
| stable_abi.py | [3.12] GH-121970: Rewrite the C-API annotations extension (GH-121985) (#122025) | 2024-07-19 12:48:50 +00:00 |
| umarshal.py | [3.12] Fix the long64 reader in umarshal.py (GH-107828) (#107849) | 2023-08-11 11:59:45 +02:00 |
| update_file.py | | |
| verify_ensurepip_wheels.py | [3.12] gh-109002: Ensure only one wheel for each vendored package (GH-109003) (#109005) | 2023-09-06 20:01:36 +02:00 |