cpython/Tools
Petr Viktorin 49f6beb56a
[3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065)
This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.

* Allow interned strings to be mortal, and fix related issues (GH-120520)

  * Add an InternalDocs file describing how interning should work and how to use it.

  * Add internal functions to *explicitly* request what kind of interning is done:
    - `_PyUnicode_InternMortal`
    - `_PyUnicode_InternImmortal`
    - `_PyUnicode_InternStatic`

  * Switch uses of `PyUnicode_InternInPlace` to those.

  * Disallow using `_Py_SetImmortal` on strings directly.
    You should use `_PyUnicode_InternImmortal` instead:
    - Strings should be interned before immortalization, otherwise you may
      end up immortalizing a copy instead of the interned string.
    - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
      `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
      backports, as they are now part of public API and version-specific ABI.

  * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

  * Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
    - `_Py_ID`
    - `_Py_STR` (including the empty string)
    - one-character latin-1 singletons

    Now, when you intern a singleton, that exact singleton will be interned.

  * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

  * Intern `_Py_STR` singletons at startup.

  * Beef up the tests. Cover internal details (marked with `@cpython_only`).

  * Add lots of assertions

* Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364)

  * Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

  * Document immortality in some functions that take `const char *`

    These are PyUnicode_InternFromString, PyDict_SetItemString,
    PyObject_SetAttrString, PyObject_DelAttrString,
    and the PyModule_Add convenience functions.

    Always point out a non-immortalizing alternative.

  * Don't immortalize user-provided attr names in _ctypes

* Immortalize names in code objects to avoid crash (GH-121903)

* Intern latin-1 one-byte strings at startup (GH-122303)

There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)
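
For readers unfamiliar with interning, the identity guarantee this change preserves can be seen from Python through `sys.intern`. This is a minimal sketch and not part of the backport: the helper name `intern_pair` is hypothetical, and the example shows only that equal strings map to one interned object. It does not show whether that object is mortal or immortal.

```python
import sys


def intern_pair(parts):
    """Build two equal strings at run time and intern both.

    `intern_pair` is a hypothetical helper used only for illustration.
    """
    # Joining at run time avoids relying on compile-time constants,
    # which CPython may already have interned.
    first = "".join(parts)
    second = "".join(parts)
    # sys.intern() returns the canonical object for a given value, so
    # equal strings map to the same interned object.
    return sys.intern(first), sys.intern(second)


a, b = intern_pair(["interned", " ", "example"])
print(a is b)  # True
```

With mortal interning, the expectation is that such a string can be freed once the last reference to it goes away. Plain Python code cannot easily observe that, because `str` objects do not support weak references.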

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2024-09-27 13:28:48 -07:00
build [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
buildbot bpo-41173: Copy test results file from ARM worker before uploading (GH-21305) 2020-07-08 00:24:39 +01:00
c-analyzer [3.12] Fix typos (#123775) (#123867) 2024-09-09 13:22:13 +00:00
cases_generator Remove redundant words from interpreter_definition.md. (GH-103455) 2023-04-11 15:30:05 -05:00
ccbench bpo-43723: Fix deprecation error caused by thread.setDaemon() (GH-25361) 2021-04-12 13:12:36 +02:00
clinic [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
freeze [3.12] gh-65701: document that freeze doesn't work with framework builds on macOS (GH-113352) (#113362) 2023-12-22 11:16:30 +01:00
gdb GH-101291: Rearrange the size bits in PyLongObject (GH-102464) 2023-03-22 14:49:51 +00:00
i18n gh-102507 Remove invisible pagebreak characters (#102531) 2023-03-08 13:58:14 +00:00
importbench [3.12] gh-98040: Fix importbench: use types.ModuleType() (GH-105743) (#105754) 2023-06-13 22:59:02 +00:00
iobench gh-100176: Tools/iobench: Remove redundant compat code for Python <= 3.2 (#100197) 2023-04-08 12:04:47 +03:00
msi gh-117505: Run ensurepip in isolated env in Windows installer (GH-118257) 2024-09-18 15:16:29 +01:00
nuget bpo-41744: Package python.props with correct name in NuGet package (GH-22154) 2020-09-14 20:30:15 +01:00
patchcheck [3.12] gh-109408: Revert pre-commit whitespace checks pending portable solution (GH-110726) (#110730) 2023-10-11 16:37:41 +00:00
peg_generator [3.12] gh-122270: Fix typos in the Py_DEBUG macro name (GH-122271) (GH-122276) 2024-07-25 11:22:42 +00:00
scripts [3.12] gh-116576: Fix Tools/scripts/sortperf.py sorting the same list (GH-116577) (#116582) 2024-03-11 07:15:51 +00:00
ssl [3.12] gh-123700: Update OpenSSL versions in multissltests and CI (GH-123704) 2024-09-04 16:31:28 -05:00
stringbench [codemod] Fix non-matching bracket pairs (GH-28473) 2021-09-22 01:09:00 +02:00
tsan [3.12] gh-112536: Add --tsan test for reasonable TSAN execution times. (gh-116601) (#116929) 2024-03-18 10:22:19 +01:00
tz bpo-29919: Remove unused imports found by pyflakes (#137) 2017-03-27 16:05:26 +02:00
unicode [3.12] Code: Update Donghee Na's name (GH-109744) (#110225) 2023-10-02 17:31:34 +00:00
unittestgui Remove a redundant assignment in Tools/unittestgui/unittestgui.py (GH-21438) 2021-05-16 16:55:06 +01:00
wasm [3.12] GH-116313: get WASI builds to run under wasmtime 18 w/ WASI 0.2/preview2 primitives (GH-116327) (GH-116373) 2024-03-05 13:35:02 -08:00
README gh-102110: Add all tools description missed (GH-102625) 2023-03-30 13:49:07 -07:00
requirements-dev.txt [3.12] Bump mypy to 1.7.1 (#112581) (#112601) 2023-12-01 17:10:38 +00:00
requirements-hypothesis.txt [3.12] build(deps): bump hypothesis from 6.108.10 to 6.111.2 in /Tools (GH-123567) (#123592) 2024-09-02 14:38:42 +03:00

This directory contains a number of Python programs that are useful
while building or extending Python.

build           Directory generated automatically by the build system;
                contains build artifacts and intermediate files.

buildbot        Batch files for running on Windows buildbot workers.

c-analyzer      Tools to check that no new global variables have been added.

cases_generator Tooling to generate interpreters.

ccbench         A Python threads-based concurrency benchmark. (*)

clinic          A preprocessor for CPython C files in order to automate
                the boilerplate involved with writing argument parsing
                code for "builtins".

freeze          Create a stand-alone executable from a Python program.

gdb             Python code to be run inside gdb, to make it easier to
                debug Python itself (by David Malcolm).

i18n            Tools for internationalization. pygettext.py
                parses Python source code and generates .pot files,
                and msgfmt.py generates a binary message catalog
                from a catalog in text format.

importbench     A set of micro-benchmarks for various import scenarios.

iobench         Benchmark for the new Python I/O system. (*)

msi             Support for packaging Python as an MSI package on Windows.

nuget           Files for the NuGet package manager for .NET.

patchcheck      Tools for checking and applying patches to the Python source code
                and verifying the integrity of patch files.

peg_generator   PEG-based parser generator (pegen) used for the new parser.

scripts         A number of useful single-file programs, e.g. tabnanny.py
                by Tim Peters, which checks for inconsistent mixing of
                tabs and spaces, and 2to3, which converts Python 2 code
                to Python 3 code.

ssl             Scripts to generate ssl_data.h from OpenSSL sources, and run
                tests against multiple installations of OpenSSL and LibreSSL.

stringbench     A suite of micro-benchmarks for various operations on
                strings (both 8-bit and unicode). (*)

tz              A script to dump time zone data from /usr/share/zoneinfo.

unicode         Tools for generating unicodedata and codecs from unicode.org
                and other mapping files (by Fredrik Lundh, Marc-Andre Lemburg
                and Martin von Loewis).

unittestgui     A Tkinter-based GUI test runner for unittest, with test
                discovery.

wasm            Config and helpers to facilitate cross compilation of CPython
                to WebAssembly (WASM).

(*) A generic benchmark suite is maintained separately at https://github.com/python/performance

Note: The pynche color editor has moved to https://gitlab.com/warsaw/pynche