cpython/Modules/_io
Petr Viktorin 49f6beb56a
[3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065)
This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.

* Allow interned strings to be mortal, and fix related issues (GH-120520)

  * Add an InternalDocs file describing how interning should work and how to use it.

  * Add internal functions to *explicitly* request what kind of interning is done:
    - `_PyUnicode_InternMortal`
    - `_PyUnicode_InternImmortal`
    - `_PyUnicode_InternStatic`

  * Switch uses of `PyUnicode_InternInPlace` to those.

  * Disallow using `_Py_SetImmortal` on strings directly.
    You should use `_PyUnicode_InternImmortal` instead:
    - Strings should be interned before immortalization, otherwise you're possibly
      interning a immortalizing copy.
    - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
      `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
      backports, as they are now part of public API and version-specific ABI.

  * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

   Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
    - `_Py_ID`
    - `_Py_STR` (including the empty string)
    - one-character latin-1 singletons

    Now, when you intern a singleton, that exact singleton will be interned.

  * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

  * Intern `_Py_STR` singletons at startup.

  * Beef up the tests. Cover internal details (marked with `@cpython_only`).

  * Add lots of assertions

* Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364)

  * Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

  * Document immortality in some functions that take `const char *`

  This is PyUnicode_InternFromString;
  PyDict_SetItemString, PyObject_SetAttrString;
  PyObject_DelAttrString; PyUnicode_InternFromString;
  and the PyModule_Add convenience functions.

  Always point out a non-immortalizing alternative.

  * Don't immortalize user-provided attr names in _ctypes

* Immortalize names in code objects to avoid crash (GH-121903)

* Intern latin-1 one-byte strings at startup (GH-122303)

There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2024-09-27 13:28:48 -07:00
..
clinic [3.12] gh-115015: Argument Clinic: fix generated code for METH_METHOD methods without params (#115016) (#115067) 2024-02-06 11:20:16 +01:00
_iomodule.c gh-101819: Isolate _io (#101948) 2023-05-15 09:26:27 +00:00
_iomodule.h gh-101819: Isolate _io (#101948) 2023-05-15 09:26:27 +00:00
bufferedio.c [3.12] gh-95782: Fix io.BufferedReader.tell() etc. being able to return offsets < 0 (GH-99709) (GH-115599) 2024-02-17 14:56:00 +02:00
bytesio.c [3.12] gh-111049: Fix crash during garbage collection of the BytesIO buffer object (GH-111221) (GH-113096) 2023-12-14 10:28:57 +00:00
fileio.c [3.12] gh-114286: Fix maybe-uninitialized warning in Modules/_io/fileio.c (GH-114287) (GH-114288) 2024-01-19 10:58:09 +00:00
iobase.c [3.12] gh-107801: Improve the accuracy of io.IOBase.seek docs (#108268) (#108655) 2023-08-29 22:19:08 +02:00
stringio.c [3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065) 2024-09-27 13:28:48 -07:00
textio.c [3.12] gh-119506: fix _io.TextIOWrapper.write() write during flush (GH-119507) (#119965) 2024-06-19 10:23:29 +00:00
winconsoleio.c gh-110913: Fix WindowsConsoleIO chunking of UTF-8 text (GH-111007) 2023-10-20 12:37:31 +00:00