[3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065)

This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.

* Allow interned strings to be mortal, and fix related issues (GH-120520)

  * Add an InternalDocs file describing how interning should work and how to use it.

  * Add internal functions to *explicitly* request what kind of interning is done:
    - `_PyUnicode_InternMortal`
    - `_PyUnicode_InternImmortal`
    - `_PyUnicode_InternStatic`

  * Switch uses of `PyUnicode_InternInPlace` to those.

  * Disallow using `_Py_SetImmortal` on strings directly.
    You should use `_PyUnicode_InternImmortal` instead:
    - Strings should be interned before immortalization; otherwise you might be
      immortalizing a redundant copy.
    - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
      `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
      backports, as they are now part of public API and version-specific ABI.

  * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

  * Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
    - `_Py_ID`
    - `_Py_STR` (including the empty string)
    - one-character latin-1 singletons

    Now, when you intern a singleton, that exact singleton will be interned.

  * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

  * Intern `_Py_STR` singletons at startup.

  * Beef up the tests. Cover internal details (marked with `@cpython_only`).

  * Add lots of assertions.

* Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364)

  * Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

  * Document immortality in some functions that take `const char *`

  This applies to PyUnicode_InternFromString,
  PyDict_SetItemString, PyObject_SetAttrString,
  PyObject_DelAttrString,
  and the PyModule_Add convenience functions.

  Always point out a non-immortalizing alternative.

  * Don't immortalize user-provided attr names in _ctypes

* Immortalize names in code objects to avoid crash (GH-121903)

* Intern latin-1 one-byte strings at startup (GH-122303)

There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
Petr Viktorin 2024-09-27 22:28:48 +02:00 committed by GitHub
parent 2fa9ca5070
commit 49f6beb56a
51 changed files with 26040 additions and 27615 deletions

@ -526,6 +526,14 @@ state:
Note that ``Py_XDECREF()`` should be used instead of ``Py_DECREF()`` in
this case, since *obj* can be ``NULL``.
The number of different *name* strings passed to this function
should be kept small, usually by only using statically allocated strings
as *name*.
For names that aren't known at compile time, prefer calling
:c:func:`PyUnicode_FromString` and :c:func:`PyObject_SetAttr` directly.
For more details, see :c:func:`PyUnicode_InternFromString`, which may be
used internally to create a key object.
.. versionadded:: 3.10
@ -590,6 +598,9 @@ state:
used from the module's initialization function.
Return ``-1`` with an exception set on error, ``0`` on success.
This is a convenience function that calls :c:func:`PyLong_FromLong` and
:c:func:`PyModule_AddObjectRef`; see their documentation for details.
.. c:function:: int PyModule_AddStringConstant(PyObject *module, const char *name, const char *value)
@ -598,6 +609,10 @@ state:
``NULL``-terminated.
Return ``-1`` with an exception set on error, ``0`` on success.
This is a convenience function that calls
:c:func:`PyUnicode_InternFromString` and :c:func:`PyModule_AddObjectRef`;
see their documentation for details.
.. c:macro:: PyModule_AddIntMacro(module, macro)

@ -107,6 +107,13 @@ Object Protocol
If *v* is ``NULL``, the attribute is deleted, but this feature is
deprecated in favour of using :c:func:`PyObject_DelAttrString`.
The number of different attribute names passed to this function
should be kept small, usually by using a statically allocated string
as *attr_name*.
For attribute names that aren't known at compile time, prefer calling
:c:func:`PyUnicode_FromString` and :c:func:`PyObject_SetAttr` directly.
For more details, see :c:func:`PyUnicode_InternFromString`, which may be
used internally to create a key object.
.. c:function:: int PyObject_GenericSetAttr(PyObject *o, PyObject *name, PyObject *value)
@ -132,6 +139,14 @@ Object Protocol
specified as a :c:expr:`const char*` UTF-8 encoded bytes string,
rather than a :c:expr:`PyObject*`.
The number of different attribute names passed to this function
should be kept small, usually by using a statically allocated string
as *attr_name*.
For attribute names that aren't known at compile time, prefer calling
:c:func:`PyUnicode_FromString` and :c:func:`PyObject_DelAttr` directly.
For more details, see :c:func:`PyUnicode_InternFromString`, which may be
used internally to create a key object for lookup.
.. c:function:: PyObject* PyObject_GenericGetDict(PyObject *o, void *context)

@ -1461,15 +1461,35 @@ They all return ``NULL`` or ``-1`` if an exception occurs.
existing interned string that is the same as :c:expr:`*p_unicode`, it sets :c:expr:`*p_unicode` to
it (releasing the reference to the old string object and creating a new
:term:`strong reference` to the interned string object), otherwise it leaves
:c:expr:`*p_unicode` alone and interns it (creating a new :term:`strong reference`).
:c:expr:`*p_unicode` alone and interns it.
(Clarification: even though there is a lot of talk about references, think
of this function as reference-neutral; you own the object after the call
if and only if you owned it before the call.)
of this function as reference-neutral. You must own the object you pass in;
after the call you no longer own the passed-in reference, but you newly own
the result.)
This function never raises an exception.
On error, it leaves its argument unchanged without interning it.
Instances of subclasses of :py:class:`str` may not be interned, that is,
:c:expr:`PyUnicode_CheckExact(*p_unicode)` must be true. If it is not,
then -- as with any other error -- the argument is left unchanged.
Note that interned strings are not “immortal”.
You must keep a reference to the result to benefit from interning.
.. c:function:: PyObject* PyUnicode_InternFromString(const char *str)
A combination of :c:func:`PyUnicode_FromString` and
:c:func:`PyUnicode_InternInPlace`, returning either a new Unicode string
object that has been interned, or a new ("owned") reference to an earlier
interned string object with the same value.
:c:func:`PyUnicode_InternInPlace`, meant for statically allocated strings.
Return a new ("owned") reference to either a new Unicode string object
that has been interned, or an earlier interned string object with the
same value.
Python may keep a reference to the result, or
prevent it from being garbage-collected promptly.
For interning an unbounded number of different strings, such as ones coming
from user input, prefer calling :c:func:`PyUnicode_FromString` and
:c:func:`PyUnicode_InternInPlace` directly.

File diff suppressed because it is too large.

@ -550,21 +550,16 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(anon_setcomp));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(anon_string));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(anon_unknown));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(close_br));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(dbl_close_br));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(dbl_open_br));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(dbl_percent));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(defaults));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(dot));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(dot_locals));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(empty));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(generic_base));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(json_decoder));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(kwdefaults));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(list_err));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(newline));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(open_br));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(percent));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(shim_name));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(type_params));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_STR(utf_8));
@ -577,7 +572,6 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(TextIOWrapper));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(True));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(WarningMessage));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_WindowsConsoleIO));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(__IOBase_closed));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(__abc_tpflags__));
@ -766,6 +760,7 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_lock_unlock_module));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_loop));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_needs_com_addref_));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_only_immortal));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_pack_));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_restype_));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_showwarnmsg));
@ -777,7 +772,6 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_uninitialized_submodules));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_warn_unawaited_coroutine));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(_xoptions));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(a));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(abs_tol));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(access));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(add));
@ -797,7 +791,6 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(attribute));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(authorizer_callback));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(autocommit));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(b));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(backtick));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(base));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(before));
@ -815,7 +808,6 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(byteorder));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(bytes));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(bytes_per_sep));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(c));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(c_call));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(c_exception));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(c_return));
@ -868,7 +860,6 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(count));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(covariant));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(cwd));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(d));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(data));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(database));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(decode));
@ -896,7 +887,6 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(dst));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(dst_dir_fd));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(duration));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(e));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(eager_start));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(effective_ids));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(element_factory));
@ -1057,7 +1047,6 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(mro));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(msg));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(mycmp));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(n));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(n_arg));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(n_fields));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(n_sequence_fields));
@ -1102,7 +1091,6 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(outgoing));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(overlapped));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(owner));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(p));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(pages));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(parent));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(password));
@ -1130,7 +1118,6 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(ps2));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(query));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(quotetabs));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(r));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(raw));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(read));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(read1));
@ -1154,7 +1141,6 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(return));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(reverse));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(reversed));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(s));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(salt));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(sched_priority));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(scheduler));
@ -1257,7 +1243,6 @@ _PyStaticObjects_CheckRefcnt(PyInterpreterState *interp) {
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(writable));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(write));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(write_through));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(x));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(year));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_ID(zdict));
_PyStaticObject_CheckRefcnt((PyObject *)&_Py_SINGLETON(strings).ascii[0]);

@ -36,21 +36,16 @@ struct _Py_global_strings {
STRUCT_FOR_STR(anon_setcomp, "<setcomp>")
STRUCT_FOR_STR(anon_string, "<string>")
STRUCT_FOR_STR(anon_unknown, "<unknown>")
STRUCT_FOR_STR(close_br, "}")
STRUCT_FOR_STR(dbl_close_br, "}}")
STRUCT_FOR_STR(dbl_open_br, "{{")
STRUCT_FOR_STR(dbl_percent, "%%")
STRUCT_FOR_STR(defaults, ".defaults")
STRUCT_FOR_STR(dot, ".")
STRUCT_FOR_STR(dot_locals, ".<locals>")
STRUCT_FOR_STR(empty, "")
STRUCT_FOR_STR(generic_base, ".generic_base")
STRUCT_FOR_STR(json_decoder, "json.decoder")
STRUCT_FOR_STR(kwdefaults, ".kwdefaults")
STRUCT_FOR_STR(list_err, "list index out of range")
STRUCT_FOR_STR(newline, "\n")
STRUCT_FOR_STR(open_br, "{")
STRUCT_FOR_STR(percent, "%")
STRUCT_FOR_STR(shim_name, "<shim>")
STRUCT_FOR_STR(type_params, ".type_params")
STRUCT_FOR_STR(utf_8, "utf-8")
@ -66,7 +61,6 @@ struct _Py_global_strings {
STRUCT_FOR_ID(TextIOWrapper)
STRUCT_FOR_ID(True)
STRUCT_FOR_ID(WarningMessage)
STRUCT_FOR_ID(_)
STRUCT_FOR_ID(_WindowsConsoleIO)
STRUCT_FOR_ID(__IOBase_closed)
STRUCT_FOR_ID(__abc_tpflags__)
@ -255,6 +249,7 @@ struct _Py_global_strings {
STRUCT_FOR_ID(_lock_unlock_module)
STRUCT_FOR_ID(_loop)
STRUCT_FOR_ID(_needs_com_addref_)
STRUCT_FOR_ID(_only_immortal)
STRUCT_FOR_ID(_pack_)
STRUCT_FOR_ID(_restype_)
STRUCT_FOR_ID(_showwarnmsg)
@ -266,7 +261,6 @@ struct _Py_global_strings {
STRUCT_FOR_ID(_uninitialized_submodules)
STRUCT_FOR_ID(_warn_unawaited_coroutine)
STRUCT_FOR_ID(_xoptions)
STRUCT_FOR_ID(a)
STRUCT_FOR_ID(abs_tol)
STRUCT_FOR_ID(access)
STRUCT_FOR_ID(add)
@ -286,7 +280,6 @@ struct _Py_global_strings {
STRUCT_FOR_ID(attribute)
STRUCT_FOR_ID(authorizer_callback)
STRUCT_FOR_ID(autocommit)
STRUCT_FOR_ID(b)
STRUCT_FOR_ID(backtick)
STRUCT_FOR_ID(base)
STRUCT_FOR_ID(before)
@ -304,7 +297,6 @@ struct _Py_global_strings {
STRUCT_FOR_ID(byteorder)
STRUCT_FOR_ID(bytes)
STRUCT_FOR_ID(bytes_per_sep)
STRUCT_FOR_ID(c)
STRUCT_FOR_ID(c_call)
STRUCT_FOR_ID(c_exception)
STRUCT_FOR_ID(c_return)
@ -357,7 +349,6 @@ struct _Py_global_strings {
STRUCT_FOR_ID(count)
STRUCT_FOR_ID(covariant)
STRUCT_FOR_ID(cwd)
STRUCT_FOR_ID(d)
STRUCT_FOR_ID(data)
STRUCT_FOR_ID(database)
STRUCT_FOR_ID(decode)
@ -385,7 +376,6 @@ struct _Py_global_strings {
STRUCT_FOR_ID(dst)
STRUCT_FOR_ID(dst_dir_fd)
STRUCT_FOR_ID(duration)
STRUCT_FOR_ID(e)
STRUCT_FOR_ID(eager_start)
STRUCT_FOR_ID(effective_ids)
STRUCT_FOR_ID(element_factory)
@ -546,7 +536,6 @@ struct _Py_global_strings {
STRUCT_FOR_ID(mro)
STRUCT_FOR_ID(msg)
STRUCT_FOR_ID(mycmp)
STRUCT_FOR_ID(n)
STRUCT_FOR_ID(n_arg)
STRUCT_FOR_ID(n_fields)
STRUCT_FOR_ID(n_sequence_fields)
@ -591,7 +580,6 @@ struct _Py_global_strings {
STRUCT_FOR_ID(outgoing)
STRUCT_FOR_ID(overlapped)
STRUCT_FOR_ID(owner)
STRUCT_FOR_ID(p)
STRUCT_FOR_ID(pages)
STRUCT_FOR_ID(parent)
STRUCT_FOR_ID(password)
@ -619,7 +607,6 @@ struct _Py_global_strings {
STRUCT_FOR_ID(ps2)
STRUCT_FOR_ID(query)
STRUCT_FOR_ID(quotetabs)
STRUCT_FOR_ID(r)
STRUCT_FOR_ID(raw)
STRUCT_FOR_ID(read)
STRUCT_FOR_ID(read1)
@ -643,7 +630,6 @@ struct _Py_global_strings {
STRUCT_FOR_ID(return)
STRUCT_FOR_ID(reverse)
STRUCT_FOR_ID(reversed)
STRUCT_FOR_ID(s)
STRUCT_FOR_ID(salt)
STRUCT_FOR_ID(sched_priority)
STRUCT_FOR_ID(scheduler)
@ -746,7 +732,6 @@ struct _Py_global_strings {
STRUCT_FOR_ID(writable)
STRUCT_FOR_ID(write)
STRUCT_FOR_ID(write_through)
STRUCT_FOR_ID(x)
STRUCT_FOR_ID(year)
STRUCT_FOR_ID(zdict)
} identifiers;
@ -769,6 +754,10 @@ struct _Py_global_strings {
(_Py_SINGLETON(strings.identifiers._py_ ## NAME._ascii.ob_base))
#define _Py_STR(NAME) \
(_Py_SINGLETON(strings.literals._py_ ## NAME._ascii.ob_base))
#define _Py_LATIN1_CHR(CH) \
((CH) < 128 \
? (PyObject*)&_Py_SINGLETON(strings).ascii[(CH)] \
: (PyObject*)&_Py_SINGLETON(strings).latin1[(CH) - 128])
/* _Py_DECLARE_STR() should precede all uses of _Py_STR() in a function.
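
The index selection in `_Py_LATIN1_CHR` can be tried out in isolation. The following is a minimal, self-contained sketch and not CPython code: `demo_str`, `demo_strings`, `DEMO_LATIN1_CHR`, `demo_init` and `demo_latin1_chr` are hypothetical stand-ins for the statically allocated singleton arrays, and each entry holds just one byte instead of a full string object.

```c
#include <assert.h>

/* Hypothetical stand-in for a one-character string singleton. */
typedef struct {
    unsigned char ch;
} demo_str;

/* Two arrays of 128 entries, mirroring strings.ascii and strings.latin1. */
static struct {
    demo_str ascii[128];   /* code points 0..127 */
    demo_str latin1[128];  /* code points 128..255 */
} demo_strings;

/* Same selection logic as _Py_LATIN1_CHR in the hunk above. */
#define DEMO_LATIN1_CHR(CH) \
    ((CH) < 128 \
     ? &demo_strings.ascii[(CH)] \
     : &demo_strings.latin1[(CH) - 128])

/* Fill both arrays so that every entry stores its own code point. */
static void
demo_init(void)
{
    for (int i = 0; i < 128; i++) {
        demo_strings.ascii[i].ch = (unsigned char)i;
        demo_strings.latin1[i].ch = (unsigned char)(i + 128);
    }
}

/* Checked wrapper: the macro is only meaningful for 0..255. */
static demo_str *
demo_latin1_chr(int ch)
{
    assert(ch >= 0 && ch < 256);
    return DEMO_LATIN1_CHR(ch);
}
```

After `demo_init()`, `demo_latin1_chr('a')` points into the `ascii` array and `demo_latin1_chr(0xE9)` points at `latin1[0xE9 - 128]`, so every latin-1 code point maps to exactly one slot.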

@ -70,6 +70,13 @@ static inline void _Py_RefcntAdd(PyObject* op, Py_ssize_t n)
static inline void _Py_SetImmortal(PyObject *op)
{
#ifdef Py_DEBUG
// For strings, use _PyUnicode_InternImmortal instead.
if (PyUnicode_CheckExact(op)) {
assert(PyUnicode_CHECK_INTERNED(op) == SSTATE_INTERNED_IMMORTAL
|| PyUnicode_CHECK_INTERNED(op) == SSTATE_INTERNED_IMMORTAL_STATIC);
}
#endif
if (op) {
op->ob_refcnt = _Py_IMMORTAL_REFCNT;
}

@ -542,21 +542,16 @@ extern "C" {
INIT_STR(anon_setcomp, "<setcomp>"), \
INIT_STR(anon_string, "<string>"), \
INIT_STR(anon_unknown, "<unknown>"), \
INIT_STR(close_br, "}"), \
INIT_STR(dbl_close_br, "}}"), \
INIT_STR(dbl_open_br, "{{"), \
INIT_STR(dbl_percent, "%%"), \
INIT_STR(defaults, ".defaults"), \
INIT_STR(dot, "."), \
INIT_STR(dot_locals, ".<locals>"), \
INIT_STR(empty, ""), \
INIT_STR(generic_base, ".generic_base"), \
INIT_STR(json_decoder, "json.decoder"), \
INIT_STR(kwdefaults, ".kwdefaults"), \
INIT_STR(list_err, "list index out of range"), \
INIT_STR(newline, "\n"), \
INIT_STR(open_br, "{"), \
INIT_STR(percent, "%"), \
INIT_STR(shim_name, "<shim>"), \
INIT_STR(type_params, ".type_params"), \
INIT_STR(utf_8, "utf-8"), \
@ -572,7 +567,6 @@ extern "C" {
INIT_ID(TextIOWrapper), \
INIT_ID(True), \
INIT_ID(WarningMessage), \
INIT_ID(_), \
INIT_ID(_WindowsConsoleIO), \
INIT_ID(__IOBase_closed), \
INIT_ID(__abc_tpflags__), \
@ -761,6 +755,7 @@ extern "C" {
INIT_ID(_lock_unlock_module), \
INIT_ID(_loop), \
INIT_ID(_needs_com_addref_), \
INIT_ID(_only_immortal), \
INIT_ID(_pack_), \
INIT_ID(_restype_), \
INIT_ID(_showwarnmsg), \
@ -772,7 +767,6 @@ extern "C" {
INIT_ID(_uninitialized_submodules), \
INIT_ID(_warn_unawaited_coroutine), \
INIT_ID(_xoptions), \
INIT_ID(a), \
INIT_ID(abs_tol), \
INIT_ID(access), \
INIT_ID(add), \
@ -792,7 +786,6 @@ extern "C" {
INIT_ID(attribute), \
INIT_ID(authorizer_callback), \
INIT_ID(autocommit), \
INIT_ID(b), \
INIT_ID(backtick), \
INIT_ID(base), \
INIT_ID(before), \
@ -810,7 +803,6 @@ extern "C" {
INIT_ID(byteorder), \
INIT_ID(bytes), \
INIT_ID(bytes_per_sep), \
INIT_ID(c), \
INIT_ID(c_call), \
INIT_ID(c_exception), \
INIT_ID(c_return), \
@ -863,7 +855,6 @@ extern "C" {
INIT_ID(count), \
INIT_ID(covariant), \
INIT_ID(cwd), \
INIT_ID(d), \
INIT_ID(data), \
INIT_ID(database), \
INIT_ID(decode), \
@ -891,7 +882,6 @@ extern "C" {
INIT_ID(dst), \
INIT_ID(dst_dir_fd), \
INIT_ID(duration), \
INIT_ID(e), \
INIT_ID(eager_start), \
INIT_ID(effective_ids), \
INIT_ID(element_factory), \
@ -1052,7 +1042,6 @@ extern "C" {
INIT_ID(mro), \
INIT_ID(msg), \
INIT_ID(mycmp), \
INIT_ID(n), \
INIT_ID(n_arg), \
INIT_ID(n_fields), \
INIT_ID(n_sequence_fields), \
@ -1097,7 +1086,6 @@ extern "C" {
INIT_ID(outgoing), \
INIT_ID(overlapped), \
INIT_ID(owner), \
INIT_ID(p), \
INIT_ID(pages), \
INIT_ID(parent), \
INIT_ID(password), \
@ -1125,7 +1113,6 @@ extern "C" {
INIT_ID(ps2), \
INIT_ID(query), \
INIT_ID(quotetabs), \
INIT_ID(r), \
INIT_ID(raw), \
INIT_ID(read), \
INIT_ID(read1), \
@ -1149,7 +1136,6 @@ extern "C" {
INIT_ID(return), \
INIT_ID(reverse), \
INIT_ID(reversed), \
INIT_ID(s), \
INIT_ID(salt), \
INIT_ID(sched_priority), \
INIT_ID(scheduler), \
@ -1252,7 +1238,6 @@ extern "C" {
INIT_ID(writable), \
INIT_ID(write), \
INIT_ID(write_through), \
INIT_ID(x), \
INIT_ID(year), \
INIT_ID(zdict), \
}

@ -13,17 +13,31 @@ extern "C" {
void _PyUnicode_ExactDealloc(PyObject *op);
Py_ssize_t _PyUnicode_InternedSize(void);
Py_ssize_t _PyUnicode_InternedSize_Immortal(void);
/* runtime lifecycle */
extern void _PyUnicode_InitState(PyInterpreterState *);
extern PyStatus _PyUnicode_InitGlobalObjects(PyInterpreterState *);
extern PyStatus _PyUnicode_InitInternDict(PyInterpreterState *);
extern PyStatus _PyUnicode_InitTypes(PyInterpreterState *);
extern void _PyUnicode_Fini(PyInterpreterState *);
extern void _PyUnicode_FiniTypes(PyInterpreterState *);
extern PyTypeObject _PyUnicodeASCIIIter_Type;
/* Interning */
// All these are "ref-neutral", like the public PyUnicode_InternInPlace.
// Explicit interning routines:
PyAPI_FUNC(void) _PyUnicode_InternMortal(PyInterpreterState *interp, PyObject **);
PyAPI_FUNC(void) _PyUnicode_InternImmortal(PyInterpreterState *interp, PyObject **);
// Left here to help backporting:
PyAPI_FUNC(void) _PyUnicode_InternInPlace(PyInterpreterState *interp, PyObject **p);
// Only for statically allocated strings:
extern void _PyUnicode_InternStatic(PyInterpreterState *interp, PyObject **);
/* other API */
struct _Py_unicode_runtime_ids {
@ -60,7 +74,6 @@ struct _Py_unicode_state {
struct _Py_unicode_ids ids;
};
extern void _PyUnicode_InternInPlace(PyInterpreterState *interp, PyObject **p);
extern void _PyUnicode_ClearInterned(PyInterpreterState *interp);

File diff suppressed because it is too large.

@ -0,0 +1,122 @@
# String interning
*Interned* strings are conceptually part of an interpreter-global
*set* of interned strings, meaning that:
- no two interned strings have the same content (across an interpreter);
- two interned strings can be safely compared using pointer equality
(Python `is`).
This is used to optimize dict and attribute lookups, among other things.
Python uses two different mechanisms to intern strings: singletons and
dynamic interning.
## Singletons
The 256 possible one-character latin-1 strings, which can be retrieved with
`_Py_LATIN1_CHR(c)`, are stored in statically allocated arrays,
`_PyRuntime.static_objects.strings.ascii` and
`_PyRuntime.static_objects.strings.latin1`.
Longer singleton strings are marked in C source with `_Py_ID` (if the string
is a valid C identifier fragment) or `_Py_STR` (if it needs a separate
C-compatible name).
These are also stored in statically allocated arrays.
They are collected from CPython sources using `make regen-global-objects`
(`Tools/build/generate_global_objects.py`), which generates code
for declaration, initialization and finalization.
The empty string is one of the singletons: `_Py_STR(empty)`.
Deep-frozen modules (see `Tools/build/deepfreeze.py`) use either singletons,
or statically allocated strings. These are added to `INTERNED_STRINGS`
at runtime initialization, when deepfreeze modules are loaded.
These sets of singletons (`_Py_LATIN1_CHR`, `_Py_ID`, `_Py_STR`, deepfreeze)
are disjoint.
If you have such a singleton, it (and no other copy) will be interned.
These singletons are interned in a runtime-global lookup table,
`_PyRuntime.cached_objects.interned_strings` (`INTERNED_STRINGS`),
at runtime initialization. The table is immutable until it's torn down
at runtime finalization.
It is shared across threads and interpreters without any synchronization.
## Dynamically allocated strings
All other strings are allocated dynamically, and have their
`_PyUnicode_STATE(s).statically_allocated` flag set to zero.
When interned, such strings are added to an interpreter-wide dict,
`PyInterpreterState.cached_objects.interned_strings`.
The key and value of each entry in this dict reference the same object.
## Immortality and reference counting
Invariant: Every immortal string is interned.
In practice, this means that you must not use `_Py_SetImmortal` on
a string. (If you know it's already immortal, don't immortalize it;
if you know it's not interned you might be immortalizing a redundant copy;
if it's interned and mortal it needs extra processing in
`_PyUnicode_InternImmortal`.)
The converse is not true: interned strings can be mortal.
For mortal interned strings:
- the 2 references from the interned dict (key & value) are excluded from
their refcount
- the deallocator (`unicode_dealloc`) removes the string from the interned dict
- at shutdown, when the interned dict is cleared, the references are added back
As with any type, you should only immortalize strings that will live until
interpreter shutdown.
We currently also immortalize strings contained in code objects and similar,
specifically in the compiler and in `marshal`.
These are “close enough” to immortal: even in use cases like hot reloading
or `eval`-ing user input, the number of distinct identifiers and string
constants is expected to stay low.
## Internal API
We have the following *internal* API for interning:
- `_PyUnicode_InternMortal`: just intern the string
- `_PyUnicode_InternImmortal`: intern, and immortalize the result
- `_PyUnicode_InternStatic`: intern a static singleton (`_Py_STR`, `_Py_ID`
or one-byte). Not for general use.
All take an interpreter state, and a pointer to a `PyObject*` which they
modify in place.
The functions take ownership of (“steal”) the reference to their argument,
and update the argument with a *new* reference.
This means:
- They're “reference neutral”.
- They must not be called with a borrowed reference.
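
The "reference neutral" contract and the uncounted table references for mortal strings can be illustrated with a toy model. This is a sketch under stated assumptions, not the CPython implementation: `toy_str`, `toy_table`, `toy_new`, `toy_decref` and `toy_intern_mortal` are hypothetical names, and the table is a small linear array instead of a dict.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy string object; not CPython's PyUnicodeObject. */
typedef struct {
    int refcnt;
    int interned;   /* 0 = not interned, 1 = interned (mortal) */
    char *data;
} toy_str;

#define TOY_TABLE_SIZE 16
/* References held by this table are NOT included in refcnt. */
static toy_str *toy_table[TOY_TABLE_SIZE];
static int toy_table_len = 0;

/* Create a new string with one owned reference; NULL on allocation failure. */
static toy_str *
toy_new(const char *s)
{
    toy_str *op = malloc(sizeof(*op));
    if (op == NULL) {
        return NULL;
    }
    op->data = malloc(strlen(s) + 1);
    if (op->data == NULL) {
        free(op);
        return NULL;
    }
    strcpy(op->data, s);
    op->refcnt = 1;
    op->interned = 0;
    return op;
}

/* Drop one reference; the deallocator removes interned strings from the table. */
static void
toy_decref(toy_str *op)
{
    if (--op->refcnt > 0) {
        return;
    }
    if (op->interned) {
        for (int i = 0; i < toy_table_len; i++) {
            if (toy_table[i] == op) {
                toy_table[i] = toy_table[--toy_table_len];
                break;
            }
        }
    }
    free(op->data);
    free(op);
}

/* Steals the caller's reference to *p and stores a new owned reference in *p. */
static void
toy_intern_mortal(toy_str **p)
{
    toy_str *op = *p;
    assert(op != NULL && op->refcnt > 0);
    if (op->interned) {
        return;
    }
    for (int i = 0; i < toy_table_len; i++) {
        if (strcmp(toy_table[i]->data, op->data) == 0) {
            toy_table[i]->refcnt++;  /* new reference handed to the caller */
            *p = toy_table[i];
            toy_decref(op);          /* release the stolen reference */
            return;
        }
    }
    if (toy_table_len == TOY_TABLE_SIZE) {
        return;  /* on error, leave the argument unchanged and not interned */
    }
    toy_table[toy_table_len++] = op;  /* the table entry adds no refcount */
    op->interned = 1;
}
```

Interning two separately created copies of the same text leaves both variables pointing at one object with a reference count of 2. Once both owners release it, the entry disappears from the table, which is the sense in which the string is mortal.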
## State
The intern state (retrieved by `PyUnicode_CHECK_INTERNED(s)`;
stored in `_PyUnicode_STATE(s).interned`) can be:
- `SSTATE_NOT_INTERNED` (defined as 0, which is useful in a boolean context)
- `SSTATE_INTERNED_MORTAL` (1)
- `SSTATE_INTERNED_IMMORTAL` (2)
- `SSTATE_INTERNED_IMMORTAL_STATIC` (3)
The valid transitions between these states are:
- For dynamically allocated strings:
- 0 -> 1 (`_PyUnicode_InternMortal`)
- 1 -> 2 or 0 -> 2 (`_PyUnicode_InternImmortal`)
Using `_PyUnicode_InternStatic` on these is an error; the other cases
don't change the state.
- Singletons are interned (0 -> 3) at runtime init;
after that all interning functions don't change the state.
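
The transitions listed above can be written down as a small table-like function. This is an illustrative sketch only: the `demo_*` names are hypothetical, and returning `DEMO_INVALID` for a non-static call on a singleton that has not been interned yet is an assumption, since the text only describes the 0 -> 3 step at runtime init.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror the SSTATE_* constants listed above. */
enum demo_state {
    DEMO_NOT_INTERNED = 0,
    DEMO_INTERNED_MORTAL = 1,
    DEMO_INTERNED_IMMORTAL = 2,
    DEMO_INTERNED_IMMORTAL_STATIC = 3
};

/* Hypothetical tags for the three internal interning functions. */
enum demo_op {
    DEMO_OP_MORTAL,    /* _PyUnicode_InternMortal */
    DEMO_OP_IMMORTAL,  /* _PyUnicode_InternImmortal */
    DEMO_OP_STATIC     /* _PyUnicode_InternStatic */
};

#define DEMO_INVALID (-1)

/* Return the state after applying op, or DEMO_INVALID for an error. */
static int
demo_next_state(int state, enum demo_op op, bool statically_allocated)
{
    assert(state >= DEMO_NOT_INTERNED
           && state <= DEMO_INTERNED_IMMORTAL_STATIC);
    if (statically_allocated) {
        if (state == DEMO_INTERNED_IMMORTAL_STATIC) {
            return state;  /* after init, no function changes the state */
        }
        /* Assumption: only the static routine performs 0 -> 3. */
        if (state == DEMO_NOT_INTERNED && op == DEMO_OP_STATIC) {
            return DEMO_INTERNED_IMMORTAL_STATIC;
        }
        return DEMO_INVALID;
    }
    switch (op) {
    case DEMO_OP_STATIC:
        return DEMO_INVALID;  /* error for dynamically allocated strings */
    case DEMO_OP_MORTAL:
        return state == DEMO_NOT_INTERNED ? DEMO_INTERNED_MORTAL : state;
    case DEMO_OP_IMMORTAL:
        return (state == DEMO_NOT_INTERNED || state == DEMO_INTERNED_MORTAL)
            ? DEMO_INTERNED_IMMORTAL : state;
    }
    return DEMO_INVALID;
}
```

For example, `demo_next_state(DEMO_INTERNED_MORTAL, DEMO_OP_IMMORTAL, false)` yields `DEMO_INTERNED_IMMORTAL`, while `DEMO_OP_MORTAL` applied to an already immortal string leaves the state unchanged.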

@ -1984,7 +1984,7 @@ test_keywords(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObjec
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2018,7 +2018,7 @@ exit:
static PyObject *
test_keywords_impl(PyObject *module, PyObject *a, PyObject *b)
/*[clinic end generated code: output=73d46a9ae3320f96 input=0d3484844749c05b]*/
/*[clinic end generated code: output=13ba007e1c842a37 input=0d3484844749c05b]*/
/*[clinic input]
@ -2054,7 +2054,7 @@ test_keywords_kwonly(PyObject *module, PyObject *const *args, Py_ssize_t nargs,
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2088,7 +2088,7 @@ exit:
static PyObject *
test_keywords_kwonly_impl(PyObject *module, PyObject *a, PyObject *b)
/*[clinic end generated code: output=c9f02a41f425897d input=384adc78bfa0bff7]*/
/*[clinic end generated code: output=789799a6d2d6eb4d input=384adc78bfa0bff7]*/
/*[clinic input]
@ -2125,7 +2125,7 @@ test_keywords_opt(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyO
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), &_Py_ID(c), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2172,7 +2172,7 @@ exit:
static PyObject *
test_keywords_opt_impl(PyObject *module, PyObject *a, PyObject *b,
PyObject *c)
/*[clinic end generated code: output=b35d4e66f7283e46 input=eda7964f784f4607]*/
/*[clinic end generated code: output=42430dd8ea5afde6 input=eda7964f784f4607]*/
/*[clinic input]
@ -2211,7 +2211,7 @@ test_keywords_opt_kwonly(PyObject *module, PyObject *const *args, Py_ssize_t nar
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2269,7 +2269,7 @@ exit:
static PyObject *
test_keywords_opt_kwonly_impl(PyObject *module, PyObject *a, PyObject *b,
PyObject *c, PyObject *d)
/*[clinic end generated code: output=ede7e6e65106bf2b input=209387a4815e5082]*/
/*[clinic end generated code: output=f312c35c380d2bf9 input=209387a4815e5082]*/
/*[clinic input]
@ -2307,7 +2307,7 @@ test_keywords_kwonly_opt(PyObject *module, PyObject *const *args, Py_ssize_t nar
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), &_Py_ID(c), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2354,7 +2354,7 @@ exit:
static PyObject *
test_keywords_kwonly_opt_impl(PyObject *module, PyObject *a, PyObject *b,
PyObject *c)
/*[clinic end generated code: output=36d4df939a4c3eef input=18393cc64fa000f4]*/
/*[clinic end generated code: output=3937da2a8233ebe0 input=18393cc64fa000f4]*/
/*[clinic input]
@ -2390,7 +2390,7 @@ test_posonly_keywords(PyObject *module, PyObject *const *args, Py_ssize_t nargs,
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), },
.ob_item = { _Py_LATIN1_CHR('b'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2424,7 +2424,7 @@ exit:
static PyObject *
test_posonly_keywords_impl(PyObject *module, PyObject *a, PyObject *b)
/*[clinic end generated code: output=4835f4b6cf386c28 input=1767b0ebdf06060e]*/
/*[clinic end generated code: output=6b4f6dd5f4db3877 input=1767b0ebdf06060e]*/
/*[clinic input]
@ -2461,7 +2461,7 @@ test_posonly_kwonly(PyObject *module, PyObject *const *args, Py_ssize_t nargs, P
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(c), },
.ob_item = { _Py_LATIN1_CHR('c'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2495,7 +2495,7 @@ exit:
static PyObject *
test_posonly_kwonly_impl(PyObject *module, PyObject *a, PyObject *c)
/*[clinic end generated code: output=2570ea156a8d3cb5 input=9042f2818f664839]*/
/*[clinic end generated code: output=8bef2a8198e70b26 input=9042f2818f664839]*/
/*[clinic input]
@ -2534,7 +2534,7 @@ test_posonly_keywords_kwonly(PyObject *module, PyObject *const *args, Py_ssize_t
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2571,7 +2571,7 @@ exit:
static PyObject *
test_posonly_keywords_kwonly_impl(PyObject *module, PyObject *a, PyObject *b,
PyObject *c)
/*[clinic end generated code: output=aaa0e6b5ce02900d input=29546ebdca492fea]*/
/*[clinic end generated code: output=a44b8ae8300955e1 input=29546ebdca492fea]*/
/*[clinic input]
@ -2610,7 +2610,7 @@ test_posonly_keywords_opt(PyObject *module, PyObject *const *args, Py_ssize_t na
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2659,7 +2659,7 @@ exit:
static PyObject *
test_posonly_keywords_opt_impl(PyObject *module, PyObject *a, PyObject *b,
PyObject *c, PyObject *d)
/*[clinic end generated code: output=1d9f2d8420d0a85f input=cdf5a9625e554e9b]*/
/*[clinic end generated code: output=cae6647c9e8e0238 input=cdf5a9625e554e9b]*/
/*[clinic input]
@ -2697,7 +2697,7 @@ test_posonly_keywords_opt2(PyObject *module, PyObject *const *args, Py_ssize_t n
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2744,7 +2744,7 @@ exit:
static PyObject *
test_posonly_keywords_opt2_impl(PyObject *module, PyObject *a, PyObject *b,
PyObject *c)
/*[clinic end generated code: output=a83caa0505b296cf input=1581299d21d16f14]*/
/*[clinic end generated code: output=6526fd08aafa2149 input=1581299d21d16f14]*/
/*[clinic input]
@ -2783,7 +2783,7 @@ test_posonly_opt_keywords_opt(PyObject *module, PyObject *const *args, Py_ssize_
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2837,7 +2837,7 @@ exit:
static PyObject *
test_posonly_opt_keywords_opt_impl(PyObject *module, PyObject *a,
PyObject *b, PyObject *c, PyObject *d)
/*[clinic end generated code: output=0b24fba3dc04d26b input=408798ec3d42949f]*/
/*[clinic end generated code: output=b8d01e98443738c2 input=408798ec3d42949f]*/
/*[clinic input]
@ -2877,7 +2877,7 @@ test_posonly_kwonly_opt(PyObject *module, PyObject *const *args, Py_ssize_t narg
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2926,7 +2926,7 @@ exit:
static PyObject *
test_posonly_kwonly_opt_impl(PyObject *module, PyObject *a, PyObject *b,
PyObject *c, PyObject *d)
/*[clinic end generated code: output=592b217bca2f7bcc input=8d8e5643bbbc2309]*/
/*[clinic end generated code: output=81d71c288f13d4dc input=8d8e5643bbbc2309]*/
/*[clinic input]
@ -2965,7 +2965,7 @@ test_posonly_kwonly_opt2(PyObject *module, PyObject *const *args, Py_ssize_t nar
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -3012,7 +3012,7 @@ exit:
static PyObject *
test_posonly_kwonly_opt2_impl(PyObject *module, PyObject *a, PyObject *b,
PyObject *c)
/*[clinic end generated code: output=b8b00420826bc11f input=f7e5eed94f75fff0]*/
/*[clinic end generated code: output=a717d2a1a3310289 input=f7e5eed94f75fff0]*/
/*[clinic input]
@ -3052,7 +3052,7 @@ test_posonly_opt_kwonly_opt(PyObject *module, PyObject *const *args, Py_ssize_t
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -3106,7 +3106,7 @@ exit:
static PyObject *
test_posonly_opt_kwonly_opt_impl(PyObject *module, PyObject *a, PyObject *b,
PyObject *c, PyObject *d)
/*[clinic end generated code: output=3b9ee879ebee285a input=1e557dc979d120fd]*/
/*[clinic end generated code: output=0f50b4b8d45cf2de input=1e557dc979d120fd]*/
/*[clinic input]
@ -3148,7 +3148,7 @@ test_posonly_keywords_kwonly_opt(PyObject *module, PyObject *const *args, Py_ssi
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), &_Py_ID(e), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), _Py_LATIN1_CHR('e'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -3200,7 +3200,7 @@ static PyObject *
test_posonly_keywords_kwonly_opt_impl(PyObject *module, PyObject *a,
PyObject *b, PyObject *c, PyObject *d,
PyObject *e)
/*[clinic end generated code: output=d380f84f81cc0e45 input=c3884a4f956fdc89]*/
/*[clinic end generated code: output=8dac8d2a4e6105fa input=c3884a4f956fdc89]*/
/*[clinic input]
@ -3240,7 +3240,7 @@ test_posonly_keywords_kwonly_opt2(PyObject *module, PyObject *const *args, Py_ss
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -3289,7 +3289,7 @@ exit:
static PyObject *
test_posonly_keywords_kwonly_opt2_impl(PyObject *module, PyObject *a,
PyObject *b, PyObject *c, PyObject *d)
/*[clinic end generated code: output=ee629e962cb06992 input=68d01d7c0f6dafb0]*/
/*[clinic end generated code: output=5a96d521e6414f5d input=68d01d7c0f6dafb0]*/
/*[clinic input]
@ -3332,7 +3332,7 @@ test_posonly_keywords_opt_kwonly_opt(PyObject *module, PyObject *const *args, Py
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), &_Py_ID(e), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), _Py_LATIN1_CHR('e'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -3393,7 +3393,7 @@ static PyObject *
test_posonly_keywords_opt_kwonly_opt_impl(PyObject *module, PyObject *a,
PyObject *b, PyObject *c,
PyObject *d, PyObject *e)
/*[clinic end generated code: output=a2721babb42ecfd1 input=d0883d45876f186c]*/
/*[clinic end generated code: output=d5a474dcd5dc3e9f input=d0883d45876f186c]*/
/*[clinic input]
@ -3436,7 +3436,7 @@ test_posonly_keywords_opt2_kwonly_opt(PyObject *module, PyObject *const *args, P
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), &_Py_ID(e), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), _Py_LATIN1_CHR('e'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -3502,7 +3502,7 @@ static PyObject *
test_posonly_keywords_opt2_kwonly_opt_impl(PyObject *module, PyObject *a,
PyObject *b, PyObject *c,
PyObject *d, PyObject *e)
/*[clinic end generated code: output=0626203eedb6e7e8 input=c95e2e1ec93035ad]*/
/*[clinic end generated code: output=ac239c5ee8a74408 input=c95e2e1ec93035ad]*/
/*[clinic input]
@ -3547,7 +3547,7 @@ test_posonly_opt_keywords_opt_kwonly_opt(PyObject *module, PyObject *const *args
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(c), &_Py_ID(d), &_Py_ID(e), &_Py_ID(f), },
.ob_item = { _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), _Py_LATIN1_CHR('e'), _Py_LATIN1_CHR('f'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -3621,7 +3621,7 @@ test_posonly_opt_keywords_opt_kwonly_opt_impl(PyObject *module, PyObject *a,
PyObject *b, PyObject *c,
PyObject *d, PyObject *e,
PyObject *f)
/*[clinic end generated code: output=07d8acc04558a5a0 input=9914857713c5bbf8]*/
/*[clinic end generated code: output=638bbd0005639342 input=9914857713c5bbf8]*/
/*[clinic input]
test_keyword_only_parameter
@ -4017,7 +4017,7 @@ test_vararg(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObject
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), },
.ob_item = { _Py_LATIN1_CHR('a'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -4052,7 +4052,7 @@ exit:
static PyObject *
test_vararg_impl(PyObject *module, PyObject *a, PyObject *args)
/*[clinic end generated code: output=880365c61ae205d7 input=81d33815ad1bae6e]*/
/*[clinic end generated code: output=1411e464f358a7ba input=81d33815ad1bae6e]*/
/*[clinic input]
test_vararg_with_default
@ -4089,7 +4089,7 @@ test_vararg_with_default(PyObject *module, PyObject *const *args, Py_ssize_t nar
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -4135,7 +4135,7 @@ exit:
static PyObject *
test_vararg_with_default_impl(PyObject *module, PyObject *a, PyObject *args,
int b)
/*[clinic end generated code: output=291e9a5a09831128 input=6e110b54acd9b22d]*/
/*[clinic end generated code: output=f09d4b917063ca41 input=6e110b54acd9b22d]*/
/*[clinic input]
test_vararg_with_only_defaults
@ -4172,7 +4172,7 @@ test_vararg_with_only_defaults(PyObject *module, PyObject *const *args, Py_ssize
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -4223,7 +4223,7 @@ exit:
static PyObject *
test_vararg_with_only_defaults_impl(PyObject *module, PyObject *args, int b,
PyObject *c)
/*[clinic end generated code: output=dd21b28f0db26a4b input=fa56a709a035666e]*/
/*[clinic end generated code: output=cc6590b8805d5433 input=fa56a709a035666e]*/
/*[clinic input]
test_paramname_module
@ -4490,7 +4490,7 @@ Test_cls_with_param(TestObj *self, PyTypeObject *cls, PyObject *const *args, Py_
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), },
.ob_item = { _Py_LATIN1_CHR('a'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -4525,7 +4525,7 @@ exit:
static PyObject *
Test_cls_with_param_impl(TestObj *self, PyTypeObject *cls, int a)
/*[clinic end generated code: output=00218e7f583e6c81 input=af158077bd237ef9]*/
/*[clinic end generated code: output=107b46f09870e1f8 input=af158077bd237ef9]*/
/*[clinic input]
@ -4838,7 +4838,7 @@ Test___init__(PyObject *self, PyObject *args, PyObject *kwargs)
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), },
.ob_item = { _Py_LATIN1_CHR('a'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -4872,7 +4872,7 @@ exit:
static int
Test___init___impl(TestObj *self, PyObject *a)
/*[clinic end generated code: output=0b9ca79638ab3ecb input=a8f9222a6ab35c59]*/
/*[clinic end generated code: output=0e1239b9bc247bc1 input=a8f9222a6ab35c59]*/
/*[clinic input]


@ -109,7 +109,7 @@ def runtest_refleak(test_name, test_func,
getunicodeinternedsize = sys.getunicodeinternedsize
fd_count = os_helper.fd_count
# initialize variables to make pyflakes quiet
rc_before = alloc_before = fd_before = interned_before = 0
rc_before = alloc_before = fd_before = interned_immortal_before = 0
if not quiet:
print("beginning", repcount, "repetitions. Showing number of leaks "
@ -135,9 +135,11 @@ def runtest_refleak(test_name, test_func,
# Also, readjust the reference counts and alloc blocks by ignoring
# any strings that might have been interned during test_func. These
# strings will be deallocated at runtime shutdown
interned_after = getunicodeinternedsize()
alloc_after = getallocatedblocks() - interned_after
rc_after = gettotalrefcount() - interned_after * 2
interned_immortal_after = getunicodeinternedsize(
# Use an internal-only keyword argument that mypy doesn't know yet
_only_immortal=True) # type: ignore[call-arg]
alloc_after = getallocatedblocks() - interned_immortal_after
rc_after = gettotalrefcount() - interned_immortal_after * 2
fd_after = fd_count()
rc_deltas[i] = get_pooled_int(rc_after - rc_before)
@ -164,7 +166,7 @@ def runtest_refleak(test_name, test_func,
alloc_before = alloc_after
rc_before = rc_after
fd_before = fd_after
interned_before = interned_after
interned_immortal_before = interned_immortal_after
restore_support_xml(xml_filename)
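
The adjustment made in the hunk above can be sketched in isolation. This is only an illustration of the arithmetic, not the regrtest code itself: the helper name `adjust_for_immortal_interned` and the sample numbers are hypothetical, and the real code obtains its inputs from `sys.gettotalrefcount()`, `sys.getallocatedblocks()` and `sys.getunicodeinternedsize(_only_immortal=True)`.

```python
def adjust_for_immortal_interned(total_refcount, allocated_blocks, interned_immortal):
    """Hypothetical helper mirroring the arithmetic in the refleak hunk.

    Immortal interned strings are only freed at runtime shutdown, so they
    are discounted before comparing counts between repetitions.
    """
    # One allocated block is discounted per immortal interned string.
    alloc = allocated_blocks - interned_immortal
    # Two references are discounted per immortal interned string,
    # matching the `* 2` factor used in the hunk.
    rc = total_refcount - interned_immortal * 2
    return rc, alloc


# Made-up sample values, purely for illustration.
rc_after, alloc_after = adjust_for_immortal_interned(1000, 400, 25)
print(rc_after, alloc_after)
```

Counting only immortal strings matters here because mortal interned strings can now be freed during the run, so subtracting them as well would skew the deltas.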


@ -1502,7 +1502,7 @@ class ClinicExternalTest(TestCase):
# Verify by checking the checksum.
checksum = (
"/*[clinic end generated code: "
"output=2124c291eb067d76 input=9543a8d2da235301]*/\n"
"output=99dd9b13ffdc660d input=9543a8d2da235301]*/\n"
)
with open(fn, encoding='utf-8') as f:
generated = f.read()


@ -810,6 +810,30 @@ class ScopeTests(unittest.TestCase):
gc_collect() # For PyPy or other GCs.
self.assertIsNone(ref())
def test_multiple_nesting(self):
# Regression test for https://github.com/python/cpython/issues/121863
class MultiplyNested:
def f1(self):
__arg = 1
class D:
def g(self, __arg):
return __arg
return D().g(_MultiplyNested__arg=2)
def f2(self):
__arg = 1
class D:
def g(self, __arg):
return __arg
return D().g
inst = MultiplyNested()
with self.assertRaises(TypeError):
inst.f1()
closure = inst.f2()
with self.assertRaises(TypeError):
closure(_MultiplyNested__arg=2)
if __name__ == '__main__':
unittest.main()


@ -710,8 +710,11 @@ class SysModuleTest(unittest.TestCase):
self.assertRaises(TypeError, sys.intern, S("abc"))
@support.cpython_only
@requires_subinterpreters
def test_subinterp_intern_dynamically_allocated(self):
# Implementation detail: Dynamically allocated strings
# are distinct between interpreters
s = "never interned before" + str(random.randrange(0, 10**9))
t = sys.intern(s)
self.assertIs(t, s)
@ -719,25 +722,58 @@ class SysModuleTest(unittest.TestCase):
interp = interpreters.create()
interp.run(textwrap.dedent(f'''
import sys
t = sys.intern({s!r})
# set `s`, avoid parser interning & constant folding
s = str({s.encode()!r}, 'utf-8')
t = sys.intern(s)
assert id(t) != {id(s)}, (id(t), {id(s)})
assert id(t) != {id(t)}, (id(t), {id(t)})
'''))
@support.cpython_only
@requires_subinterpreters
def test_subinterp_intern_statically_allocated(self):
# Implementation detail: Statically allocated strings are shared
# between interpreters.
# See Tools/build/generate_global_objects.py for the list
# of strings that are always statically allocated.
s = '__init__'
t = sys.intern(s)
for s in ('__init__', 'CANCELLED', '<module>', 'utf-8',
'{{', '', '\n', '_', 'x', '\0', '\N{CEDILLA}', '\xff',
):
with self.subTest(s=s):
t = sys.intern(s)
print('------------------------')
interp = interpreters.create()
interp.run(textwrap.dedent(f'''
import sys
t = sys.intern({s!r})
assert id(t) == {id(t)}, (id(t), {id(t)})
'''))
interp = interpreters.create()
interp.run(textwrap.dedent(f'''
import sys
# set `s`, avoid parser interning & constant folding
s = str({s.encode()!r}, 'utf-8')
t = sys.intern(s)
assert id(t) == {id(t)}, (id(t), {id(t)})
'''))
@support.cpython_only
@requires_subinterpreters
def test_subinterp_intern_singleton(self):
# Implementation detail: singletons are used for 0- and 1-character
# latin1 strings.
for s in '', '\n', '_', 'x', '\0', '\N{CEDILLA}', '\xff':
with self.subTest(s=s):
interp = interpreters.create()
interp.run(textwrap.dedent(f'''
import sys
# set `s`, avoid parser interning & constant folding
s = str({s.encode()!r}, 'utf-8')
assert id(s) == {id(s)}
t = sys.intern(s)
assert id(t) == id(s), (id(t), id(s))
'''))
def test_sys_flags(self):
self.assertTrue(sys.flags)


@ -0,0 +1,12 @@
:c:func:`PyUnicode_InternInPlace` no longer prevents its argument from being
garbage collected.
Several functions that take ``char *`` are now
documented as possibly preventing string objects from being garbage
collected; refer to their documentation for details:
:c:func:`PyUnicode_InternFromString`,
:c:func:`PyDict_SetItemString`,
:c:func:`PyObject_SetAttrString`,
:c:func:`PyObject_DelAttrString`,
and ``PyModule_Add*`` convenience functions.


@ -0,0 +1,5 @@
Strings interned with :func:`sys.intern` are again garbage-collected when no
longer used, as per the documentation. Strings interned with the C function
:c:func:`PyUnicode_InternInPlace` are still immortal. Internals of the
string interning mechanism have been changed. This may affect performance
and identities of :class:`str` objects.
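
As a rough illustration of the behaviour described in this entry, the sketch below shows the part of `sys.intern` that can be observed from Python: equal strings map to one canonical object while that object is alive. The helper `make` and the string contents are made up for the example, and the strings are assembled at run time so that the compiler does not fold or pre-intern them.

```python
import sys


def make(parts):
    # Build the string at run time so it is not a compile-time constant.
    return "".join(parts)


first = make(["gh113993_", "demo_", "key"])
second = make(["gh113993_", "demo_", "key"])

# Interning returns the canonical object for this value.
canonical = sys.intern(first)
# Interning an equal string returns that same canonical object.
same = sys.intern(second)

assert canonical == second
assert same is canonical
```

With this change, once every reference to `canonical` is dropped the interned entry can be released again. That release is not asserted here, since `str` objects do not support weak references and the example would otherwise depend on implementation details.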


@ -2164,9 +2164,15 @@ PyCSimpleType_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
Py_DECREF(result);
return NULL;
}
x = PyDict_SetItemString(result->tp_dict,
ml->ml_name,
meth);
PyObject *name = PyUnicode_FromString(ml->ml_name);
if (name == NULL) {
Py_DECREF(meth);
Py_DECREF(result);
return NULL;
}
PyUnicode_InternInPlace(&name);
x = PyDict_SetItem(result->tp_dict, name, meth);
Py_DECREF(name);
Py_DECREF(meth);
if (x == -1) {
Py_DECREF(result);


@ -193,7 +193,7 @@ write_str(stringio *self, PyObject *obj)
}
if (self->writenl) {
PyObject *translated = PyUnicode_Replace(
decoded, &_Py_STR(newline), self->writenl, -1);
decoded, _Py_LATIN1_CHR('\n'), self->writenl, -1);
Py_SETREF(decoded, translated);
}
if (decoded == NULL)


@ -2,6 +2,9 @@
#include "pycore_moduleobject.h" // _PyModule_GetState()
#include "structmember.h" // PyMemberDef
#include "pycore_runtime.h" // _Py_ID()
#include "pycore_pystate.h" // _PyInterpreterState_GET()
#include "clinic/_operator.c.h"
typedef struct {
@ -1224,6 +1227,7 @@ attrgetter_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
return NULL;
/* prepare attr while checking args */
PyInterpreterState *interp = _PyInterpreterState_GET();
for (idx = 0; idx < nattrs; ++idx) {
PyObject *item = PyTuple_GET_ITEM(args, idx);
int dot_count;
@ -1251,7 +1255,7 @@ attrgetter_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
if (dot_count == 0) {
Py_INCREF(item);
PyUnicode_InternInPlace(&item);
_PyUnicode_InternMortal(interp, &item);
PyTuple_SET_ITEM(attr, idx, item);
} else { /* make it a tuple of non-dotted attrnames */
PyObject *attr_chain = PyTuple_New(dot_count + 1);
@ -1277,7 +1281,7 @@ attrgetter_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
Py_DECREF(attr);
return NULL;
}
PyUnicode_InternInPlace(&attr_chain_item);
_PyUnicode_InternMortal(interp, &attr_chain_item);
PyTuple_SET_ITEM(attr_chain, attr_chain_idx, attr_chain_item);
++attr_chain_idx;
unibuff_till = unibuff_from = unibuff_till + 1;
@ -1291,7 +1295,7 @@ attrgetter_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
Py_DECREF(attr);
return NULL;
}
PyUnicode_InternInPlace(&attr_chain_item);
_PyUnicode_InternMortal(interp, &attr_chain_item);
PyTuple_SET_ITEM(attr_chain, attr_chain_idx, attr_chain_item);
PyTuple_SET_ITEM(attr, idx, attr_chain);
@ -1587,7 +1591,8 @@ methodcaller_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
name = PyTuple_GET_ITEM(args, 0);
Py_INCREF(name);
PyUnicode_InternInPlace(&name);
PyInterpreterState *interp = _PyInterpreterState_GET();
_PyUnicode_InternMortal(interp, &name);
mc->name = name;
mc->kwds = Py_XNewRef(kwds);


@ -1865,8 +1865,7 @@ get_dotted_path(PyObject *obj, PyObject *name)
{
PyObject *dotted_path;
Py_ssize_t i, n;
_Py_DECLARE_STR(dot, ".");
dotted_path = PyUnicode_Split(name, &_Py_STR(dot), -1);
dotted_path = PyUnicode_Split(name, _Py_LATIN1_CHR('.'), -1);
if (dotted_path == NULL)
return NULL;
n = PyList_GET_SIZE(dotted_path);
@ -6717,8 +6716,10 @@ load_build(PickleState *st, UnpicklerObject *self)
/* normally the keys for instance attributes are
interned. we should try to do that here. */
Py_INCREF(d_key);
if (PyUnicode_CheckExact(d_key))
PyUnicode_InternInPlace(&d_key);
if (PyUnicode_CheckExact(d_key)) {
PyInterpreterState *interp = _PyInterpreterState_GET();
_PyUnicode_InternMortal(interp, &d_key);
}
if (PyObject_SetItem(dict, d_key, d_value) < 0) {
Py_DECREF(d_key);
goto error;


@ -717,7 +717,7 @@ pysqlite_connection_set_progress_handler(pysqlite_Connection *self, PyTypeObject
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(progress_handler), &_Py_ID(n), },
.ob_item = { &_Py_ID(progress_handler), _Py_LATIN1_CHR('n'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1665,4 +1665,4 @@ exit:
#ifndef DESERIALIZE_METHODDEF
#define DESERIALIZE_METHODDEF
#endif /* !defined(DESERIALIZE_METHODDEF) */
/*[clinic end generated code: output=834a99827555bf1a input=a9049054013a1b77]*/
/*[clinic end generated code: output=305d580e3eaa622d input=a9049054013a1b77]*/


@ -43,7 +43,7 @@ _bisect_bisect_right(PyObject *module, PyObject *const *args, Py_ssize_t nargs,
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(x), &_Py_ID(lo), &_Py_ID(hi), &_Py_ID(key), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('x'), &_Py_ID(lo), &_Py_ID(hi), &_Py_ID(key), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -151,7 +151,7 @@ _bisect_insort_right(PyObject *module, PyObject *const *args, Py_ssize_t nargs,
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(x), &_Py_ID(lo), &_Py_ID(hi), &_Py_ID(key), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('x'), &_Py_ID(lo), &_Py_ID(hi), &_Py_ID(key), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -256,7 +256,7 @@ _bisect_bisect_left(PyObject *module, PyObject *const *args, Py_ssize_t nargs, P
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(x), &_Py_ID(lo), &_Py_ID(hi), &_Py_ID(key), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('x'), &_Py_ID(lo), &_Py_ID(hi), &_Py_ID(key), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -364,7 +364,7 @@ _bisect_insort_left(PyObject *module, PyObject *const *args, Py_ssize_t nargs, P
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(x), &_Py_ID(lo), &_Py_ID(hi), &_Py_ID(key), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('x'), &_Py_ID(lo), &_Py_ID(hi), &_Py_ID(key), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -433,4 +433,4 @@ skip_optional_kwonly:
exit:
return return_value;
}
/*[clinic end generated code: output=5a7fa64bf9b262f3 input=a9049054013a1b77]*/
/*[clinic end generated code: output=57335e39ce2bf80e input=a9049054013a1b77]*/


@ -1354,7 +1354,7 @@ _hashlib_scrypt(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObj
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(password), &_Py_ID(salt), &_Py_ID(n), &_Py_ID(r), &_Py_ID(p), &_Py_ID(maxmem), &_Py_ID(dklen), },
.ob_item = { &_Py_ID(password), &_Py_ID(salt), _Py_LATIN1_CHR('n'), _Py_LATIN1_CHR('r'), _Py_LATIN1_CHR('p'), &_Py_ID(maxmem), &_Py_ID(dklen), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1851,4 +1851,4 @@ exit:
#ifndef _HASHLIB_SCRYPT_METHODDEF
#define _HASHLIB_SCRYPT_METHODDEF
#endif /* !defined(_HASHLIB_SCRYPT_METHODDEF) */
/*[clinic end generated code: output=b339e255db698147 input=a9049054013a1b77]*/
/*[clinic end generated code: output=4734184f6555dc95 input=a9049054013a1b77]*/


@ -1263,7 +1263,7 @@ keywords(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObject *kw
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1319,7 +1319,7 @@ keywords_kwonly(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObj
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1375,7 +1375,7 @@ keywords_opt(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObject
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), &_Py_ID(c), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1444,7 +1444,7 @@ keywords_opt_kwonly(PyObject *module, PyObject *const *args, Py_ssize_t nargs, P
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1524,7 +1524,7 @@ keywords_kwonly_opt(PyObject *module, PyObject *const *args, Py_ssize_t nargs, P
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), &_Py_ID(c), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1592,7 +1592,7 @@ posonly_keywords(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyOb
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), },
.ob_item = { _Py_LATIN1_CHR('b'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1648,7 +1648,7 @@ posonly_kwonly(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObje
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), },
.ob_item = { _Py_LATIN1_CHR('b'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1705,7 +1705,7 @@ posonly_keywords_kwonly(PyObject *module, PyObject *const *args, Py_ssize_t narg
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1764,7 +1764,7 @@ posonly_keywords_opt(PyObject *module, PyObject *const *args, Py_ssize_t nargs,
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1835,7 +1835,7 @@ posonly_opt_keywords_opt(PyObject *module, PyObject *const *args, Py_ssize_t nar
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1911,7 +1911,7 @@ posonly_kwonly_opt(PyObject *module, PyObject *const *args, Py_ssize_t nargs, Py
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -1982,7 +1982,7 @@ posonly_opt_kwonly_opt(PyObject *module, PyObject *const *args, Py_ssize_t nargs
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2058,7 +2058,7 @@ posonly_keywords_kwonly_opt(PyObject *module, PyObject *const *args, Py_ssize_t
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), &_Py_ID(e), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), _Py_LATIN1_CHR('e'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2133,7 +2133,7 @@ posonly_keywords_opt_kwonly_opt(PyObject *module, PyObject *const *args, Py_ssiz
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), &_Py_ID(c), &_Py_ID(d), &_Py_ID(e), },
.ob_item = { _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), _Py_LATIN1_CHR('e'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2217,7 +2217,7 @@ posonly_opt_keywords_opt_kwonly_opt(PyObject *module, PyObject *const *args, Py_
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(c), &_Py_ID(d), },
.ob_item = { _Py_LATIN1_CHR('c'), _Py_LATIN1_CHR('d'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2296,7 +2296,7 @@ keyword_only_parameter(PyObject *module, PyObject *const *args, Py_ssize_t nargs
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), },
.ob_item = { _Py_LATIN1_CHR('a'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2351,7 +2351,7 @@ posonly_vararg(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObje
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), },
.ob_item = { _Py_LATIN1_CHR('b'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2446,7 +2446,7 @@ vararg(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObject *kwna
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), },
.ob_item = { _Py_LATIN1_CHR('a'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2504,7 +2504,7 @@ vararg_with_default(PyObject *module, PyObject *const *args, Py_ssize_t nargs, P
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2572,7 +2572,7 @@ vararg_with_default2(PyObject *module, PyObject *const *args, Py_ssize_t nargs,
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), &_Py_ID(c), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), _Py_LATIN1_CHR('c'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -2643,7 +2643,7 @@ vararg_with_only_defaults(PyObject *module, PyObject *const *args, Py_ssize_t na
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(b), },
.ob_item = { _Py_LATIN1_CHR('b'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -3169,4 +3169,4 @@ _testclinic_TestClass_meth_method_no_params(PyObject *self, PyTypeObject *cls, P
}
return _testclinic_TestClass_meth_method_no_params_impl(self, cls);
}
/*[clinic end generated code: output=d1fcf6ab8867f4ad input=a9049054013a1b77]*/
/*[clinic end generated code: output=74fdd265fd402226 input=a9049054013a1b77]*/


@ -88,7 +88,7 @@ _testmultiphase_StateAccessType_increment_count_clinic(StateAccessTypeObject *se
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(n), &_Py_ID(twice), },
.ob_item = { _Py_LATIN1_CHR('n'), &_Py_ID(twice), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -162,4 +162,4 @@ _testmultiphase_StateAccessType_get_count(StateAccessTypeObject *self, PyTypeObj
}
return _testmultiphase_StateAccessType_get_count_impl(self, cls);
}
/*[clinic end generated code: output=2193fe33d5e2b739 input=a9049054013a1b77]*/
/*[clinic end generated code: output=0543b54ec62be171 input=a9049054013a1b77]*/


@ -908,7 +908,7 @@ cmath_isclose(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObjec
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), &_Py_ID(rel_tol), &_Py_ID(abs_tol), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), &_Py_ID(rel_tol), &_Py_ID(abs_tol), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -982,4 +982,4 @@ skip_optional_kwonly:
exit:
return return_value;
}
/*[clinic end generated code: output=87f609786ef270cd input=a9049054013a1b77]*/
/*[clinic end generated code: output=a6c9ca48ffe871b6 input=a9049054013a1b77]*/


@ -42,7 +42,7 @@ batched_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(iterable), &_Py_ID(n), },
.ob_item = { &_Py_ID(iterable), _Py_LATIN1_CHR('n'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -494,7 +494,7 @@ itertools_combinations(PyTypeObject *type, PyObject *args, PyObject *kwargs)
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(iterable), &_Py_ID(r), },
.ob_item = { &_Py_ID(iterable), _Py_LATIN1_CHR('r'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -565,7 +565,7 @@ itertools_combinations_with_replacement(PyTypeObject *type, PyObject *args, PyOb
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(iterable), &_Py_ID(r), },
.ob_item = { &_Py_ID(iterable), _Py_LATIN1_CHR('r'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -635,7 +635,7 @@ itertools_permutations(PyTypeObject *type, PyObject *args, PyObject *kwargs)
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(iterable), &_Py_ID(r), },
.ob_item = { &_Py_ID(iterable), _Py_LATIN1_CHR('r'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -913,4 +913,4 @@ skip_optional_pos:
exit:
return return_value;
}
/*[clinic end generated code: output=111cbd102c2a23c9 input=a9049054013a1b77]*/
/*[clinic end generated code: output=55a83cfda62afb57 input=a9049054013a1b77]*/


@ -587,7 +587,7 @@ math_isclose(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObject
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(a), &_Py_ID(b), &_Py_ID(rel_tol), &_Py_ID(abs_tol), },
.ob_item = { _Py_LATIN1_CHR('a'), _Py_LATIN1_CHR('b'), &_Py_ID(rel_tol), &_Py_ID(abs_tol), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -950,4 +950,4 @@ math_ulp(PyObject *module, PyObject *arg)
exit:
return return_value;
}
/*[clinic end generated code: output=bd6c271030b9698b input=a9049054013a1b77]*/
/*[clinic end generated code: output=c1335a499389a04e input=a9049054013a1b77]*/


@ -2038,7 +2038,7 @@ os__path_isdir(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObje
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(s), },
.ob_item = { _Py_LATIN1_CHR('s'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -12002,4 +12002,4 @@ exit:
#ifndef OS_WAITSTATUS_TO_EXITCODE_METHODDEF
#define OS_WAITSTATUS_TO_EXITCODE_METHODDEF
#endif /* !defined(OS_WAITSTATUS_TO_EXITCODE_METHODDEF) */
/*[clinic end generated code: output=6f0c08f692891c72 input=a9049054013a1b77]*/
/*[clinic end generated code: output=67c2e3d4537287c1 input=a9049054013a1b77]*/


@ -132,6 +132,7 @@ all_name_chars(PyObject *o)
static int
intern_strings(PyObject *tuple)
{
PyInterpreterState *interp = _PyInterpreterState_GET();
Py_ssize_t i;
for (i = PyTuple_GET_SIZE(tuple); --i >= 0; ) {
@ -141,7 +142,7 @@ intern_strings(PyObject *tuple)
"non-string found in code slot");
return -1;
}
PyUnicode_InternInPlace(&_PyTuple_ITEMS(tuple)[i]);
_PyUnicode_InternImmortal(interp, &_PyTuple_ITEMS(tuple)[i]);
}
return 0;
}
@ -150,6 +151,7 @@ intern_strings(PyObject *tuple)
static int
intern_string_constants(PyObject *tuple, int *modified)
{
PyInterpreterState *interp = _PyInterpreterState_GET();
for (Py_ssize_t i = PyTuple_GET_SIZE(tuple); --i >= 0; ) {
PyObject *v = PyTuple_GET_ITEM(tuple, i);
if (PyUnicode_CheckExact(v)) {
@ -159,7 +161,7 @@ intern_string_constants(PyObject *tuple, int *modified)
if (all_name_chars(v)) {
PyObject *w = v;
PyUnicode_InternInPlace(&v);
_PyUnicode_InternMortal(interp, &v);
if (w != v) {
PyTuple_SET_ITEM(tuple, i, v);
if (modified) {


@ -3916,7 +3916,8 @@ PyDict_SetItemString(PyObject *v, const char *key, PyObject *item)
kv = PyUnicode_FromString(key);
if (kv == NULL)
return -1;
PyUnicode_InternInPlace(&kv); /* XXX Should we really? */
PyInterpreterState *interp = _PyInterpreterState_GET();
_PyUnicode_InternImmortal(interp, &kv); /* XXX Should we really? */
err = PyDict_SetItem(v, kv, item);
Py_DECREF(kv);
return err;


@ -1170,7 +1170,8 @@ PyObject_SetAttr(PyObject *v, PyObject *name, PyObject *value)
}
Py_INCREF(name);
PyUnicode_InternInPlace(&name);
PyInterpreterState *interp = _PyInterpreterState_GET();
_PyUnicode_InternMortal(interp, &name);
if (tp->tp_setattro != NULL) {
err = (*tp->tp_setattro)(v, name, value);
Py_DECREF(name);


@ -1080,8 +1080,10 @@ type_module(PyTypeObject *type, void *context)
if (s != NULL) {
mod = PyUnicode_FromStringAndSize(
type->tp_name, (Py_ssize_t)(s - type->tp_name));
if (mod != NULL)
PyUnicode_InternInPlace(&mod);
if (mod != NULL) {
PyInterpreterState *interp = _PyInterpreterState_GET();
_PyUnicode_InternMortal(interp, &mod);
}
}
else {
mod = Py_NewRef(&_Py_ID(builtins));
@ -4918,7 +4920,8 @@ type_setattro(PyTypeObject *type, PyObject *name, PyObject *value)
}
/* bpo-40521: Interned strings are shared by all subinterpreters */
if (!PyUnicode_CHECK_INTERNED(name)) {
PyUnicode_InternInPlace(&name);
PyInterpreterState *interp = _PyInterpreterState_GET();
_PyUnicode_InternMortal(interp, &name);
if (!PyUnicode_CHECK_INTERNED(name)) {
PyErr_SetString(PyExc_MemoryError,
"Out of memory interning an attribute name");


@ -179,10 +179,7 @@ extern "C" {
*_to++ = (to_type) *_iter++; \
} while (0)
#define LATIN1(ch) \
(ch < 128 \
? (PyObject*)&_Py_SINGLETON(strings).ascii[ch] \
: (PyObject*)&_Py_SINGLETON(strings).latin1[ch - 128])
#define LATIN1 _Py_LATIN1_CHR
#ifdef MS_WINDOWS
/* On Windows, overallocate by 50% is the best factor */
@ -225,18 +222,20 @@ static inline PyObject* unicode_new_empty(void)
return Py_NewRef(empty);
}
/* This dictionary holds all interned unicode strings. Note that references
to strings in this dictionary are *not* counted in the string's ob_refcnt.
When the interned string reaches a refcnt of 0 the string deallocation
function will delete the reference from this dictionary.
*/
/* This dictionary holds per-interpreter interned strings.
* See InternalDocs/string_interning.md for details.
*/
static inline PyObject *get_interned_dict(PyInterpreterState *interp)
{
return _Py_INTERP_CACHED_OBJECT(interp, interned_strings);
}
/* This hashtable holds statically allocated interned strings.
* See InternalDocs/string_interning.md for details.
*/
#define INTERNED_STRINGS _PyRuntime.cached_objects.interned_strings
/* Get number of all interned strings for the current interpreter. */
Py_ssize_t
_PyUnicode_InternedSize(void)
{
@ -244,6 +243,27 @@ _PyUnicode_InternedSize(void)
return _Py_hashtable_len(INTERNED_STRINGS) + PyDict_GET_SIZE(dict);
}
/* Get number of immortal interned strings for the current interpreter. */
Py_ssize_t
_PyUnicode_InternedSize_Immortal(void)
{
PyObject *dict = get_interned_dict(_PyInterpreterState_GET());
PyObject *key, *value;
Py_ssize_t pos = 0;
Py_ssize_t count = 0;
// It's tempting to keep a count and avoid a loop here. But, this function
// is intended for refleak tests. It spends extra work to report the true
// value, to help detect bugs in optimizations.
while (PyDict_Next(dict, &pos, &key, &value)) {
if (_Py_IsImmortal(key)) {
count++;
}
}
return _Py_hashtable_len(INTERNED_STRINGS) + count;
}
static Py_hash_t unicode_hash(PyObject *);
static int unicode_compare_eq(PyObject *, PyObject *);
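
The counting rule in `_PyUnicode_InternedSize_Immortal` above can be sketched outside CPython as a toy model. Everything named `toy_*` here is hypothetical and is not part of the patch: the array stands in for the per-interpreter dict, and `n_static` stands in for `_Py_hashtable_len(INTERNED_STRINGS)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one key of the per-interpreter interned dict. */
typedef struct {
    bool immortal;   /* models _Py_IsImmortal(key) */
} toy_entry;

/* Count every statically allocated string plus only those dict entries
 * that are immortal. The loop is deliberate: as the comment in the patch
 * says, the real function recounts instead of trusting a cached total. */
static size_t
toy_interned_size_immortal(size_t n_static,
                           const toy_entry *entries, size_t n_entries)
{
    size_t count = 0;
    for (size_t i = 0; i < n_entries; i++) {
        if (entries[i].immortal) {
            count++;
        }
    }
    return n_static + count;
}
```

With 5 static strings and dict entries that are immortal, mortal, immortal, the model reports 7. Mortal entries are excluded, which is presumably why the refleak machinery asks for this number rather than the full size.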
@ -286,20 +306,6 @@ has_shared_intern_dict(PyInterpreterState *interp)
static int
init_interned_dict(PyInterpreterState *interp)
{
if (_Py_IsMainInterpreter(interp)) {
assert(INTERNED_STRINGS == NULL);
_Py_hashtable_allocator_t hashtable_alloc = {PyMem_RawMalloc, PyMem_RawFree};
INTERNED_STRINGS = _Py_hashtable_new_full(
hashtable_unicode_hash,
hashtable_unicode_compare,
NULL,
NULL,
&hashtable_alloc
);
if (INTERNED_STRINGS == NULL) {
return -1;
}
}
assert(get_interned_dict(interp) == NULL);
PyObject *interned;
if (has_shared_intern_dict(interp)) {
@ -328,7 +334,55 @@ clear_interned_dict(PyInterpreterState *interp)
Py_DECREF(interned);
_Py_INTERP_CACHED_OBJECT(interp, interned_strings) = NULL;
}
if (_Py_IsMainInterpreter(interp) && INTERNED_STRINGS != NULL) {
}
static PyStatus
init_global_interned_strings(PyInterpreterState *interp)
{
assert(INTERNED_STRINGS == NULL);
_Py_hashtable_allocator_t hashtable_alloc = {PyMem_RawMalloc, PyMem_RawFree};
INTERNED_STRINGS = _Py_hashtable_new_full(
hashtable_unicode_hash,
hashtable_unicode_compare,
// Objects stored here are immortal and statically allocated,
// so we don't need key_destroy_func & value_destroy_func:
NULL,
NULL,
&hashtable_alloc
);
if (INTERNED_STRINGS == NULL) {
PyErr_Clear();
return _PyStatus_ERR("failed to create global interned dict");
}
/* Intern statically allocated string identifiers, deepfreeze strings,
* and one-byte latin-1 strings.
* This must be done before any module initialization so that statically
* allocated string identifiers are used instead of heap allocated strings.
* Deepfreeze uses the interned identifiers if present to save space
* else generates them and they are interned to speed up dict lookups.
*/
_PyUnicode_InitStaticStrings(interp);
for (int i = 0; i < 256; i++) {
PyObject *s = LATIN1(i);
_PyUnicode_InternStatic(interp, &s);
assert(s == LATIN1(i));
}
#ifdef Py_DEBUG
assert(_PyUnicode_CheckConsistency(&_Py_STR(empty), 1));
for (int i = 0; i < 256; i++) {
assert(_PyUnicode_CheckConsistency(LATIN1(i), 1));
}
#endif
return _PyStatus_OK();
}
static void clear_global_interned_strings(void)
{
if (INTERNED_STRINGS != NULL) {
_Py_hashtable_destroy(INTERNED_STRINGS);
INTERNED_STRINGS = NULL;
}
@ -661,6 +715,39 @@ _PyUnicode_CheckConsistency(PyObject *op, int check_content)
}
CHECK(PyUnicode_READ(kind, data, ascii->length) == 0);
}
/* Check interning state */
#ifdef Py_DEBUG
switch (PyUnicode_CHECK_INTERNED(op)) {
case SSTATE_NOT_INTERNED:
if (ascii->state.statically_allocated) {
CHECK(_Py_IsImmortal(op));
// This state is for two exceptions:
// - strings are currently checked before they're interned
// - the 256 one-latin1-character strings
// are static but use SSTATE_NOT_INTERNED
}
else {
CHECK(!_Py_IsImmortal(op));
}
break;
case SSTATE_INTERNED_MORTAL:
CHECK(!ascii->state.statically_allocated);
CHECK(!_Py_IsImmortal(op));
break;
case SSTATE_INTERNED_IMMORTAL:
CHECK(!ascii->state.statically_allocated);
CHECK(_Py_IsImmortal(op));
break;
case SSTATE_INTERNED_IMMORTAL_STATIC:
CHECK(ascii->state.statically_allocated);
CHECK(_Py_IsImmortal(op));
break;
default:
Py_UNREACHABLE();
}
#endif
return 1;
#undef CHECK
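
The interning checks added to `_PyUnicode_CheckConsistency` above amount to a small truth table over three facts: the interning state, whether the string is statically allocated, and whether it is immortal. A minimal standalone sketch of that table follows. The `CK_*` enumerators and `toy_interning_consistent` are hypothetical names chosen for illustration; they are not CPython's `SSTATE_*` values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the four SSTATE_* interning states. */
typedef enum {
    CK_NOT_INTERNED,
    CK_INTERNED_MORTAL,
    CK_INTERNED_IMMORTAL,
    CK_INTERNED_IMMORTAL_STATIC
} toy_check_state;

/* Returns true when the combination satisfies the invariants that the
 * Py_DEBUG block in the hunk above asserts with CHECK(). */
static bool
toy_interning_consistent(toy_check_state state,
                         bool statically_allocated, bool immortal)
{
    switch (state) {
    case CK_NOT_INTERNED:
        /* Static strings are immortal even before they are interned;
         * heap strings that are not interned must be mortal. */
        return statically_allocated ? immortal : !immortal;
    case CK_INTERNED_MORTAL:
        return !statically_allocated && !immortal;
    case CK_INTERNED_IMMORTAL:
        return !statically_allocated && immortal;
    case CK_INTERNED_IMMORTAL_STATIC:
        return statically_allocated && immortal;
    }
    return false;  /* unknown state */
}
```

Read this way, `_Py_SetImmortal` on a plain heap string would produce the combination (not interned, not static, immortal), which the table rejects. That matches the commit message's advice to go through `_PyUnicode_InternImmortal` instead.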
@ -1619,16 +1706,65 @@ unicode_dealloc(PyObject *unicode)
_Py_FatalRefcountError("deallocating an Unicode singleton");
}
#endif
/* This should never get called, but we also don't want to SEGV if
* we accidentally decref an immortal string out of existence. Since
* the string is an immortal object, just re-set the reference count.
*/
if (PyUnicode_CHECK_INTERNED(unicode)
|| _PyUnicode_STATE(unicode).statically_allocated)
{
if (_PyUnicode_STATE(unicode).statically_allocated) {
/* This should never get called, but we also don't want to SEGV if
* we accidentally decref an immortal string out of existence. Since
* the string is an immortal object, just re-set the reference count.
*/
#ifdef Py_DEBUG
Py_UNREACHABLE();
#endif
_Py_SetImmortal(unicode);
return;
}
switch (_PyUnicode_STATE(unicode).interned) {
case SSTATE_NOT_INTERNED:
break;
case SSTATE_INTERNED_MORTAL:
/* Remove the object from the intern dict.
* Before doing so, we set the refcount to 3: the key and value
* in the interned_dict, plus one to work with.
*/
assert(Py_REFCNT(unicode) == 0);
Py_SET_REFCNT(unicode, 3);
#ifdef Py_REF_DEBUG
/* let's be pedantic with the ref total */
_Py_IncRefTotal(_PyInterpreterState_GET());
_Py_IncRefTotal(_PyInterpreterState_GET());
_Py_IncRefTotal(_PyInterpreterState_GET());
#endif
PyInterpreterState *interp = _PyInterpreterState_GET();
PyObject *interned = get_interned_dict(interp);
assert(interned != NULL);
int r = PyDict_DelItem(interned, unicode);
if (r == -1) {
PyErr_WriteUnraisable(unicode);
// We don't know what happened to the string. It's probably
// best to leak it:
// - if it was not found, something is very wrong
// - if it was deleted, there are no more references to it
// so it can't cause trouble (except wasted memory)
// - if it wasn't deleted, it'll remain interned
_Py_SetImmortal(unicode);
_PyUnicode_STATE(unicode).interned = SSTATE_INTERNED_IMMORTAL;
return;
}
// Only our work reference should be left; remove it too.
assert(Py_REFCNT(unicode) == 1);
Py_SET_REFCNT(unicode, 0);
#ifdef Py_REF_DEBUG
/* let's be pedantic with the ref total */
_Py_DecRefTotal(_PyInterpreterState_GET());
#endif
break;
default:
// As with `statically_allocated` above.
#ifdef Py_REF_DEBUG
Py_UNREACHABLE();
#endif
_Py_SetImmortal(unicode);
return;
}
if (_PyUnicode_HAS_UTF8_MEMORY(unicode)) {
PyObject_Free(_PyUnicode_UTF8(unicode));
}
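
The reference-count bookkeeping for mortal interned strings in `unicode_dealloc` above is easier to follow with a toy model. The sketch below is not CPython code: `toy_str`, `toy_intern_mortal` and `toy_dealloc_mortal` are hypothetical names, and the "dict" is reduced to a flag. It only illustrates why the count is set to 3 before the dict entry is deleted and why exactly 1 is expected afterwards.

```c
#include <assert.h>

/* Toy object: `refcnt` is the visible count; `in_dict` records whether the
 * interned dict still holds its (uncounted) key and value references. */
typedef struct {
    long refcnt;
    int in_dict;
} toy_str;

/* Models the NOT_INTERNED -> INTERNED_MORTAL step: the dict takes two
 * references (key and value), which are then hidden from the count. */
static void
toy_intern_mortal(toy_str *s)
{
    s->refcnt += 2;      /* dict stores s as both key and value */
    s->in_dict = 1;
    s->refcnt -= 2;      /* ...but those two references are not counted */
}

/* Models the SSTATE_INTERNED_MORTAL branch of unicode_dealloc. */
static void
toy_dealloc_mortal(toy_str *s)
{
    assert(s->refcnt == 0);
    s->refcnt = 3;       /* key + value + one working reference */
    s->refcnt -= 2;      /* deleting the dict entry drops key and value */
    s->in_dict = 0;
    assert(s->refcnt == 1);
    s->refcnt = 0;       /* drop the working reference; memory may be freed */
}
```

The working reference matters: without it, removing the entry from the dict would bring the count back to zero in the middle of deallocation and re-enter the deallocator.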
@ -1970,7 +2106,7 @@ _PyUnicode_FromId(_Py_Identifier *id)
if (!obj) {
return NULL;
}
PyUnicode_InternInPlace(&obj);
_PyUnicode_InternImmortal(interp, &obj);
if (index >= ids->size) {
// Overallocate to reduce the number of realloc
@ -10755,8 +10891,10 @@ _PyUnicode_EqualToASCIIId(PyObject *left, _Py_Identifier *right)
if (left == right_uni)
return 1;
if (PyUnicode_CHECK_INTERNED(left))
assert(PyUnicode_CHECK_INTERNED(right_uni));
if (PyUnicode_CHECK_INTERNED(left)) {
return 0;
}
assert(_PyUnicode_HASH(right_uni) != -1);
Py_hash_t hash = _PyUnicode_HASH(left);
@ -14731,30 +14869,28 @@ _PyUnicode_InitState(PyInterpreterState *interp)
PyStatus
_PyUnicode_InitGlobalObjects(PyInterpreterState *interp)
{
// Initialize the global interned dict
if (_Py_IsMainInterpreter(interp)) {
PyStatus status = init_global_interned_strings(interp);
if (_PyStatus_EXCEPTION(status)) {
return status;
}
}
assert(INTERNED_STRINGS);
return _PyStatus_OK();
}
PyStatus
_PyUnicode_InitInternDict(PyInterpreterState *interp)
{
assert(INTERNED_STRINGS);
if (init_interned_dict(interp)) {
PyErr_Clear();
return _PyStatus_ERR("failed to create interned dict");
}
if (_Py_IsMainInterpreter(interp)) {
/* Intern statically allocated string identifiers and deepfreeze strings.
* This must be done before any module initialization so that statically
* allocated string identifiers are used instead of heap allocated strings.
* Deepfreeze uses the interned identifiers if present to save space
* else generates them and they are interned to speed up dict lookups.
*/
_PyUnicode_InitStaticStrings(interp);
#ifdef Py_DEBUG
assert(_PyUnicode_CheckConsistency(&_Py_STR(empty), 1));
for (int i = 0; i < 256; i++) {
assert(_PyUnicode_CheckConsistency(LATIN1(i), 1));
}
#endif
}
return _PyStatus_OK();
}
@ -14777,104 +14913,243 @@ error:
return _PyStatus_ERR("Can't initialize unicode types");
}
static /* non-null */ PyObject*
intern_static(PyInterpreterState *interp, PyObject *s /* stolen */)
{
// Note that this steals a reference to `s`, but in many cases that
// stolen ref is returned, requiring no decref/incref.
assert(s != NULL);
assert(_PyUnicode_CHECK(s));
assert(_PyUnicode_STATE(s).statically_allocated);
assert(!PyUnicode_CHECK_INTERNED(s));
#ifdef Py_DEBUG
/* We must not add process-global interned string if there's already a
* per-interpreter interned_dict, which might contain duplicates.
*/
PyObject *interned = get_interned_dict(interp);
assert(interned == NULL);
#endif
/* Look in the global cache first. */
PyObject *r = (PyObject *)_Py_hashtable_get(INTERNED_STRINGS, s);
/* We should only init each string once */
assert(r == NULL);
/* but just in case (for the non-debug build), handle this */
if (r != NULL && r != s) {
assert(_PyUnicode_STATE(r).interned == SSTATE_INTERNED_IMMORTAL_STATIC);
assert(_PyUnicode_CHECK(r));
Py_DECREF(s);
return Py_NewRef(r);
}
if (_Py_hashtable_set(INTERNED_STRINGS, s, s) < -1) {
Py_FatalError("failed to intern static string");
}
_PyUnicode_STATE(s).interned = SSTATE_INTERNED_IMMORTAL_STATIC;
return s;
}
void
_PyUnicode_InternInPlace(PyInterpreterState *interp, PyObject **p)
_PyUnicode_InternStatic(PyInterpreterState *interp, PyObject **p)
{
PyObject *s = *p;
// This should only be called as part of runtime initialization
assert(!Py_IsInitialized());
*p = intern_static(interp, *p);
assert(*p);
}
static void
immortalize_interned(PyObject *s)
{
assert(PyUnicode_CHECK_INTERNED(s) == SSTATE_INTERNED_MORTAL);
assert(!_Py_IsImmortal(s));
#ifdef Py_REF_DEBUG
/* The reference count value should be excluded from the RefTotal.
The decrements to these objects will not be registered so they
need to be accounted for in here. */
for (Py_ssize_t i = 0; i < Py_REFCNT(s); i++) {
_Py_DecRefTotal(_PyInterpreterState_GET());
}
#endif
_PyUnicode_STATE(s).interned = SSTATE_INTERNED_IMMORTAL;
_Py_SetImmortal(s);
}
static /* non-null */ PyObject*
intern_common(PyInterpreterState *interp, PyObject *s /* stolen */,
bool immortalize)
{
// Note that this steals a reference to `s`, but in many cases that
// stolen ref is returned, requiring no decref/incref.
#ifdef Py_DEBUG
assert(s != NULL);
assert(_PyUnicode_CHECK(s));
#else
if (s == NULL || !PyUnicode_Check(s)) {
return;
return s;
}
#endif
/* If it's a subclass, we don't really know what putting
it in the interned dict might do. */
if (!PyUnicode_CheckExact(s)) {
return;
return s;
}
if (PyUnicode_CHECK_INTERNED(s)) {
return;
/* Is it already interned? */
switch (PyUnicode_CHECK_INTERNED(s)) {
case SSTATE_NOT_INTERNED:
// no, go on
break;
case SSTATE_INTERNED_MORTAL:
// yes but we might need to make it immortal
if (immortalize) {
immortalize_interned(s);
}
return s;
default:
// all done
return s;
}
/* Look in the global cache first. */
PyObject *r = (PyObject *)_Py_hashtable_get(INTERNED_STRINGS, s);
if (r != NULL && r != s) {
Py_SETREF(*p, Py_NewRef(r));
return;
}
/* Handle statically allocated strings. */
if (_PyUnicode_STATE(s).statically_allocated) {
assert(_Py_IsImmortal(s));
if (_Py_hashtable_set(INTERNED_STRINGS, s, s) == 0) {
_PyUnicode_STATE(*p).interned = SSTATE_INTERNED_IMMORTAL_STATIC;
}
return;
return intern_static(interp, s);
}
/* Look in the per-interpreter cache. */
/* If it's already immortal, intern it as such */
if (_Py_IsImmortal(s)) {
immortalize = 1;
}
/* if it's a short string, get the singleton */
if (PyUnicode_GET_LENGTH(s) == 1 &&
PyUnicode_KIND(s) == PyUnicode_1BYTE_KIND) {
PyObject *r = LATIN1(*(unsigned char*)PyUnicode_DATA(s));
assert(PyUnicode_CHECK_INTERNED(r));
Py_DECREF(s);
return r;
}
#ifdef Py_DEBUG
assert(!unicode_is_singleton(s));
#endif
/* Look in the global cache now. */
{
PyObject *r = (PyObject *)_Py_hashtable_get(INTERNED_STRINGS, s);
if (r != NULL) {
assert(_Py_IsImmortal(r));
assert(r != s); // r must be statically_allocated; s is not
Py_DECREF(s);
return Py_NewRef(r);
}
}
/* Do a setdefault on the per-interpreter cache. */
PyObject *interned = get_interned_dict(interp);
assert(interned != NULL);
PyObject *t = PyDict_SetDefault(interned, s, s);
PyObject *t = PyDict_SetDefault(interned, s, s); // t is borrowed
if (t == NULL) {
PyErr_Clear();
return;
return s;
}
if (t != s) {
Py_SETREF(*p, Py_NewRef(t));
return;
// value was already present (not inserted)
Py_INCREF(t);
Py_DECREF(s);
if (immortalize &&
PyUnicode_CHECK_INTERNED(t) == SSTATE_INTERNED_MORTAL) {
immortalize_interned(t);
}
return t;
}
else {
// value was newly inserted
}
if (_Py_IsImmortal(s)) {
// XXX Restrict this to the main interpreter?
_PyUnicode_STATE(*p).interned = SSTATE_INTERNED_IMMORTAL_STATIC;
return;
}
/* NOT_INTERNED -> INTERNED_MORTAL */
assert(_PyUnicode_STATE(s).interned == SSTATE_NOT_INTERNED);
if (!_Py_IsImmortal(s)) {
/* The two references in interned dict (key and value) are not counted.
unicode_dealloc() and _PyUnicode_ClearInterned() take care of this. */
Py_SET_REFCNT(s, Py_REFCNT(s) - 2);
#ifdef Py_REF_DEBUG
/* The reference count value excluding the 2 references from the
interned dictionary should be excluded from the RefTotal. The
decrements to these objects will not be registered so they
need to be accounted for in here. */
for (Py_ssize_t i = 0; i < Py_REFCNT(s) - 2; i++) {
/* let's be pedantic with the ref total */
_Py_DecRefTotal(_PyInterpreterState_GET());
_Py_DecRefTotal(_PyInterpreterState_GET());
#endif
}
_PyUnicode_STATE(s).interned = SSTATE_INTERNED_MORTAL;
/* INTERNED_MORTAL -> INTERNED_IMMORTAL (if needed) */
#ifdef Py_DEBUG
if (_Py_IsImmortal(s)) {
assert(immortalize);
}
#endif
_Py_SetImmortal(s);
_PyUnicode_STATE(*p).interned = SSTATE_INTERNED_IMMORTAL;
if (immortalize) {
immortalize_interned(s);
}
return s;
}
void
_PyUnicode_InternImmortal(PyInterpreterState *interp, PyObject **p)
{
*p = intern_common(interp, *p, 1);
assert(*p);
}
void
_PyUnicode_InternMortal(PyInterpreterState *interp, PyObject **p)
{
*p = intern_common(interp, *p, 0);
assert(*p);
}
void
_PyUnicode_InternInPlace(PyInterpreterState *interp, PyObject **p)
{
_PyUnicode_InternImmortal(interp, p);
return;
}
void
PyUnicode_InternInPlace(PyObject **p)
{
PyInterpreterState *interp = _PyInterpreterState_GET();
_PyUnicode_InternInPlace(interp, p);
_PyUnicode_InternMortal(interp, p);
}
// Function kept for the stable ABI.
// Public-looking name kept for the stable ABI; user should not call this:
PyAPI_FUNC(void) PyUnicode_InternImmortal(PyObject **);
void
PyUnicode_InternImmortal(PyObject **p)
{
PyUnicode_InternInPlace(p);
// Leak a reference on purpose
Py_INCREF(*p);
PyInterpreterState *interp = _PyInterpreterState_GET();
_PyUnicode_InternImmortal(interp, p);
}
PyObject *
PyUnicode_InternFromString(const char *cp)
{
PyObject *s = PyUnicode_FromString(cp);
if (s == NULL)
if (s == NULL) {
return NULL;
PyUnicode_InternInPlace(&s);
}
PyInterpreterState *interp = _PyInterpreterState_GET();
_PyUnicode_InternMortal(interp, &s);
return s;
}
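
The state decisions made by `intern_common` and its two wrappers above can be summarized as a transition function. The following is a simplified sketch under stated assumptions: it ignores `str` subclasses, the one-character latin-1 singleton shortcut, and the case where an equal string is already present in either cache. The `TR_*` enumerators and `toy_next_state` are hypothetical names, not CPython identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the four SSTATE_* interning states. */
typedef enum {
    TR_NOT_INTERNED,
    TR_INTERNED_MORTAL,
    TR_INTERNED_IMMORTAL,
    TR_INTERNED_IMMORTAL_STATIC
} toy_trans_state;

/* State of a string after an intern request. `immortalize` is true for
 * the _PyUnicode_InternImmortal path, false for _PyUnicode_InternMortal. */
static toy_trans_state
toy_next_state(toy_trans_state cur, bool statically_allocated,
               bool already_immortal, bool immortalize)
{
    switch (cur) {
    case TR_INTERNED_MORTAL:
        /* Already interned; upgrade only if immortality was requested. */
        return immortalize ? TR_INTERNED_IMMORTAL : TR_INTERNED_MORTAL;
    case TR_INTERNED_IMMORTAL:
    case TR_INTERNED_IMMORTAL_STATIC:
        return cur;                      /* nothing left to do */
    case TR_NOT_INTERNED:
        break;
    }
    if (statically_allocated) {
        return TR_INTERNED_IMMORTAL_STATIC;   /* the intern_static path */
    }
    if (already_immortal) {
        immortalize = true;              /* an immortal string stays immortal */
    }
    return immortalize ? TR_INTERNED_IMMORTAL : TR_INTERNED_MORTAL;
}
```

In this model the only downgrade-free paths are one-way: a mortal interned string can become immortal, but nothing moves back toward `TR_NOT_INTERNED` except deallocation or interpreter shutdown, which the function does not cover.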
@ -14895,20 +15170,6 @@ _PyUnicode_ClearInterned(PyInterpreterState *interp)
return;
}
/* TODO:
* Currently, the runtime is not able to guarantee that it can exit without
* allocations that carry over to a future initialization of Python within
* the same process. i.e:
* ./python -X showrefcount -c 'import itertools'
* [237 refs, 237 blocks]
*
* Therefore, this should remain disabled for until there is a strict guarantee
* that no memory will be left after `Py_Finalize`.
*/
#ifdef Py_DEBUG
/* For all non-singleton interned strings, restore the two valid references
to that instance from within the intern string dictionary and let the
normal reference counting process clean up these instances. */
#ifdef INTERNED_STATS
fprintf(stderr, "releasing %zd interned strings\n",
PyDict_GET_SIZE(interned));
@ -14922,13 +15183,32 @@ _PyUnicode_ClearInterned(PyInterpreterState *interp)
int shared = 0;
switch (PyUnicode_CHECK_INTERNED(s)) {
case SSTATE_INTERNED_IMMORTAL:
/* Make immortal interned strings mortal again.
*
* Currently, the runtime is not able to guarantee that it can exit
* without allocations that carry over to a future initialization
* of Python within the same process. i.e:
* ./python -X showrefcount -c 'import itertools'
* [237 refs, 237 blocks]
*
* This should remain disabled (`Py_DEBUG` only) until there is a
* strict guarantee that no memory will be left after
* `Py_Finalize`.
*/
#ifdef Py_DEBUG
// Skip the Immortal Instance check and restore
// the two references (key and value) ignored
// by PyUnicode_InternInPlace().
s->ob_refcnt = 2;
#ifdef Py_REF_DEBUG
/* let's be pedantic with the ref total */
_Py_IncRefTotal(_PyInterpreterState_GET());
_Py_IncRefTotal(_PyInterpreterState_GET());
#endif
#ifdef INTERNED_STATS
total_length += PyUnicode_GET_LENGTH(s);
#endif
#endif // Py_DEBUG
break;
case SSTATE_INTERNED_IMMORTAL_STATIC:
/* It is shared between interpreters, so we should unmark it
@ -14941,7 +15221,15 @@ _PyUnicode_ClearInterned(PyInterpreterState *interp)
}
break;
case SSTATE_INTERNED_MORTAL:
/* fall through */
// Restore 2 references held by the interned dict; these will
// be decref'd by clear_interned_dict's PyDict_Clear.
Py_SET_REFCNT(s, Py_REFCNT(s) + 2);
#ifdef Py_REF_DEBUG
/* let's be pedantic with the ref total */
_Py_IncRefTotal(_PyInterpreterState_GET());
_Py_IncRefTotal(_PyInterpreterState_GET());
#endif
break;
case SSTATE_NOT_INTERNED:
/* fall through */
default:
@ -14962,8 +15250,10 @@ _PyUnicode_ClearInterned(PyInterpreterState *interp)
for (Py_ssize_t i=0; i < ids->size; i++) {
Py_XINCREF(ids->array[i]);
}
#endif /* Py_DEBUG */
clear_interned_dict(interp);
if (_Py_IsMainInterpreter(interp)) {
clear_global_interned_strings();
}
}


@ -36,7 +36,7 @@ _testconsole_write_input(PyObject *module, PyObject *const *args, Py_ssize_t nar
PyObject *ob_item[NUM_KEYWORDS];
} _kwtuple = {
.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
.ob_item = { &_Py_ID(file), &_Py_ID(s), },
.ob_item = { &_Py_ID(file), _Py_LATIN1_CHR('s'), },
};
#undef NUM_KEYWORDS
#define KWTUPLE (&_kwtuple.ob_base.ob_base)
@ -140,4 +140,4 @@ exit:
#ifndef _TESTCONSOLE_READ_OUTPUT_METHODDEF
#define _TESTCONSOLE_READ_OUTPUT_METHODDEF
#endif /* !defined(_TESTCONSOLE_READ_OUTPUT_METHODDEF) */
/*[clinic end generated code: output=208c72e2c873555b input=a9049054013a1b77]*/
/*[clinic end generated code: output=2dcfa57c20b6e058 input=a9049054013a1b77]*/


@ -4,6 +4,7 @@
#include "pycore_runtime.h" // _PyRuntime
#include "string_parser.h"
#include "tokenizer.h"
#include "pycore_pystate.h" // _PyInterpreterState_GET()
void *_PyPegen_dummy_name(Parser *p, ...) {
return &_PyRuntime.parser.dummy_name;
@ -148,7 +149,8 @@ expr_ty _PyPegen_join_names_with_dot(Parser *p, expr_ty first_name,
if (!uni) {
return NULL;
}
PyUnicode_InternInPlace(&uni);
PyInterpreterState *interp = _PyInterpreterState_GET();
_PyUnicode_InternMortal(interp, &uni);
if (_PyArena_AddPyObject(p->arena, uni) < 0) {
Py_DECREF(uni);
return NULL;


@ -581,7 +581,8 @@ _PyPegen_new_identifier(Parser *p, const char *n)
}
id = id2;
}
PyUnicode_InternInPlace(&id);
PyInterpreterState *interp = _PyInterpreterState_GET();
_PyUnicode_InternImmortal(interp, &id);
if (_PyArena_AddPyObject(p->arena, id) < 0)
{
Py_DECREF(id);


@ -26,7 +26,7 @@ unsigned char M_test_frozenmain[] = {
103,101,116,95,99,111,110,102,105,103,115,114,3,0,0,0,
218,3,107,101,121,169,0,243,0,0,0,0,250,18,116,101,
115,116,95,102,114,111,122,101,110,109,97,105,110,46,112,121,
250,8,60,109,111,100,117,108,101,62,114,18,0,0,0,1,
218,8,60,109,111,100,117,108,101,62,114,18,0,0,0,1,
0,0,0,115,97,0,0,0,240,3,1,1,1,243,8,0,
1,11,219,0,24,225,0,5,208,6,26,212,0,27,217,0,
5,128,106,144,35,151,40,145,40,212,0,27,216,9,38,208,


@ -275,10 +275,9 @@ parse_literal(PyObject *fmt, Py_ssize_t *ppos, PyArena *arena)
PyObject *str = PyUnicode_Substring(fmt, start, pos);
/* str = str.replace('%%', '%') */
if (str && has_percents) {
_Py_DECLARE_STR(percent, "%");
_Py_DECLARE_STR(dbl_percent, "%%");
Py_SETREF(str, PyUnicode_Replace(str, &_Py_STR(dbl_percent),
&_Py_STR(percent), -1));
_Py_LATIN1_CHR('%'), -1));
}
if (!str) {
return NULL;


@ -10,9 +10,7 @@
* See ast.unparse for a full unparser (written in Python)
*/
_Py_DECLARE_STR(open_br, "{");
_Py_DECLARE_STR(dbl_open_br, "{{");
_Py_DECLARE_STR(close_br, "}");
_Py_DECLARE_STR(dbl_close_br, "}}");
/* We would statically initialize this if doing so were simple enough. */
@ -580,11 +578,13 @@ escape_braces(PyObject *orig)
{
PyObject *temp;
PyObject *result;
temp = PyUnicode_Replace(orig, &_Py_STR(open_br), &_Py_STR(dbl_open_br), -1);
temp = PyUnicode_Replace(orig, _Py_LATIN1_CHR('{'),
&_Py_STR(dbl_open_br), -1);
if (!temp) {
return NULL;
}
result = PyUnicode_Replace(temp, &_Py_STR(close_br), &_Py_STR(dbl_close_br), -1);
result = PyUnicode_Replace(temp, _Py_LATIN1_CHR('}'),
&_Py_STR(dbl_close_br), -1);
Py_DECREF(temp);
return result;
}
@ -678,7 +678,7 @@ append_formattedvalue(_PyUnicodeWriter *writer, expr_ty e)
if (!temp_fv_str) {
return -1;
}
if (PyUnicode_Find(temp_fv_str, &_Py_STR(open_br), 0, 1, 1) == 0) {
if (PyUnicode_Find(temp_fv_str, _Py_LATIN1_CHR('{'), 0, 1, 1) == 0) {
/* Expression starts with a brace, split it with a space from the outer
one. */
outer_brace = "{ ";


@@ -913,24 +913,64 @@ exit:
}
PyDoc_STRVAR(sys_getunicodeinternedsize__doc__,
-"getunicodeinternedsize($module, /)\n"
+"getunicodeinternedsize($module, /, *, _only_immortal=False)\n"
"--\n"
"\n"
"Return the number of elements of the unicode interned dictionary");
#define SYS_GETUNICODEINTERNEDSIZE_METHODDEF \
-{"getunicodeinternedsize", (PyCFunction)sys_getunicodeinternedsize, METH_NOARGS, sys_getunicodeinternedsize__doc__},
+{"getunicodeinternedsize", _PyCFunction_CAST(sys_getunicodeinternedsize), METH_FASTCALL|METH_KEYWORDS, sys_getunicodeinternedsize__doc__},
static Py_ssize_t
-sys_getunicodeinternedsize_impl(PyObject *module);
+sys_getunicodeinternedsize_impl(PyObject *module, int _only_immortal);
static PyObject *
-sys_getunicodeinternedsize(PyObject *module, PyObject *Py_UNUSED(ignored))
+sys_getunicodeinternedsize(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)
{
PyObject *return_value = NULL;
+#if defined(Py_BUILD_CORE) && !defined(Py_BUILD_CORE_MODULE)
+#define NUM_KEYWORDS 1
+static struct {
+PyGC_Head _this_is_not_used;
+PyObject_VAR_HEAD
+PyObject *ob_item[NUM_KEYWORDS];
+} _kwtuple = {
+.ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
+.ob_item = { &_Py_ID(_only_immortal), },
+};
+#undef NUM_KEYWORDS
+#define KWTUPLE (&_kwtuple.ob_base.ob_base)
+#else // !Py_BUILD_CORE
+# define KWTUPLE NULL
+#endif // !Py_BUILD_CORE
+static const char * const _keywords[] = {"_only_immortal", NULL};
+static _PyArg_Parser _parser = {
+.keywords = _keywords,
+.fname = "getunicodeinternedsize",
+.kwtuple = KWTUPLE,
+};
+#undef KWTUPLE
+PyObject *argsbuf[1];
+Py_ssize_t noptargs = nargs + (kwnames ? PyTuple_GET_SIZE(kwnames) : 0) - 0;
+int _only_immortal = 0;
Py_ssize_t _return_value;
-_return_value = sys_getunicodeinternedsize_impl(module);
+args = _PyArg_UnpackKeywords(args, nargs, NULL, kwnames, &_parser, 0, 0, 0, argsbuf);
+if (!args) {
+goto exit;
+}
+if (!noptargs) {
+goto skip_optional_kwonly;
+}
+_only_immortal = PyObject_IsTrue(args[0]);
+if (_only_immortal < 0) {
+goto exit;
+}
+skip_optional_kwonly:
+_return_value = sys_getunicodeinternedsize_impl(module, _only_immortal);
if ((_return_value == -1) && PyErr_Occurred()) {
goto exit;
}
@@ -1415,4 +1455,4 @@ exit:
#ifndef SYS_GETANDROIDAPILEVEL_METHODDEF
#define SYS_GETANDROIDAPILEVEL_METHODDEF
#endif /* !defined(SYS_GETANDROIDAPILEVEL_METHODDEF) */
-/*[clinic end generated code: output=6d598acc26237fbe input=a9049054013a1b77]*/
+/*[clinic end generated code: output=2a1fbdf7de450c63 input=a9049054013a1b77]*/

@@ -144,7 +144,9 @@ PyObject *_PyCodec_Lookup(const char *encoding)
if (v == NULL) {
return NULL;
}
-PyUnicode_InternInPlace(&v);
+/* Intern the string. We'll make it immortal later if lookup succeeds. */
+_PyUnicode_InternMortal(interp, &v);
/* First, try to look up the name in the registry dictionary */
PyObject *result = PyDict_GetItemWithError(interp->codec_search_cache, v);
@@ -197,6 +199,8 @@ PyObject *_PyCodec_Lookup(const char *encoding)
goto onError;
}
+_PyUnicode_InternImmortal(interp, &v);
/* Cache and return the result */
if (PyDict_SetItem(interp->codec_search_cache, v, result) < 0) {
Py_DECREF(result);

@@ -773,8 +773,7 @@ compiler_set_qualname(struct compiler *c)
}
if (base != NULL) {
-_Py_DECLARE_STR(dot, ".");
-name = PyUnicode_Concat(base, &_Py_STR(dot));
+name = PyUnicode_Concat(base, _Py_LATIN1_CHR('.'));
Py_DECREF(base);
if (name == NULL) {
return ERROR;

@@ -1964,7 +1964,8 @@ new_kwtuple(const char * const *keywords, int total, int pos)
Py_DECREF(kwtuple);
return NULL;
}
-PyUnicode_InternInPlace(&str);
+PyInterpreterState *interp = _PyInterpreterState_GET();
+_PyUnicode_InternImmortal(interp, &str);
PyTuple_SET_ITEM(kwtuple, i, str);
}
return kwtuple;

@@ -14,6 +14,7 @@
#include "pycore_long.h" // _PyLong_DigitCount
#include "pycore_hashtable.h" // _Py_hashtable_t
#include "marshal.h" // Py_MARSHAL_VERSION
+#include "pycore_pystate.h" // _PyInterpreterState_GET()
/*[clinic input]
module marshal
@@ -1158,8 +1159,12 @@ r_object(RFILE *p)
v = PyUnicode_FromKindAndData(PyUnicode_1BYTE_KIND, ptr, n);
if (v == NULL)
break;
-if (is_interned)
-PyUnicode_InternInPlace(&v);
+if (is_interned) {
+// marshal is meant to serialize .pyc files with code
+// objects, and code-related strings are currently immortal.
+PyInterpreterState *interp = _PyInterpreterState_GET();
+_PyUnicode_InternImmortal(interp, &v);
+}
retval = v;
R_REF(retval);
break;
@@ -1191,8 +1196,12 @@ r_object(RFILE *p)
}
if (v == NULL)
break;
-if (is_interned)
-PyUnicode_InternInPlace(&v);
+if (is_interned) {
+// marshal is meant to serialize .pyc files with code
+// objects, and code-related strings are currently immortal.
+PyInterpreterState *interp = _PyInterpreterState_GET();
+_PyUnicode_InternImmortal(interp, &v);
+}
retval = v;
R_REF(retval);
break;
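
The effect of the `is_interned` branch above can be observed from Python. The following is a small sketch, assuming CPython and the default marshal version: an interned `str` is written with an "interned" type code, so the reader re-interns it on load and should hand back the already interned object. The string contents and variable names are made up for illustration.

```python
import marshal
import sys

# Build the string at run time so the compiler does not intern it for us.
name = sys.intern("".join(["roundtrip", "_", "name"]))

# Serializing an interned str records that it was interned; loading it
# goes through the is_interned branch of r_object().
loaded = marshal.loads(marshal.dumps(name))

print(loaded == name)
# Identity is a CPython implementation detail, not a documented guarantee.
print(loaded is name)
```

On builds that include this change the loaded string is also immortalized, following the comment in the hunk that code-related strings are currently immortal.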

@@ -839,6 +839,13 @@ pycore_interp_init(PyThreadState *tstate)
return _PyStatus_ERR("failed to initialize deep-frozen modules");
}
+// Per-interpreter interned string dict is created after deep-frozen
+// modules have interned the global strings.
+status = _PyUnicode_InitInternDict(interp);
+if (_PyStatus_EXCEPTION(status)) {
+return status;
+}
status = pycore_init_types(interp);
if (_PyStatus_EXCEPTION(status)) {
goto done;

@@ -722,7 +722,7 @@ sys_displayhook(PyObject *module, PyObject *o)
if (o == Py_None) {
Py_RETURN_NONE;
}
-if (PyObject_SetAttr(builtins, &_Py_ID(_), Py_None) != 0)
+if (PyObject_SetAttr(builtins, _Py_LATIN1_CHR('_'), Py_None) != 0)
return NULL;
outf = _PySys_GetAttr(tstate, &_Py_ID(stdout));
if (outf == NULL || outf == Py_None) {
@@ -744,10 +744,9 @@ sys_displayhook(PyObject *module, PyObject *o)
return NULL;
}
}
-_Py_DECLARE_STR(newline, "\n");
-if (PyFile_WriteObject(&_Py_STR(newline), outf, Py_PRINT_RAW) != 0)
+if (PyFile_WriteObject(_Py_LATIN1_CHR('\n'), outf, Py_PRINT_RAW) != 0)
return NULL;
-if (PyObject_SetAttr(builtins, &_Py_ID(_), o) != 0)
+if (PyObject_SetAttr(builtins, _Py_LATIN1_CHR('_'), o) != 0)
return NULL;
Py_RETURN_NONE;
}
@@ -927,8 +926,9 @@ sys_intern_impl(PyObject *module, PyObject *s)
/*[clinic end generated code: output=be680c24f5c9e5d6 input=849483c006924e2f]*/
{
if (PyUnicode_CheckExact(s)) {
+PyInterpreterState *interp = _PyInterpreterState_GET();
Py_INCREF(s);
-PyUnicode_InternInPlace(&s);
+_PyUnicode_InternMortal(interp, &s);
return s;
}
else {
@@ -1918,14 +1918,22 @@ sys_getallocatedblocks_impl(PyObject *module)
/*[clinic input]
sys.getunicodeinternedsize -> Py_ssize_t
+*
+_only_immortal: bool = False
Return the number of elements of the unicode interned dictionary
[clinic start generated code]*/
static Py_ssize_t
-sys_getunicodeinternedsize_impl(PyObject *module)
-/*[clinic end generated code: output=ad0e4c9738ed4129 input=726298eaa063347a]*/
+sys_getunicodeinternedsize_impl(PyObject *module, int _only_immortal)
+/*[clinic end generated code: output=29a6377a94a14f70 input=0330b3408dd5bcc6]*/
{
-return _PyUnicode_InternedSize();
+if (_only_immortal) {
+return _PyUnicode_InternedSize_Immortal();
+}
+else {
+return _PyUnicode_InternedSize();
+}
}
/*[clinic input]
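
From Python, the two `sys` changes above might be exercised as in the sketch below. It assumes CPython. `sys.getunicodeinternedsize` is an undocumented helper and its `_only_immortal` keyword is private, so the code probes for both instead of relying on them. The helper name `interned_counts` is made up for this example.

```python
import sys

# Build the strings at run time so the compiler does not intern them for us.
first = sys.intern("".join(["mortal", "_", "example"]))
second = sys.intern("".join(["mortal", "_", "example"]))
same_object = first is second  # sys.intern returns the canonical object


def interned_counts():
    """Return (total, immortal) interned-string counts, or None if unavailable.

    The immortal count is None on builds without the private keyword.
    """
    getter = getattr(sys, "getunicodeinternedsize", None)
    if getter is None:
        return None
    total = getter()
    try:
        immortal = getter(_only_immortal=True)
    except TypeError:
        # Older build: the private keyword-only argument does not exist.
        return total, None
    return total, immortal


print(same_object, interned_counts())
```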

@@ -362,9 +362,14 @@ def generate_static_strings_initializer(identifiers, strings):
# This use of _Py_ID() is ignored by iter_global_strings()
# since iter_files() ignores .h files.
printer.write(f'string = &_Py_ID({i});')
+printer.write(f'_PyUnicode_InternStatic(interp, &string);')
printer.write(f'assert(_PyUnicode_CheckConsistency(string, 1));')
-printer.write(f'_PyUnicode_InternInPlace(interp, &string);')
-# XXX What about "strings"?
+printer.write(f'assert(PyUnicode_GET_LENGTH(string) != 1);')
+for value, name in sorted(strings.items()):
+printer.write(f'string = &_Py_STR({name});')
+printer.write(f'_PyUnicode_InternStatic(interp, &string);')
+printer.write(f'assert(_PyUnicode_CheckConsistency(string, 1));')
+printer.write(f'assert(PyUnicode_GET_LENGTH(string) != 1);')
printer.write(END)
printer.write(after)
@@ -406,15 +411,31 @@ def generate_global_object_finalizers(generated_immortal_objects):
def get_identifiers_and_strings() -> 'tuple[set[str], dict[str, str]]':
identifiers = set(IDENTIFIERS)
strings = {}
+# Note that we store strings as they appear in C source, so the checks here
+# can be defeated, e.g.:
+# - "a" and "\x61" won't be reported as duplicate.
+# - "\n" appears as 2 characters.
+# Probably not worth adding a C string parser.
for name, string, *_ in iter_global_strings():
if string is None:
if name not in IGNORED:
identifiers.add(name)
else:
+if len(string) == 1 and ord(string) < 256:
+# Give a nice message for common mistakes.
+# To cover tricky cases (like "\n") we also generate C asserts.
+raise ValueError(
+'do not use &_Py_ID or &_Py_STR for one-character latin-1 '
++ f'strings, use _Py_LATIN1_CHR instead: {string!r}')
if string not in strings:
strings[string] = name
elif name != strings[string]:
raise ValueError(f'name mismatch for {string!r} ({name!r} != {strings[string]!r})')
+overlap = identifiers & set(strings.keys())
+if overlap:
+raise ValueError(
+'do not use both _Py_ID and _Py_DECLARE_STR for the same string: '
++ repr(overlap))
return identifiers, strings
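
The two checks added to `get_identifiers_and_strings` can be tried in isolation. The sketch below is a simplified stand-in, not the real tool: `check_global_strings` and its arguments are hypothetical, and it takes the already collected names instead of scanning C sources. It also shows the caveat from the comment in the hunk: a literal such as `"\n"` is seen as two source characters, so the Python-side check does not catch it.

```python
def check_global_strings(identifiers, declared_strings):
    """Hypothetical, simplified version of the checks in the hunk above.

    identifiers:      names used with _Py_ID, as a set of str
    declared_strings: {literal as written in C source: _Py_STR name}
    """
    for literal in declared_strings:
        # One-character latin-1 strings must use _Py_LATIN1_CHR instead.
        if len(literal) == 1 and ord(literal) < 256:
            raise ValueError(
                'do not use &_Py_ID or &_Py_STR for one-character latin-1 '
                f'strings, use _Py_LATIN1_CHR instead: {literal!r}')
    # The _Py_ID and _Py_STR sets must be disjoint.
    overlap = set(identifiers) & set(declared_strings)
    if overlap:
        raise ValueError(
            'do not use both _Py_ID and _Py_DECLARE_STR for the same string: '
            + repr(overlap))


# Disjoint sets pass silently.
check_global_strings({"stdout"}, {"{{": "dbl_open_br"})
# The C escape for a newline is two source characters, so it slips through here.
check_global_strings({"stdout"}, {"\\n": "newline"})
```

This is why the generated initializer in the same file also emits the `PyUnicode_GET_LENGTH(string) != 1` asserts on the C side.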

@@ -86,6 +86,17 @@ Appender = Callable[[str], None]
Outputter = Callable[[], str]
TemplateDict = dict[str, str]
+def c_id(name: str) -> str:
+if len(name) == 1 and ord(name) < 256:
+if name.isalnum():
+return f"_Py_LATIN1_CHR('{name}')"
+else:
+return f'_Py_LATIN1_CHR({ord(name)})'
+else:
+return f'&_Py_ID({name})'
class _TextAccumulator(NamedTuple):
text: list[str]
append: Appender
@@ -1504,7 +1515,7 @@ class CLanguage(Language):
template_dict['keywords_c'] = ' '.join('"' + k + '",'
for k in data.keywords)
keywords = [k for k in data.keywords if k]
-template_dict['keywords_py'] = ' '.join('&_Py_ID(' + k + '),'
+template_dict['keywords_py'] = ' '.join(c_id(k) + ','
for k in keywords)
template_dict['format_units'] = ''.join(data.format_units)
template_dict['parse_arguments'] = ', '.join(data.parse_arguments)
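
To see what Argument Clinic would emit for different keyword names, the `c_id` helper added above can be run on its own. The block below copies the function from the hunk, with indentation restored, and applies it to a few made-up keyword names. It is a quick experiment, not part of the tool.

```python
def c_id(name: str) -> str:
    # One-character latin-1 names map to the shared singleton macro.
    if len(name) == 1 and ord(name) < 256:
        if name.isalnum():
            return f"_Py_LATIN1_CHR('{name}')"
        else:
            # Non-alphanumeric characters are emitted as their code point.
            return f'_Py_LATIN1_CHR({ord(name)})'
    else:
        return f'&_Py_ID({name})'


# Same join as the keywords_py template line; the names are illustrative.
keywords = ["x", "_", "_only_immortal"]
keywords_py = ' '.join(c_id(k) + ',' for k in keywords)
print(keywords_py)
```

A single-character keyword such as `_` is emitted by code point here, while the hand-written C in this commit spells the same singleton as `_Py_LATIN1_CHR('_')`.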