bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. (gh-30928)

We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code.  It is still used in a number of non-builtin stdlib modules.

The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime.  A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).

https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.

The core of the change is in:

* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers

I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings.  That check is added to the PR CI config.

The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()).  This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.

The following are not changed (yet):

* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init

https://bugs.python.org/issue46541
This commit is contained in:
Eric Snow 2022-02-08 13:39:07 -07:00 committed by GitHub
parent c018d3037b
commit 81c72044a1
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
108 changed files with 2282 additions and 1573 deletions

View file

@ -23,17 +23,9 @@ static const char match_args_key[] = "__match_args__";
They are only allowed for indices < n_visible_fields. */
const char * const PyStructSequence_UnnamedField = "unnamed field";
_Py_IDENTIFIER(n_sequence_fields);
_Py_IDENTIFIER(n_fields);
_Py_IDENTIFIER(n_unnamed_fields);
static Py_ssize_t
get_type_attr_as_size(PyTypeObject *tp, _Py_Identifier *id)
get_type_attr_as_size(PyTypeObject *tp, PyObject *name)
{
PyObject *name = _PyUnicode_FromId(id);
if (name == NULL) {
return -1;
}
PyObject *v = PyDict_GetItemWithError(tp->tp_dict, name);
if (v == NULL && !PyErr_Occurred()) {
PyErr_Format(PyExc_TypeError,
@ -44,11 +36,14 @@ get_type_attr_as_size(PyTypeObject *tp, _Py_Identifier *id)
}
#define VISIBLE_SIZE(op) Py_SIZE(op)
#define VISIBLE_SIZE_TP(tp) get_type_attr_as_size(tp, &PyId_n_sequence_fields)
#define REAL_SIZE_TP(tp) get_type_attr_as_size(tp, &PyId_n_fields)
#define VISIBLE_SIZE_TP(tp) \
get_type_attr_as_size(tp, &_Py_ID(n_sequence_fields))
#define REAL_SIZE_TP(tp) \
get_type_attr_as_size(tp, &_Py_ID(n_fields))
#define REAL_SIZE(op) REAL_SIZE_TP(Py_TYPE(op))
#define UNNAMED_FIELDS_TP(tp) get_type_attr_as_size(tp, &PyId_n_unnamed_fields)
#define UNNAMED_FIELDS_TP(tp) \
get_type_attr_as_size(tp, &_Py_ID(n_unnamed_fields))
#define UNNAMED_FIELDS(op) UNNAMED_FIELDS_TP(Py_TYPE(op))
@ -622,21 +617,3 @@ PyStructSequence_NewType(PyStructSequence_Desc *desc)
{
return _PyStructSequence_NewType(desc, 0);
}
/* runtime lifecycle */
PyStatus _PyStructSequence_InitState(PyInterpreterState *interp)
{
if (!_Py_IsMainInterpreter(interp)) {
return _PyStatus_OK();
}
if (_PyUnicode_FromId(&PyId_n_sequence_fields) == NULL
|| _PyUnicode_FromId(&PyId_n_fields) == NULL
|| _PyUnicode_FromId(&PyId_n_unnamed_fields) == NULL)
{
return _PyStatus_ERR("can't initialize structseq state");
}
return _PyStatus_OK();
}