bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. (gh-30928)

We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code.  It is still used in a number of non-builtin stdlib modules.

The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime.  A new _Py_GET_GLOBAL_IDENTIFIER() macro (spelled _Py_ID() in the diff below) facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
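
To make the contrast concrete, here is a minimal standalone sketch of the two patterns. It does not use the CPython headers: demo_str, demo_identifier, demo_global_strings, DEMO_STR_INIT and DEMO_ID are hypothetical stand-ins for illustration, not the names or layouts used in pycore_global_strings.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string object (CPython uses PyUnicodeObject). */
typedef struct {
    size_t length;
    const char *data;
} demo_str;

/* Old pattern: a per-use-site handle whose object is set up lazily. */
typedef struct {
    const char *literal;
    demo_str storage;
    int ready;              /* 0 until the first lookup */
} demo_identifier;

static const demo_str *
demo_identifier_get(demo_identifier *id)
{
    assert(id != NULL);
    if (!id->ready) {
        /* First use: build the object at runtime. */
        id->storage.length = strlen(id->literal);
        id->storage.data = id->literal;
        id->ready = 1;
    }
    return &id->storage;
}

/* New pattern: the objects themselves (not pointers) are struct fields. */
struct demo_global_strings {
    struct {
        demo_str read;
        demo_str readinto;
        demo_str write;
    } identifiers;
};

/* Static initializer: everything is known at compile time. */
#define DEMO_STR_INIT(LITERAL) { sizeof(LITERAL) - 1, LITERAL }

static struct demo_global_strings demo_runtime_strings = {
    .identifiers = {
        .read = DEMO_STR_INIT("read"),
        .readinto = DEMO_STR_INIT("readinto"),
        .write = DEMO_STR_INIT("write"),
    },
};

/* Lookup is plain field access; no runtime initialization is involved. */
#define DEMO_ID(NAME) (&demo_runtime_strings.identifiers.NAME)
```

In this sketch the lazy variant needs a readiness check on every lookup, while DEMO_ID(readinto) is just the address of a field that already exists when the program starts.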

https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.

The core of the change is in:

* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers

I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings.  That check is added to the PR CI config.
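
The idea behind such a check can be sketched as follows. This is an assumption about the general approach, not a description of how generate_global_objects.py is implemented (that script is Python; C is used here only to match the rest of the change). The names demo_is_referenced, demo_count_unused and the DEMO_ID(...) reference pattern are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `source` contains a DEMO_ID(<name>) reference, else 0. */
static int
demo_is_referenced(const char *name, const char *source)
{
    char pattern[128];
    assert(name != NULL && source != NULL);
    int n = snprintf(pattern, sizeof(pattern), "DEMO_ID(%s)", name);
    if (n < 0 || (size_t)n >= sizeof(pattern)) {
        return 0;  /* name too long for this sketch: treat as unreferenced */
    }
    return strstr(source, pattern) != NULL;
}

/* Count declared names that no source text references.
   *first_unused receives the first such name, or NULL if there is none. */
static size_t
demo_count_unused(const char *const *declared, size_t n_declared,
                  const char *const *sources, size_t n_sources,
                  const char **first_unused)
{
    size_t unused = 0;
    *first_unused = NULL;
    for (size_t i = 0; i < n_declared; i++) {
        int found = 0;
        for (size_t j = 0; j < n_sources && !found; j++) {
            found = demo_is_referenced(declared[i], sources[j]);
        }
        if (!found) {
            if (*first_unused == NULL) {
                *first_unused = declared[i];
            }
            unused++;
        }
    }
    return unused;
}
```

A CI step built on this idea would fail whenever the count is nonzero, so a declared string cannot silently outlive its last use.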

The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()).  This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
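
The shape of that API change can be sketched in standalone C. Everything here (demo_str, demo_identifier, demo_object, demo_call_method, demo_call_method_id) is a hypothetical miniature for illustration; it is not the CPython call machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical string object and old-style lazy identifier handle. */
typedef struct { const char *data; } demo_str;
typedef struct {
    const char *literal;
    demo_str storage;
    int ready;
} demo_identifier;

/* A tiny "object" with a table of named methods. */
typedef int (*demo_method)(int);
typedef struct { const char *name; demo_method fn; } demo_slot;
typedef struct { const demo_slot *slots; size_t n_slots; } demo_object;

/* New-style function: the method name is passed as an object pointer.
   Returns 0 and stores the result on success, -1 if no such method. */
static int
demo_call_method(const demo_object *obj, const demo_str *name,
                 int arg, int *result)
{
    assert(obj != NULL && name != NULL && result != NULL);
    for (size_t i = 0; i < obj->n_slots; i++) {
        if (strcmp(obj->slots[i].name, name->data) == 0) {
            *result = obj->slots[i].fn(arg);
            return 0;
        }
    }
    return -1;
}

/* Old-style function: resolves the identifier first, then delegates. */
static int
demo_call_method_id(const demo_object *obj, demo_identifier *id,
                    int arg, int *result)
{
    if (!id->ready) {
        id->storage.data = id->literal;
        id->ready = 1;
    }
    return demo_call_method(obj, &id->storage, arg, result);
}

/* Example data: one object with a single method named "double". */
static int demo_double(int x) { return 2 * x; }
static const demo_slot demo_slots[] = { { "double", demo_double } };
static const demo_object demo_obj = { demo_slots, 1 };
static const demo_str demo_name_double = { "double" };
```

Callers that already hold a statically initialized string call demo_call_method() directly; the identifier-based variant only adds the lazy resolution step in front of it.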

The following are not changed (yet):

* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable, as at least one package on PyPI is using this (private) API
* (maybe) intern the strings during runtime init

https://bugs.python.org/issue46541
Eric Snow 2022-02-08 13:39:07 -07:00 committed by GitHub
parent c018d3037b
commit 81c72044a1
108 changed files with 2282 additions and 1573 deletions


@@ -703,7 +703,6 @@ r_string(Py_ssize_t n, RFILE *p)
         read = fread(p->buf, 1, n, p->fp);
     }
     else {
-        _Py_IDENTIFIER(readinto);
         PyObject *res, *mview;
         Py_buffer buf;
@@ -713,7 +712,7 @@ r_string(Py_ssize_t n, RFILE *p)
         if (mview == NULL)
             return NULL;
-        res = _PyObject_CallMethodId(p->readable, &PyId_readinto, "N", mview);
+        res = _PyObject_CallMethod(p->readable, &_Py_ID(readinto), "N", mview);
         if (res != NULL) {
             read = PyNumber_AsSsize_t(res, PyExc_ValueError);
             Py_DECREF(res);
@@ -1713,12 +1712,11 @@ marshal_dump_impl(PyObject *module, PyObject *value, PyObject *file,
     /* XXX Quick hack -- need to do this differently */
     PyObject *s;
     PyObject *res;
-    _Py_IDENTIFIER(write);
     s = PyMarshal_WriteObjectToString(value, version);
     if (s == NULL)
         return NULL;
-    res = _PyObject_CallMethodIdOneArg(file, &PyId_write, s);
+    res = _PyObject_CallMethodOneArg(file, &_Py_ID(write), s);
     Py_DECREF(s);
     return res;
 }
@@ -1745,7 +1743,6 @@ marshal_load(PyObject *module, PyObject *file)
 /*[clinic end generated code: output=f8e5c33233566344 input=c85c2b594cd8124a]*/
 {
     PyObject *data, *result;
-    _Py_IDENTIFIER(read);
     RFILE rf;
     /*
@@ -1755,7 +1752,7 @@ marshal_load(PyObject *module, PyObject *file)
      * This can be removed if we guarantee good error handling
      * for r_string()
      */
-    data = _PyObject_CallMethodId(file, &PyId_read, "i", 0);
+    data = _PyObject_CallMethod(file, &_Py_ID(read), "i", 0);
     if (data == NULL)
         return NULL;
     if (!PyBytes_Check(data)) {