This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.
PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.
A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
* pycore_intrinsics.h does nothing if included twice
(add #ifndef and #define).
* Update Tools/cases_generator/generate_cases.py to generate the
Py_BUILD_CORE test.
* _bz2, _lzma, _opcode and zlib extensions now define the
Py_BUILD_CORE_MODULE macro to use internal headers
(pycore_code.h, pycore_intrinsics.h and pycore_blocks_output_buffer.h).
Here we are doing no more than adding the value for Py_mod_multiple_interpreters and using it for stdlib modules. We will start checking for it in gh-104206 (once PyInterpreterState.ceval.own_gil is added in gh-104204).
lzma.LZMADecompressor and bz2.BZ2Decompressor objects caused
segfaults when their `__init__()` methods were not called.
lzma.LZMADecompressor, lzma.LZMACompressor, bz2.BZ2Compressor,
and bz2.BZ2Decompressor objects would leak locks and internal buffers
when their `__init__()` methods were called multiple times.
https://bugs.python.org/issue23224
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
* Make functools types immutable
* Multibyte codec types are now immutable
* pyexpat.xmlparser is now immutable
* array.arrayiterator is now immutable
* _thread types are now immutable
* _csv types are now immutable
* _queue.SimpleQueue is now immutable
* mmap.mmap is now immutable
* unicodedata.UCD is now immutable
* sqlite3 types are now immutable
* _lsprof.Profiler is now immutable
* _overlapped.Overlapped is now immutable
* _operator types are now immutable
* winapi__overlapped.Overlapped is now immutable
* _lzma types are now immutable
* _bz2 types are now immutable
* _dbm.dbm and _gdbm.gdbm are now immutable
* Fix initial buffer size can't > UINT32_MAX in zlib module
After commit f9bedb630e, in 64-bit build,
if the initial buffer size > UINT32_MAX, ValueError will be raised.
These two functions are affected:
1. zlib.decompress(data, /, wbits=MAX_WBITS, bufsize=DEF_BUF_SIZE)
2. zlib.Decompress.flush([length])
This commit re-allows the size > UINT32_MAX.
* adds curly braces per PEP 7.
* Renames `Buffer_*` to `OutputBuffer_*` for clarity
Faster bz2/lzma/zlib via new output buffering.
Also adds .readall() function to _compression.DecompressReader class
to take best advantage of this in the consume-all-output at once scenario.
Often a 5-20% speedup in common scenarios due to less data copying.
Contributed by Ma Lin.
* 1. add test case with wrong behavior
* 2. fix bug when max_length == -1
* 3. allow b"" as valid input data for decompress_buf()
* 4. when max_length >= 0, let needs_input mechanism works
* add more asserts to test case
* Fix potential division by zero in BZ2_Malloc()
* Avoid division by zero in PyLzma_Malloc()
* Avoid division by zero and integer overflow in PyZlib_Malloc()
Reported by Svace static analyzer.