This adds several unspecialized opcodes to superblocks:
TO_BOOL, BINARY_SUBSCR, STORE_SUBSCR,
UNPACK_SEQUENCE, LOAD_GLOBAL, LOAD_ATTR,
COMPARE_OP, BINARY_OP.
While we may not want that eventually, for now this helps find bugs.
There is a rudimentary test checking for UNPACK_SEQUENCE.
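As a rough illustration of what such a check looks at, the sketch below uses the standard `dis` module to confirm that a tuple-unpacking assignment compiles to `UNPACK_SEQUENCE`. It inspects only the bytecode, not the uop trace, and the function name `unpack_pair` is made up for this example; it is not the test added by this change.

```python
import dis


def unpack_pair(pair):
    # Unpacking a two-element sequence compiles to UNPACK_SEQUENCE.
    a, b = pair
    return a + b


# dis reports the unspecialized opcode names by default.
opnames = [instr.opname for instr in dis.get_instructions(unpack_pair)]
has_unpack = "UNPACK_SEQUENCE" in opnames

# Instruction offsets are measured in bytes, which helps when lining
# this listing up with debug output that also reports byte offsets.
for instr in dis.get_instructions(unpack_pair):
    print(instr.offset, instr.opname)
```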
Once we're ready to undo this, that would be simple:
just replace the call to variable_used_unspecialized
with a call to variable_used (as shown in a comment).
Or add individual opcodes to FORBIDDEN_NAMES_IN_UOPS.
Instead of special-casing specific instructions,
we add a few more special values to the 'size' field of expansions,
so in the future we can automatically handle
additional super-instructions in the generator.
The uops test wasn't testing anything by default,
and was failing when run with -Xuops.
Made the two executor-related context managers global,
so TestUops can use them (notably `with temporary_optimizer(opt)`).
Made clear_executor() a little more thorough.
Fixed a crash upon finalizing a uop optimizer,
by adding a `tp_dealloc` handler.
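The `with temporary_optimizer(opt)` pattern mentioned above can be sketched as a save/restore context manager. This is a minimal sketch, not the code from the test suite: `get_optimizer()` and `set_optimizer()` here are hypothetical stand-ins backed by a plain dict, used only so the example runs on its own.

```python
import contextlib

# Hypothetical stand-in for the interpreter's "current optimizer" slot.
_current = {"optimizer": None}


def get_optimizer():
    # Assumption: placeholder getter for this sketch only.
    return _current["optimizer"]


def set_optimizer(opt):
    # Assumption: placeholder setter for this sketch only.
    _current["optimizer"] = opt


@contextlib.contextmanager
def temporary_optimizer(opt):
    # Install `opt` for the duration of the block, then restore the
    # previous optimizer even if the block raises.
    old = get_optimizer()
    set_optimizer(opt)
    try:
        yield
    finally:
        set_optimizer(old)
```

Restoring in a `finally` clause is what lets one test install an optimizer without leaking it into the next test.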
When `_PyOptimizer_BackEdge` returns `NULL`, we should restore `next_instr` (and `stack_pointer`). To accomplish this we should jump to `resume_with_error` instead of just `error`.
The problem this causes is subtle -- the only repro I have is in PR gh-106393, at commit d7df54b139bcc47f5ea094bfaa9824f79bc45adc. But the fix is real (as shown later in that PR).
While we're at it, also improve the debug output: the offsets at which traces are identified are now measured in bytes, and the start offset is always shown. This makes it easier to correlate executor calls with optimizer calls, and to correlate either with `dis` output.
<!-- gh-issue-number: gh-104584 -->
* Issue: gh-104584
<!-- /gh-issue-number -->
Remove private pylifecycle.h functions: move them to the internal C
API (pycore_atexit.h, pycore_pylifecycle.h and pycore_signal.h). No
longer export most of these functions.
Move _testcapi.test_atexit() to _testinternalcapi.
Remove private _PyUnicode_TransformDecimalAndSpaceToASCII() and other
private _PyUnicode C API functions: move them to the internal C API
(pycore_unicodeobject.h). No longer export most of these functions.
Replace _testcapi.unicode_transformdecimalandspacetoascii() with
_testinternalcapi._PyUnicode_TransformDecimalAndSpaceToASCII().
Remove more private _PyUnicode C API functions:
move them to the internal C API (pycore_unicodeobject.h).
No longer export most pycore_unicodeobject.h functions.
- Tweak uops debugging output
- Fix the bug from gh-106290
- Rename `SET_IP` to `SAVE_IP` (per https://github.com/faster-cpython/ideas/issues/558)
- Add a `SAVE_IP` uop at the start of the trace (ditto)
- Allow `unbound_local_error`; this gives us uops for `LOAD_FAST_CHECK`, `LOAD_CLOSURE`, and `DELETE_FAST`
- Longer traces
- Support `STORE_FAST_LOAD_FAST`, `STORE_FAST_STORE_FAST`
- Add deps on pycore_uops.h to Makefile(.pre.in)
Remove the following functions from the C API and move them to the internal C
API, in a new pycore_modsupport.h internal header file:
* PyModule_CreateInitialized()
* _PyArg_NoKwnames()
* _Py_VaBuildStack()
No longer export these functions.
Remove private _PyThreadState and _PyInterpreterState C API
functions: move them to the internal C API (pycore_pystate.h and
pycore_interp.h). Don't export most of these functions anymore, but
still export functions used by tests.
Remove _PyThreadState_Prealloc() and _PyThreadState_Init() from the C
API, but keep them in the stable API.
Remove the "cpython/pytime.h" header file: it only contained private
functions. Move functions to the internal pycore_time.h header file.
Move tests from _testcapi to _testinternalcapi. Also rename test
methods to have the same names as the tested C functions.
No longer export these functions:
* _PyTime_Add()
* _PyTime_As100Nanoseconds()
* _PyTime_FromMicrosecondsClamp()
* _PyTime_FromTimespec()
* _PyTime_FromTimeval()
* _PyTime_GetPerfCounterWithInfo()
* _PyTime_MulDiv()
Remove the following private functions of the C API:
* _PyCodecInfo_GetIncrementalDecoder()
* _PyCodecInfo_GetIncrementalEncoder()
* _PyCodec_DecodeText()
* _PyCodec_EncodeText()
* _PyCodec_Forget()
* _PyCodec_Lookup()
* _PyCodec_LookupTextEncoding()
Move these functions to a new pycore_codecs.h internal header file.
These functions are no longer exported.
* EOFError no longer overrides other errors such as MemoryError or OSError at
the start of the object.
* Raise a more relevant error when a NULL object occurs as a code object
component.
* Minimize the overhead of calling PyErr_Occurred().
This produces longer traces (superblocks?).
Also improved debug output (uop names are now printed instead of numeric opcodes). This would be simpler if the numeric opcode values were generated by generate_cases.py, but that's another project.
Refactored some code in generate_cases.py so the essential algorithm for cache effects is only run once. (Deciding which effects are used and what the total cache size is, regardless of what's used.)
Remove the following private functions from the public C API:
* _Py_CheckFunctionResult()
* _PyObject_CallMethod()
* _PyObject_CallMethodId()
* _PyObject_CallMethodIdNoArgs()
* _PyObject_CallMethodIdObjArgs()
* _PyObject_CallMethodIdOneArg()
* _PyObject_MakeTpCall()
* _PyObject_VectorcallMethodId()
* _PyStack_AsDict()
Move these functions to the internal C API (pycore_call.h).
No longer export the following functions:
* _PyObject_Call()
* _PyObject_CallMethod()
* _PyObject_CallMethodId()
* _PyObject_CallMethodIdObjArgs()
* _PyObject_Call_Prepend()
* _PyObject_FastCallDictTstate()
* _PyStack_AsDict()
The following functions are still exported for stdlib shared
extensions:
* _Py_CheckFunctionResult()
* _PyObject_MakeTpCall()
Mark the following internal functions as extern:
* _PyStack_UnpackDict()
* _PyStack_UnpackDict_Free()
* _PyStack_UnpackDict_FreeNoDecRef()
This effectively reverts bb578a0, restoring the original DEOPT_IF() macro in ceval_macros.h, and redefining it in the Tier 2 interpreter. We can get rid of the PREDICTED() macros there as well!
Added a new, experimental, tracing optimizer and interpreter (a.k.a. "tier 2"). This currently pessimizes, so don't use yet -- this is infrastructure so we can experiment with optimizing passes. To enable it, pass ``-Xuops`` or set ``PYTHONUOPS=1``. To get debug output, set ``PYTHONUOPSDEBUG=N`` where ``N`` is a debug level (0-4, where 0 is no debug output and 4 is excessively verbose).
All of this code is likely to change dramatically before the 3.13 feature freeze. But this is a first step.
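For scripted experiments, the options described above can be assembled into a child-process invocation. The helper below is a sketch; the name `uops_invocation` is made up for this example. Only the `-Xuops` flag and the `PYTHONUOPSDEBUG` variable come from the text above, and they have an effect only on a build that includes this change.

```python
import os
import sys


def uops_invocation(script_args, debug_level=None, base_env=None):
    """Return (argv, env) for running Python with -Xuops enabled."""
    env = dict(os.environ if base_env is None else base_env)
    if debug_level is not None:
        # The text above documents debug levels 0 through 4.
        if not 0 <= debug_level <= 4:
            raise ValueError("debug_level must be between 0 and 4")
        env["PYTHONUOPSDEBUG"] = str(debug_level)
    argv = [sys.executable, "-Xuops", *script_args]
    return argv, env
```

The result can be passed to `subprocess.run(argv, env=env)`.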
Remove old aliases which were kept for backward compatibility with
Python 3.8:
* _PyObject_CallMethodNoArgs()
* _PyObject_CallMethodOneArg()
* _PyObject_CallOneArg()
* _PyObject_FastCallDict()
* _PyObject_Vectorcall()
* _PyObject_VectorcallMethod()
* _PyVectorcall_Function()
Update code which used these aliases to use new names.
These functions are broken by design because they discard any exceptions raised
inside, including MemoryError and KeyboardInterrupt. They should not be
used in new code.
* PyDict_GetItem() and PyObject_HasAttr() suppress arbitrary errors and
should not be used.
* PyUnicode_CompareWithASCIIString() only works if the second argument
is an ASCII string.
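The hazard of discarding exceptions can be modeled in pure Python. This is an illustrative model only, not how the C functions are implemented: `lossy_get()`, `strict_get()` and `BadKey` are names invented for this sketch. The lossy variant makes a failed hash computation look like "key not found", which is the kind of confusion the notes above warn about.

```python
class BadKey:
    """A key whose hashing fails, standing in for any error during lookup."""

    def __hash__(self):
        raise MemoryError("simulated failure while hashing")


def lossy_get(mapping, key):
    # Model of a lookup that discards every exception: the caller cannot
    # tell "missing key" apart from "lookup failed".
    try:
        return mapping[key]
    except BaseException:
        return None


def strict_get(mapping, key):
    # Only a genuinely missing key is turned into None; any other error
    # propagates to the caller.
    try:
        return mapping[key]
    except KeyError:
        return None
```

On the C side, `PyDict_GetItemWithError()` is the dictionary lookup that reports errors instead of hiding them.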
* Refleak in get_suggestions_for_name_error.
* Use of borrowed pointer after possible freeing (self).
* Add some missing error checks.
It now raises an exception if sys.modules doesn't hold a strong
reference to the module.
Elaborate the comment explaining why a weak reference is used to
create a borrowed reference.
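The idea behind the check can be modeled in Python with the `weakref` module: take a reference, create a weak reference, release the strong one, and verify that the object is still alive because the registry keeps it alive. This is a loose model under stated assumptions, not the C implementation. The names `borrow_from_registry` and `FreshEachTime` are invented here, and the immediate-collection behavior relies on CPython reference counting.

```python
import gc
import types
import weakref


def borrow_from_registry(registry, name):
    module = registry[name]
    ref = weakref.ref(module)
    del module    # release our own strong reference
    gc.collect()  # not needed on CPython; helps on other implementations
    target = ref()
    if target is None:
        # Nothing else kept the module alive, so a borrowed reference
        # would dangle.
        raise RuntimeError(f"registry holds no strong reference to {name!r}")
    return target


class FreshEachTime(dict):
    """Hypothetical registry that never stores the module it returns."""

    def __getitem__(self, name):
        return types.ModuleType(name)
```

With an ordinary dict as the registry the lookup succeeds; with `FreshEachTime` it raises, mirroring the new exception described above.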