Commit graph

69 commits

Author SHA1 Message Date
Ken Jin
7ef4907412
gh-128262: Allow specialization of calls to classes with __slots__ (GH-128263) 2024-12-31 12:24:17 +08:00
Neil Schemenauer
1b15c89a17
gh-115999: Specialize STORE_ATTR in free-threaded builds. (gh-127838)
* Add `_PyDictKeys_StringLookupSplit` which does locking on dict keys and
  use in place of `_PyDictKeys_StringLookup`.

* Change `_PyObject_TryGetInstanceAttribute` to use that function
  in the case of split keys.

* Add `unicodekeys_lookup_split` helper which allows code sharing
  between `_Py_dict_lookup` and `_PyDictKeys_StringLookupSplit`.

* Fix locking for `STORE_ATTR_INSTANCE_VALUE`.  Create
  `_GUARD_TYPE_VERSION_AND_LOCK` uop so that object stays locked and
  `tp_version_tag` cannot change.

* Pass `tp_version_tag` to `specialize_dict_access()`, ensuring
  the version we store on the cache is the correct one (in case of
  it changing during the specalize analysis).

* Split `analyze_descriptor` into `analyze_descriptor_load` and
  `analyze_descriptor_store` since those don't share much logic.
  Add `descriptor_is_class` helper function.

* In `specialize_dict_access`, double check `_PyObject_GetManagedDict()`
  in case we race and dict was materialized before the lock.

* Avoid borrowed references in `_Py_Specialize_StoreAttr()`.

* Use `specialize()` and `unspecialize()` helpers.

* Add unit tests to ensure specializing happens as expected in FT builds.

* Add unit tests to attempt to trigger data races (useful for running under TSAN).

* Add `has_split_table` function to `_testinternalcapi`.
2024-12-19 10:21:17 -08:00
Donghee Na
48c70b8f7d
gh-115999: Enable BINARY_SUBSCR_GETITEM for free-threaded build (gh-127737) 2024-12-19 11:08:17 +09:00
mpage
2de048ce79
gh-115999: Specialize loading attributes from modules in free-threaded builds (#127711)
We use the same approach that was used for specialization of LOAD_GLOBAL in free-threaded builds:

_CHECK_ATTR_MODULE is renamed to _CHECK_ATTR_MODULE_PUSH_KEYS; it pushes the keys object for the following _LOAD_ATTR_MODULE_FROM_KEYS (nee _LOAD_ATTR_MODULE). This arrangement avoids having to recheck the keys version.

_LOAD_ATTR_MODULE is renamed to _LOAD_ATTR_MODULE_FROM_KEYS; it loads the value from the keys object pushed by the preceding _CHECK_ATTR_MODULE_PUSH_KEYS at the cached index.
2024-12-13 10:17:16 -08:00
Donghee Na
7c2bd9b226
gh-115999: Use light-weight lock for UNPACK_SEQUENCE_LIST (gh-127514) 2024-12-03 00:14:40 +09:00
Donghee Na
e2713409cf
gh-115999: Add partial free-thread specialization for BINARY_SUBSCR (gh-127227) 2024-12-02 10:38:17 +09:00
Kirill Podoprigora
27486c3365
gh-115999: Add free-threaded specialization for UNPACK_SEQUENCE (#126600)
Add free-threaded specialization for `UNPACK_SEQUENCE` opcode.
`UNPACK_SEQUENCE_TUPLE/UNPACK_SEQUENCE_TWO_TUPLE` are already thread safe since tuples are immutable.
`UNPACK_SEQUENCE_LIST` is not thread safe because of nature of lists (there is nothing preventing another thread from adding items to or removing them the list while the instruction is executing). To achieve thread safety we add a critical section to the implementation of `UNPACK_SEQUENCE_LIST`, especially around the parts where we check the size of the list and push items onto the stack.


---------

Co-authored-by: Matt Page <mpage@meta.com>
Co-authored-by: mpage <mpage@cs.stanford.edu>
2024-11-22 19:00:35 +02:00
Ken Jin
6293d00e72
gh-120619: Strength reduce function guards, support 2-operand uop forms (GH-124846)
Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
2024-11-09 11:35:33 +08:00
Mark Shannon
85036c8d61
GH-126222: Fix _PyUop_num_popped (GH-126507) 2024-11-07 10:48:27 +00:00
mpage
2e95c5ba3b
gh-115999: Implement thread-local bytecode and enable specialization for BINARY_OP (#123926)
Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Threads reserve a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and release the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.

Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization.

Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
2024-11-04 11:13:32 -08:00
Mark Shannon
faa3272fb8
GH-125837: Split LOAD_CONST into three. (GH-125972)
* Add LOAD_CONST_IMMORTAL opcode

* Add LOAD_SMALL_INT opcode

* Remove RETURN_CONST opcode
2024-10-29 11:15:42 +00:00
Mark Shannon
06ca33020e
GH-125323: Convert DECREF_INPUTS_AND_REUSE_FLOAT into a function that takes PyStackRefs. (GH-125439) 2024-10-14 14:18:57 +01:00
mpage
f978fb4f8d
gh-115999: Refactor LOAD_GLOBAL specializations to avoid reloading {globals, builtins} keys (gh-124953)
Each of the `LOAD_GLOBAL` specializations is implemented roughly as:

1. Load keys version.
2. Load cached keys version.
3. Deopt if (1) and (2) don't match.
4. Load keys.
5. Load cached index into keys.
6. Load object from (4) at offset from (5).

This is not thread-safe in free-threaded builds; the keys object may be replaced
in between steps (3) and (4).

This change refactors the specializations to avoid reloading the keys object and
instead pass the keys object from guards to be consumed by downstream uops.
2024-10-09 15:18:25 +00:00
Mark Shannon
da071fa3e8
GH-119866: Spill the stack around escaping calls. (GH-124392)
* Spill the evaluation around escaping calls in the generated interpreter and JIT. 

* The code generator tracks live, cached values so they can be saved to memory when needed.

* Spills the stack pointer around escaping calls, so that the exact stack is visible to the cycle GC.
2024-10-07 14:56:39 +01:00
Savannah Ostrowski
65f1237098
GH-123516: Improve JIT memory consumption by invalidating cold executors (GH-124443)
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2024-09-27 00:35:42 +00:00
Victor Stinner
f1a0d96f41
gh-123091: Use _Py_IsImmortalLoose() (#123511)
Use _Py_IsImmortalLoose() in bytesobject.c, typeobject.c
and ceval.c.
2024-09-02 14:25:19 +02:00
Mark Shannon
1eba8bae92
GH-123185: Check for NULL after calling _PyEvalFramePushAndInit (GH-123194) 2024-08-21 12:44:56 +01:00
Mark Shannon
bb1d30336e
GH-118093: Make CALL_ALLOC_AND_ENTER_INIT suitable for tier 2. (GH-123140)
* Convert CALL_ALLOC_AND_ENTER_INIT to micro-ops such that tier 2 supports it

* Allow inexact arguments for CALL_ALLOC_AND_ENTER_INIT.
2024-08-20 16:52:58 +01:00
Mark Shannon
c13e7d98fb
GH-118093: Specialize CALL_KW (GH-123006) 2024-08-16 17:11:24 +01:00
Brandt Bucher
f84754b705
GH-118093: Turn some DEOPT_IFs into EXIT_IFs (GH-122998) 2024-08-14 07:54:42 -07:00
Mark Shannon
eec7bdaf01
GH-120024: Remove CHECK_EVAL_BREAKER macro. (GH-122968)
* Factor some instructions into micro-ops to isolate CHECK_EVAL_BREAKER for escape analysis

* Eliminate CHECK_EVAL_BREAKER macro
2024-08-14 12:04:05 +01:00
Mark Shannon
df13a1821a
GH-118095: Add tier two support for BINARY_SUBSCR_GETITEM (GH-120793) 2024-08-01 16:19:05 -07:00
Brandt Bucher
64857d849f
GH-122294: Burn in the addresses of side exits (GH-122295) 2024-07-26 09:40:15 -07:00
Mark Shannon
95a73917cd
GH-122029: Break INSTRUMENTED_CALL into micro-ops, so that its behavior is consistent with CALL (GH-122177) 2024-07-26 14:35:57 +01:00
Brandt Bucher
d9efa45d74
GH-118093: Add tier two support for BINARY_OP_INPLACE_ADD_UNICODE (GH-122253) 2024-07-25 14:45:07 -07:00
Brandt Bucher
5f6001130f
GH-118093: Add tier two support for LOAD_ATTR_PROPERTY (GH-122283) 2024-07-25 10:45:28 -07:00
Mark Shannon
2e14a52cce
GH-122160: Remove BUILD_CONST_KEY_MAP opcode. (GH-122164) 2024-07-25 16:24:29 +01:00
Brandt Bucher
794546fd53
GH-118093: Remove invalidated executors from side exits (GH-121885) 2024-07-24 09:16:30 -07:00
Brandt Bucher
7b36b67b1e
GH-118093: Add tier two support to several instructions (GH-121884) 2024-07-18 14:24:58 -07:00
Brandt Bucher
33903c53db
GH-116017: Get rid of _COLD_EXITs (GH-120960) 2024-07-01 13:17:40 -07:00
Mark Shannon
9cefcc0ee7
GH-120507: Lower the BEFORE_WITH and BEFORE_ASYNC_WITH instructions. (#120640)
* Remove BEFORE_WITH and BEFORE_ASYNC_WITH instructions.

* Add LOAD_SPECIAL instruction

* Reimplement `with` and `async with` statements using LOAD_SPECIAL
2024-06-18 12:17:46 +01:00
Mark Shannon
274f844830
GH-120619: Clean up RETURN_VALUE instruction (GH-120624)
* Rename _POP_FRAME to _RETURN_VALUE as it returns a value as well as popping a frame.

* Remove remaining _POP_FRAMEs
2024-06-17 14:40:11 +01:00
Jelle Zijlstra
80a4e38994
gh-119821: Support non-dict globals in LOAD_FROM_DICT_OR_GLOBALS (#119822)
Support non-dict globals in LOAD_FROM_DICT_OR_GLOBALS

The implementation basically copies LOAD_GLOBAL. Possibly it could be deduplicated,
but that seems like it may get hairy since the two operations have different operands.

This is important to fix in 3.14 for PEP 649, but it's a bug in earlier versions too,
and we should backport to 3.13 and 3.12 if possible.
2024-05-31 14:05:24 -07:00
Brandt Bucher
5cd3ffd6b7
GH-119258: Handle STORE_ATTR_WITH_HINT in tier two (GH-119481) 2024-05-28 12:47:54 -07:00
Jelle Zijlstra
98e855fcc1
gh-119180: Add LOAD_COMMON_CONSTANT opcode (#119321)
The PEP 649 implementation will require a way to load NotImplementedError
from the bytecode. @markshannon suggested implementing this by converting
LOAD_ASSERTION_ERROR into a more general mechanism for loading constants.

This PR adds this new opcode. I will work on the rest of the implementation
of the PEP separately.

Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
2024-05-22 00:46:39 +00:00
Mark Shannon
1ab6356ebe
GH-118095: Use broader specializations of CALL in tier 1, for better tier 2 support of calls. (GH-118322)
* Add CALL_PY_GENERAL, CALL_BOUND_METHOD_GENERAL and call CALL_NON_PY_GENERAL specializations.

* Remove CALL_PY_WITH_DEFAULTS specialization

* Use CALL_NON_PY_GENERAL in more cases when otherwise failing to specialize
2024-05-04 12:11:11 +01:00
Mark Shannon
da2cfc4cb6
GH-113464: Remove the extra jump via _SIDE_EXIT in _EXIT_TRACE (GH-118545) 2024-05-04 08:50:24 +01:00
Mark Shannon
67bba9dd0f
GH-117442: Check eval-breaker at start (rather than end) of tier 2 loops (GH-118482) 2024-05-02 13:10:31 +01:00
Guido van Rossum
7d83f7bcc4
gh-118335: Configure Tier 2 interpreter at build time (#118339)
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.

We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.

On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.

In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
2024-04-30 18:26:34 -07:00
Mark Shannon
5b05d452cd
GH-118095: Add tier 2 support for YIELD_VALUE (GH-118380) 2024-04-30 11:33:13 +01:00
Mark Shannon
ab6eda0ee5
GH-118095: Allow a variant of RESUME_CHECK in tier 2 (GH-118286) 2024-04-29 07:54:05 +01:00
Mark Shannon
3e06c7f719
GH-118095: Add dynamic exit support and FOR_ITER_GEN support to tier 2 (GH-118279) 2024-04-26 18:08:50 +01:00
Mark Shannon
f180b31e76
GH-118095: Handle RETURN_GENERATOR in tier 2 (GH-118180) 2024-04-25 11:32:47 +01:00
Mark Shannon
77cd0428b6
GH-118095: Convert DEOPT_IFs on likely side exits to EXIT_IFs (GH-118106)
Covert DEOPT_IFs on likely side exits to EXIT_IFs
2024-04-24 14:37:55 +01:00
Mark Shannon
a6647d16ab
GH-115480: Reduce guard strength for binary ops when type of one operand is known already (GH-118050) 2024-04-22 13:34:06 +01:00
Mark Shannon
7e6fa5fced
GH-116202: Incorporate invalidation check into _START_EXECUTOR. (GH-118044) 2024-04-19 09:26:42 +01:00
Mark Shannon
d3bd6b5f3f
GH-115419: Improve list of escaping functions (GH-118054) 2024-04-19 09:25:07 +01:00
Ken Jin
375425abd1
Cases generator: Remove type_prop and passthrough (#117614) 2024-04-08 06:26:52 +08:00
Peter Lazorchak
1c43468886
gh-116168: Remove extra _CHECK_STACK_SPACE uops (#117242)
This merges all `_CHECK_STACK_SPACE` uops in a trace into a single `_CHECK_STACK_SPACE_OPERAND` uop that checks whether there is enough stack space for all calls included in the entire trace.
2024-04-03 17:14:18 +00:00
Mark Shannon
c32dc47aca
GH-115776: Embed the values array into the object, for "normal" Python objects. (GH-116115) 2024-04-02 11:59:21 +01:00