GH-135379: Top of stack caching for the JIT. (GH-135465)
Some checks are pending
Tests / Change detection (push) Waiting to run
Tests / Docs (push) Blocked by required conditions
Tests / Check if Autoconf files are up to date (push) Blocked by required conditions
Tests / Check if generated files are up to date (push) Blocked by required conditions
Tests / (push) Blocked by required conditions
Tests / Windows MSI (push) Blocked by required conditions
Tests / Ubuntu SSL tests with OpenSSL (push) Blocked by required conditions
Tests / Ubuntu SSL tests with AWS-LC (push) Blocked by required conditions
Tests / Android (aarch64) (push) Blocked by required conditions
Tests / Android (x86_64) (push) Blocked by required conditions
Tests / iOS (push) Blocked by required conditions
Tests / WASI (push) Blocked by required conditions
Tests / Hypothesis tests on Ubuntu (push) Blocked by required conditions
Tests / Address sanitizer (push) Blocked by required conditions
Tests / Sanitizers (push) Blocked by required conditions
Tests / Cross build Linux (push) Blocked by required conditions
Tests / CIFuzz (push) Blocked by required conditions
Tests / All required checks pass (push) Blocked by required conditions
JIT / Interpreter (Debug) (push) Waiting to run
JIT / aarch64-pc-windows-msvc/msvc (Release) (push) Blocked by required conditions
JIT / aarch64-pc-windows-msvc/msvc (Debug) (push) Blocked by required conditions
JIT / i686-pc-windows-msvc/msvc (Release) (push) Blocked by required conditions
JIT / i686-pc-windows-msvc/msvc (Debug) (push) Blocked by required conditions
JIT / aarch64-apple-darwin/clang (Release) (push) Blocked by required conditions
JIT / aarch64-unknown-linux-gnu/gcc (Release) (push) Blocked by required conditions
JIT / aarch64-apple-darwin/clang (Debug) (push) Blocked by required conditions
JIT / aarch64-unknown-linux-gnu/gcc (Debug) (push) Blocked by required conditions
JIT / x86_64-pc-windows-msvc/msvc (Release) (push) Blocked by required conditions
JIT / x86_64-pc-windows-msvc/msvc (Debug) (push) Blocked by required conditions
JIT / x86_64-apple-darwin/clang (Release) (push) Blocked by required conditions
JIT / x86_64-unknown-linux-gnu/gcc (Release) (push) Blocked by required conditions
JIT / x86_64-apple-darwin/clang (Debug) (push) Blocked by required conditions
JIT / x86_64-unknown-linux-gnu/gcc (Debug) (push) Blocked by required conditions
JIT / Free-Threaded (Debug) (push) Blocked by required conditions
JIT / JIT without optimizations (Debug) (push) Blocked by required conditions
JIT / JIT with tail calling interpreter (push) Blocked by required conditions
Lint / lint (push) Waiting to run
mypy / Run mypy on Lib/_pyrepl (push) Waiting to run
mypy / Run mypy on Lib/test/libregrtest (push) Waiting to run
mypy / Run mypy on Lib/tomllib (push) Waiting to run
mypy / Run mypy on Tools/build (push) Waiting to run
mypy / Run mypy on Tools/cases_generator (push) Waiting to run
mypy / Run mypy on Tools/clinic (push) Waiting to run
mypy / Run mypy on Tools/jit (push) Waiting to run
mypy / Run mypy on Tools/peg_generator (push) Waiting to run
Tail calling interpreter / aarch64-apple-darwin/clang (push) Waiting to run
Tail calling interpreter / aarch64-unknown-linux-gnu/gcc (push) Waiting to run
Tail calling interpreter / x86_64-pc-windows-msvc/msvc (push) Waiting to run
Tail calling interpreter / x86_64-apple-darwin/clang (push) Waiting to run
Tail calling interpreter / free-threading (push) Waiting to run
Tail calling interpreter / x86_64-unknown-linux-gnu/gcc (push) Waiting to run

Uses three registers to cache values at the top of the evaluation stack
This significantly reduces memory traffic for smaller, more common uops.
This commit is contained in:
Mark Shannon 2025-12-11 10:32:52 +00:00 committed by GitHub
parent 80c9756e3f
commit 469f191a85
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
30 changed files with 16865 additions and 1357 deletions

View file

@ -6,7 +6,7 @@
#include "pycore_tstate.h"
#include "pycore_initconfig.h" // _PyStatus_OK()
#include "pycore_uop_metadata.h" // _PyOpcode_uop_name
#include "pycore_uop_ids.h" // MAX_UOP_ID
#include "pycore_uop_ids.h" // MAX_UOP_REGS_ID
#include "pycore_pystate.h" // _PyThreadState_GET()
#include "pycore_runtime.h" // NUM_GENERATIONS
@ -33,8 +33,8 @@ _PyStats_GetLocal(void)
#endif
#if PYSTATS_MAX_UOP_ID < MAX_UOP_ID
#error "Not enough space allocated for pystats. Increase PYSTATS_MAX_UOP_ID to at least MAX_UOP_ID"
#if PYSTATS_MAX_UOP_ID < MAX_UOP_REGS_ID
#error "Not enough space allocated for pystats. Increase PYSTATS_MAX_UOP_ID to at least MAX_UOP_REGS_ID"
#endif
#define ADD_STAT_TO_DICT(res, field) \
@ -285,7 +285,7 @@ print_optimization_stats(FILE *out, OptimizationStats *stats)
stats->optimizer_failure_reason_no_memory);
fprintf(out, "Optimizer remove globals builtins changed: %" PRIu64 "\n", stats->remove_globals_builtins_changed);
fprintf(out, "Optimizer remove globals incorrect keys: %" PRIu64 "\n", stats->remove_globals_incorrect_keys);
for (int i = 0; i <= MAX_UOP_ID; i++) {
for (int i = 0; i <= MAX_UOP_REGS_ID; i++) {
if (stats->opcode[i].execution_count) {
fprintf(out, "uops[%s].execution_count : %" PRIu64 "\n", _PyUOpName(i), stats->opcode[i].execution_count);
}
@ -304,15 +304,15 @@ print_optimization_stats(FILE *out, OptimizationStats *stats)
}
}
for (int i = 1; i <= MAX_UOP_ID; i++){
for (int j = 1; j <= MAX_UOP_ID; j++) {
for (int i = 1; i <= MAX_UOP_REGS_ID; i++){
for (int j = 1; j <= MAX_UOP_REGS_ID; j++) {
if (stats->opcode[i].pair_count[j]) {
fprintf(out, "uop[%s].pair_count[%s] : %" PRIu64 "\n",
_PyOpcode_uop_name[i], _PyOpcode_uop_name[j], stats->opcode[i].pair_count[j]);
}
}
}
for (int i = 0; i < MAX_UOP_ID; i++) {
for (int i = 0; i < MAX_UOP_REGS_ID; i++) {
if (stats->error_in_opcode[i]) {
fprintf(
out,