Commit graph

148 commits

Author SHA1 Message Date
Victor Stinner
20c5f969dd
gh-131238: Remove more includes from pycore_interp.h (#131480) 2025-03-19 23:01:32 +01:00
Sam Gross
45bc120d45
gh-130519: Fix crash in QSBR when destructor reenters QSBR (gh-130553)
The `free_work_item()` function in QSBR may call arbitrary code via
Python object destructors, which may reenter the QSBR code. Reorder
the processing of work items to be robust to reentrancy.

Also fix the TODO for the out of memory situation.
2025-02-26 14:55:15 -05:00
Mark Shannon
014223649c
GH-130396: Use computed stack limits on linux (GH-130398)
* Implement C recursion protection with limit pointers for Linux, MacOS and Windows

* Remove calls to PyOS_CheckStack

* Add stack protection to parser

* Make tests more robust to low stacks

* Improve error messages for stack overflow
2025-02-25 09:24:48 +00:00
Petr Viktorin
ef29104f7d
GH-91079: Revert "GH-91079: Implement C stack limits using addresses, not counters. (GH-130007)" for now (GH130413)
Revert "GH-91079: Implement C stack limits using addresses, not counters. (GH-130007)" for now

Unfortunatlely, the change broke some buildbots.

This reverts commit 2498c22fa0.
2025-02-24 11:16:08 +01:00
Mark Shannon
2498c22fa0
GH-91079: Implement C stack limits using addresses, not counters. (GH-130007)
* Implement C recursion protection with limit pointers

* Remove calls to PyOS_CheckStack

* Add stack protection to parser

* Make tests more robust to low stacks

* Improve error messages for stack overflow
2025-02-19 11:44:57 +00:00
Brandt Bucher
002c4e2982
GH-129386: Use symbolic constants for specialization tests (GH-129415) 2025-01-29 10:49:58 -08:00
Brandt Bucher
828b27680f
GH-126599: Remove the PyOptimizer API (GH-129194) 2025-01-28 16:10:51 -08:00
Victor Stinner
1d485db953
gh-128863: Deprecate _PyLong_Sign() function (#129176)
Replace _PyLong_Sign() with PyLong_GetSign().
2025-01-23 03:11:53 +01:00
Mark Shannon
f0f7b978be
GH-128939: Refactor JIT optimize structs (GH-128940) 2025-01-20 15:49:15 +00:00
Victor Stinner
8ceb6cb117
gh-129033: Remove _PyInterpreterState_SetConfig() function (#129048)
Remove _PyInterpreterState_GetConfigCopy() and
_PyInterpreterState_SetConfig() private functions. PEP 741 "Python
Configuration C API" added a better public C API: PyConfig_Get() and
PyConfig_Set().
2025-01-20 16:31:33 +01:00
Xuanteng Huang
b44ff6d0df
GH-126599: Remove the "counter" optimizer/executor (GH-126853) 2025-01-16 15:57:04 -08:00
Neil Schemenauer
1b15c89a17
gh-115999: Specialize STORE_ATTR in free-threaded builds. (gh-127838)
* Add `_PyDictKeys_StringLookupSplit` which does locking on dict keys and
  use in place of `_PyDictKeys_StringLookup`.

* Change `_PyObject_TryGetInstanceAttribute` to use that function
  in the case of split keys.

* Add `unicodekeys_lookup_split` helper which allows code sharing
  between `_Py_dict_lookup` and `_PyDictKeys_StringLookupSplit`.

* Fix locking for `STORE_ATTR_INSTANCE_VALUE`.  Create
  `_GUARD_TYPE_VERSION_AND_LOCK` uop so that object stays locked and
  `tp_version_tag` cannot change.

* Pass `tp_version_tag` to `specialize_dict_access()`, ensuring
  the version we store on the cache is the correct one (in case of
  it changing during the specalize analysis).

* Split `analyze_descriptor` into `analyze_descriptor_load` and
  `analyze_descriptor_store` since those don't share much logic.
  Add `descriptor_is_class` helper function.

* In `specialize_dict_access`, double check `_PyObject_GetManagedDict()`
  in case we race and dict was materialized before the lock.

* Avoid borrowed references in `_Py_Specialize_StoreAttr()`.

* Use `specialize()` and `unspecialize()` helpers.

* Add unit tests to ensure specializing happens as expected in FT builds.

* Add unit tests to attempt to trigger data races (useful for running under TSAN).

* Add `has_split_table` function to `_testinternalcapi`.
2024-12-19 10:21:17 -08:00
Mark Shannon
bc262de06b
GH-125174: Mark objects as statically allocated. (#127797)
* Set a bit in the unused part of the refcount on 64 bit machines and the free-threaded build.

* Use the top of the refcount range on 32 bit machines
2024-12-11 17:37:38 +00:00
Peter Bierma
d5d84c3f13
gh-127791: Fix, document, and test PyUnstable_AtExit (#127793) 2024-12-11 12:14:04 +01:00
Mark Shannon
a8dd821d5b
GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-127110)
* Mark almost all reachable objects before doing collection phase

* Add stats for objects marked

* Visit new frames before each increment

* Update docs

* Clearer calculation of work to do.
2024-12-02 10:12:17 +00:00
Hugo van Kemenade
899fdb213d
Revert "GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-126502)" (#126983) 2024-11-19 11:25:09 +02:00
Brandt Bucher
4cd10762b0
GH-126795: Increase the JIT threshold from 16 to 4096 (GH-126816) 2024-11-18 11:11:23 -08:00
Mark Shannon
b0fcc2c47a
GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-126502)
* Mark almost all reachable objects before doing collection phase

* Add stats for objects marked

* Visit new frames before each increment

* Remove lazy dict tracking

* Update docs

* Clearer calculation of work to do.
2024-11-18 14:31:26 +00:00
Peter Bierma
d00878b06a
gh-123619: Add an unstable C API function for enabling deferred reference counting (GH-123635)
Co-authored-by: Sam Gross <colesbury@gmail.com>
2024-11-13 13:27:16 +00:00
Eric Snow
73cf069099
gh-76785: Improved Subinterpreters Compatibility with 3.12 (2/2) (gh-126707)
These changes makes it easier to backport the _interpreters, _interpqueues, and _interpchannels modules to Python 3.12.

This involves the following:

* add the _PyXI_GET_STATE() and _PyXI_GET_GLOBAL_STATE() macros
* add _PyXIData_lookup_context_t and _PyXIData_GetLookupContext()
* add _Py_xi_state_init() and _Py_xi_state_fini()
2024-11-12 10:41:51 -07:00
Eric Snow
9357fdcaf0
gh-76785: Minor Cleanup of "Cross-interpreter" Code (gh-126457)
The primary objective here is to allow some later changes to be cleaner. Mostly this involves renaming things and moving a few things around.

* CrossInterpreterData -> XIData
* crossinterpdatafunc -> xidatafunc
* split out pycore_crossinterp_data_registry.h
* add _PyXIData_lookup_t
2024-11-07 09:32:42 -07:00
mpage
2e95c5ba3b
gh-115999: Implement thread-local bytecode and enable specialization for BINARY_OP (#123926)
Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Threads reserve a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and release the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.

Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization.

Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
2024-11-04 11:13:32 -08:00
Sam Gross
332356b880
gh-125900: Clean-up logic around immortalization in free-threading (#125901)
* Remove `@suppress_immortalization` decorator
* Make suppression flag per-thread instead of per-interpreter
* Suppress immortalization in `eval()` to avoid refleaks in three tests
  (test_datetime.test_roundtrip, test_logging.test_config8_ok, and
   test_random.test_after_fork).
* frozenset() is constant, but not a singleton. When run multiple times,
  the test could fail due to constant interning.
2024-10-24 18:09:59 -04:00
Sam Gross
f4997bb3ac
gh-123923: Defer refcounting for f_funcobj in _PyInterpreterFrame (#124026)
Use a `_PyStackRef` and defer the reference to `f_funcobj` when
possible. This avoids some reference count contention in the common case
of executing the same code object from multiple threads concurrently in
the free-threaded build.
2024-09-24 20:08:18 +00:00
Serhiy Storchaka
32c7dbb2bc
gh-121485: Always use 64-bit integers for integers bits count (GH-121486)
Use 64-bit integers instead of platform specific size_t or Py_ssize_t
to represent the number of bits in Python integer.
2024-08-30 08:13:24 +03:00
Eric Snow
503af8fe9a
gh-117482: Make the Slot Wrapper Inheritance Tests Much More Thorough (gh-122867)
There were a still a number of gaps in the tests, including not looking
at all the builtin types and not checking wrappers in subinterpreters
that weren't in the main interpreter. This fixes all that.

I considered incorporating the names of the PyTypeObject fields
(a la gh-122866), but figured doing so doesn't add much value.
2024-08-12 19:19:33 +00:00
Victor Stinner
9e4a81f00f
gh-120642: Move private PyCode APIs to the internal C API (#120643)
* Move _Py_CODEUNIT and related functions to pycore_code.h.
* Move _Py_BackoffCounter to pycore_backoff.h.
* Move Include/cpython/optimizer.h content to pycore_optimizer.h.
* Remove Include/cpython/optimizer.h.
* Remove PyUnstable_Replace_Executor().

Rename functions:

* PyUnstable_GetExecutor() => _Py_GetExecutor()
* PyUnstable_GetOptimizer() => _Py_GetOptimizer()
* PyUnstable_SetOptimizer() => _Py_SetTier2Optimizer()
* PyUnstable_Optimizer_NewCounter() => _PyOptimizer_NewCounter()
* PyUnstable_Optimizer_NewUOpOptimizer() => _PyOptimizer_NewUOpOptimizer()
2024-06-26 13:54:03 +02:00
Eric Snow
a905721b9c
gh-120838: Add _PyThreadState_WHENCE_FINI (gh-121010)
We also add _PyThreadState_NewBound() and drop _PyThreadState_SetWhence().

This change only affects internal API.
2024-06-25 14:35:12 -06:00
Mark Shannon
00257c746c
GH-119462: Enforce invariants of type versioning (GH-120731)
* Remove uses of Py_TPFLAGS_VALID_VERSION_TAG
2024-06-19 17:38:45 +01:00
Saul Shanabrook
55402d3232
gh-119258: Eliminate Type Guards in Tier 2 Optimizer with Watcher (GH-119365)
Co-authored-by: parmeggiani <parmeggiani@spaziodati.eu>
Co-authored-by: dpdani <git@danieleparmeggiani.me>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
Co-authored-by: Ken Jin <kenjin@python.org>
2024-06-08 17:41:45 +08:00
Sam Gross
47fb4327b5
gh-117657: Fix race involving immortalizing objects (#119927)
The free-threaded build currently immortalizes objects that use deferred
reference counting (see gh-117783). This typically happens once the
first non-main thread is created, but the behavior can be suppressed for
tests, in subinterpreters, or during a compile() call.

This fixes a race condition involving the tracking of whether the
behavior is suppressed.
2024-06-03 20:58:41 +00:00
Irit Katriel
13a5fdc72f
gh-119744: move a few functions from compile.c to flowgraph.c (#119745) 2024-05-30 21:55:06 +01:00
Eric Snow
81865002ae
gh-119213: Be More Careful About _PyArg_Parser.kwtuple Across Interpreters (gh-119331)
_PyArg_Parser holds static global data generated for modules by Argument Clinic.  The _PyArg_Parser.kwtuple field is a tuple object, even though it's stored within a static global.  In some cases the tuple is statically allocated and thus it's okay that it gets shared by multiple interpreters.  However, in other cases the tuple is set lazily, allocated from the heap using the active interprepreter at the point the tuple is needed.

This is a problem once that interpreter is destroyed since _PyArg_Parser.kwtuple becomes at dangling pointer, leading to crashes.  It isn't a problem if the tuple is allocated under the main interpreter, since its lifetime is bound to the lifetime of the runtime.  The solution here is to temporarily switch to the main interpreter.  The alternative would be to always statically allocate the tuple.

This change also fixes a bug where only the most recent parser was added to the global linked list.
2024-05-22 09:57:52 -06:00
Sam Gross
723d4d2fe8
gh-118527: Intern code consts in free-threaded build (#118667)
We already intern and immortalize most string constants. In the
free-threaded build, other constants can be a source of reference count
contention because they are shared by all threads running the same code
objects.
2024-05-06 20:12:39 -04:00
Brett Simmers
c2627d6eea
gh-116322: Add Py_mod_gil module slot (#116882)
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.

PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.

A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
2024-05-03 11:30:55 -04:00
Guido van Rossum
7d83f7bcc4
gh-118335: Configure Tier 2 interpreter at build time (#118339)
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.

We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.

On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.

In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
2024-04-30 18:26:34 -07:00
Sam Gross
7ccacb220d
gh-117783: Immortalize objects that use deferred reference counting (#118112)
Deferred reference counting is not fully implemented yet. As a temporary
measure, we immortalize objects that would use deferred reference
counting to avoid multi-threaded scaling bottlenecks.

This is only performed in the free-threaded build once the first
non-main thread is started. Additionally, some tests, including refleak
tests, suppress this behavior.
2024-04-29 14:36:02 -04:00
Eric Snow
09c2947581
gh-110693: Pending Calls Machinery Cleanups (gh-118296)
This does some cleanup in preparation for later changes.
2024-04-26 01:05:51 +00:00
Irit Katriel
c179c0e6cb
gh-117680: make _PyInstructionSequence a PyObject and use it in tests (#117629) 2024-04-17 16:42:04 +01:00
Eric Snow
fd259fdabe
gh-76785: Handle Legacy Interpreters Properly (gh-117490)
This is similar to the situation with threading._DummyThread.  The methods (incl. __del__()) of interpreters.Interpreter objects must be careful with interpreters not created by interpreters.create().  The simplest thing to start with is to disable any method that modifies or runs in the interpreter.  As part of this, the runtime keeps track of where an interpreter was created.  We also handle interpreter "refcounts" properly.
2024-04-11 23:23:25 +00:00
Eric Snow
993c3cca16
gh-76785: Add More Tests to test_interpreters.test_api (gh-117662)
In addition to the increase test coverage, this is a precursor to sorting out how we handle interpreters created directly via the C-API.
2024-04-10 18:37:01 -06:00
Guido van Rossum
060a96f1a9
gh-116968: Reimplement Tier 2 counters (#117144)
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.

The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
2024-04-04 15:03:27 +00:00
Peter Lazorchak
1c43468886
gh-116168: Remove extra _CHECK_STACK_SPACE uops (#117242)
This merges all `_CHECK_STACK_SPACE` uops in a trace into a single `_CHECK_STACK_SPACE_OPERAND` uop that checks whether there is enough stack space for all calls included in the entire trace.
2024-04-03 17:14:18 +00:00
Eric Snow
857d3151c9
gh-76785: Consolidate Some Interpreter-related Testing Helpers (gh-117485)
This eliminates the duplication of functionally identical helpers in the _testinternalcapi and _xxsubinterpreters modules.
2024-04-02 23:16:50 +00:00
Eric Snow
f341d6017d
gh-76785: Add PyInterpreterConfig Helpers (gh-117170)
These helpers make it easier to customize and inspect the config used to initialize interpreters.  This is especially valuable in our tests.  I found inspiration from the PyConfig API for the PyInterpreterConfig dict conversion stuff.  As part of this PR I've also added a bunch of tests.
2024-04-02 20:35:52 +00:00
Mark Shannon
c32dc47aca
GH-115776: Embed the values array into the object, for "normal" Python objects. (GH-116115) 2024-04-02 11:59:21 +01:00
Irit Katriel
d610d821fd
gh-112383: teach dis how to interpret ENTER_EXECUTOR (#117171) 2024-03-23 22:32:33 +00:00
Eric Snow
617158e078
gh-76785: Drop PyInterpreterID_Type (gh-117101)
I added it quite a while ago as a strategy for managing interpreter lifetimes relative to the PEP 554 (now 734) implementation.  Relatively recently I refactored that implementation to no longer rely on InterpreterID objects.  Thus now I'm removing it.
2024-03-21 17:15:02 +00:00
Eric Snow
bbee57fa8c
gh-76785: Clean Up Interpreter ID Conversions (gh-117048)
Mostly we unify the two different implementations of the conversion code (from PyObject * to int64_t.  We also drop the PyArg_ParseTuple()-style converter function, as well as rename and move PyInterpreterID_LookUp().
2024-03-21 09:56:12 -06:00
Antoine Pitrou
b8d808ddd7
GH-112536: Add more TSan tests (#116911)
These may all exercise some non-trivial aspects of thread synchronization.
2024-03-17 09:47:14 +01:00