Commit graph

1291 commits

Author SHA1 Message Date
Mark Shannon
d27acd4461
GH-111485: Increment next_instr consistently at the start of the instruction. (GH-111486) 2023-10-31 10:09:54 +00:00
Michael Droettboom
84b4533e84
gh-109329: Count tier2 opcode misses (#110561)
This keeps a separate 'miss' counter for each micro-opcode, incremented whenever a guard uop takes a deoptimization side exit.
2023-10-30 17:02:45 -07:00
Eric Snow
c6fe0869ab
gh-76785: Move the Cross-Interpreter Code to Its Own File (gh-111502)
This is partly to clear this stuff out of pystate.c, but also in preparation for moving some code out of _xxsubinterpretersmodule.c.  This change also moves this stuff to the internal API (new: Include/internal/pycore_crossinterp.h).  @vstinner did this previously and I undid it.  Now I'm re-doing it. :/
2023-10-30 16:53:10 -06:00
Victor Stinner
801741ff81
gh-90815: Fix mimalloc atomic.h on Windows arm64 (#111527)
mi_atomic_load_explicit() casts 'p' argument to drop the 'const'
qualifier on Windows arm64 platform. Fix the compiler warning:

    'function': different 'const' qualifiers
    (compiling source file ..\Objects\mimalloc\options.c)
2023-10-30 22:33:49 +00:00
Sam Gross
6dfb8fe023
gh-110481: Implement biased reference counting (gh-110764) 2023-10-30 16:06:09 +00:00
Dino Viehland
05f2f0ac92
gh-90815: Add mimalloc memory allocator (#109914)
* Add mimalloc v2.12

Modified src/alloc.c to remove include of alloc-override.c and not
compile new handler.

Did not include the following files:

 - include/mimalloc-new-delete.h
 - include/mimalloc-override.h
 - src/alloc-override-osx.c
 - src/alloc-override.c
 - src/static.c
 - src/region.c

mimalloc is thread safe and shares a single heap across all runtimes,
therefore finalization and getting global allocated blocks across all
runtimes is different.

* mimalloc: minimal changes for use in Python:

 - remove debug spam for freeing large allocations
 - use same bytes (0xDD) for freed allocations in CPython and mimalloc
   This is important for the test_capi debug memory tests

* Don't export mimalloc symbol in libpython.
* Enable mimalloc as Python allocator option.
* Add mimalloc MIT license.
* Log mimalloc in Lib/test/pythoninfo.py.
* Document new mimalloc support.
* Use macro defs for exports as done in:
  https://github.com/python/cpython/pull/31164/

Co-authored-by: Sam Gross <colesbury@gmail.com>
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Victor Stinner <vstinner@python.org>
2023-10-30 15:43:11 +00:00
Savannah Ostrowski
4a929d432b
GH-111339: Fix initialization and finalization of static optimizer types (GH-111430) 2023-10-29 13:53:25 -07:00
gsallam
21f068d80c
gh-109587: Allow "precompiled" perf-trampolines to largely mitigate the cost of enabling perf-trampolines (#109666) 2023-10-27 03:57:29 +00:00
Irit Katriel
a0c414c35d
gh-111354: define names for RESUME oparg values (#111365) 2023-10-26 16:30:18 +01:00
Irit Katriel
67a91f78e4
gh-109094: replace frame->prev_instr by frame->instr_ptr (#109095) 2023-10-26 13:43:10 +00:00
Pablo Galindo Salgado
90a1b2859f
gh-67224: Show source lines in tracebacks when using the -c option when running Python (#111200) 2023-10-26 15:17:28 +09:00
scoder
a8a89fcd1f
gh-106320: Re-add some PyLong/PyDict C-API functions (GH-#111162)
* gh-106320: Re-add _PyLong_FromByteArray(), _PyLong_AsByteArray() and _PyLong_GCD() to the public header files since they are used by third-party packages and there is no efficient replacement.

See https://github.com/python/cpython/issues/111140
See https://github.com/python/cpython/issues/111139

* gh-111262: Re-add _PyDict_Pop() to have a C-API until a new public one is designed.
2023-10-25 11:33:48 +02:00
Brandt Bucher
e5168ff3f8
GH-109214: _SET_IP before _PUSH_FRAME (but not _POP_FRAME) (GH-111001) 2023-10-24 13:27:42 -07:00
Radislav Chugunov
47d3e2ed93
gh-109894: Fix initialization of static MemoryError in subinterpreter (gh-110911)
Fixes #109894

* set `interp.static_objects.last_resort_memory_error.args` to empty tuple to avoid crash on `PyErr_Display()` call
* allow `_PyExc_InitGlobalObjects()` to be called on subinterpreter init

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2023-10-23 17:06:59 -06:00
Mark Shannon
52e902ccf0
GH-109369: Add machinery for deoptimizing tier2 executors, both individually and globally. (GH-110384) 2023-10-23 14:49:09 +01:00
Eric Snow
c58c63fdf6
gh-84570: Add Timeouts to SendChannel.send() and RecvChannel.recv() (gh-110567) 2023-10-17 23:05:49 +00:00
Eric Snow
a53d7cb672
gh-84570: Send-Wait Fixes for _xxinterpchannels (gh-111006)
There were a few things I did in gh-110565 that need to be fixed. I also forgot to add tests in that PR.

(Note that this PR exposes a refleak introduced by gh-110246. I'll take care of that separately.)
2023-10-17 16:32:00 -06:00
Donghee Na
2dcc57008b
gh-109693: Remove pycore_atomic.h (gh-110992) 2023-10-18 00:33:50 +09:00
Victor Stinner
6db6b30ac2
gh-85283: Build winsound extension with limited C API (#110978)
Replace type->tp_name with PyType_GetQualName().
2023-10-17 15:57:10 +02:00
Victor Stinner
be5e8a0103
gh-110964: Remove private _PyArg functions (#110966)
Move the following private functions and structures to
pycore_modsupport.h internal C API:

* _PyArg_BadArgument()
* _PyArg_CheckPositional()
* _PyArg_NoKeywords()
* _PyArg_NoPositional()
* _PyArg_ParseStack()
* _PyArg_ParseStackAndKeywords()
* _PyArg_Parser structure
* _PyArg_UnpackKeywords()
* _PyArg_UnpackKeywordsWithVararg()
* _PyArg_UnpackStack()
* _Py_ANY_VARARGS()

Changes:

* Python/getargs.h now includes pycore_modsupport.h to export
  functions.
* clinic.py now adds pycore_modsupport.h when one of these functions
  is used.
* Add pycore_modsupport.h includes when a C extension uses one of
  these functions.
* Define Py_BUILD_CORE_MODULE in C extensions which now include
  directly or indirectly (via code generated by Argument Clinic)
  pycore_modsupport.h:

  * _csv
  * _curses_panel
  * _dbm
  * _gdbm
  * _multiprocessing.posixshmem
  * _sqlite.row
  * _statistics
  * grp
  * resource
  * syslog

* _testcapi: bad_get() no longer uses METH_FASTCALL calling
  convention but METH_VARARGS. Replace _PyArg_UnpackStack() with
  PyArg_ParseTuple().
* _testcapi: add PYTESTCAPI_NEED_INTERNAL_API macro which is defined
  by _testcapi sub-modules which need the internal C API
  (pycore_modsupport.h): exceptions.c, float.c, vectorcall.c,
  watchers.c.
* Remove Include/cpython/modsupport.h header file.
  Include/modsupport.h no longer includes the removed header file.
* Fix mypy clinic.py
2023-10-17 14:30:31 +02:00
Donghee Na
86559ddfec
gh-109693: Update _gil_runtime_state.locked to use pyatomic.h (gh-110836) 2023-10-17 07:32:50 +09:00
Donghee Na
b2ab210aae
gh-109693: Update pyruntimestate._finalizing to use pyatomic.h (gh-110837) 2023-10-13 16:40:15 +00:00
Pablo Galindo Salgado
e1d8c65e1d
gh-110805: Allow the repl to show source code and complete tracebacks (#110775) 2023-10-13 09:25:37 +00:00
Donghee Na
2566434e59
gh-109693: Update _gil_runtime_state.last_holder to use pyatomic.h (#110605) 2023-10-13 10:07:27 +09:00
Pablo Galindo Salgado
e7331365b4
gh-110721: Use the traceback module for PyErr_Display() and fallback to the C implementation (#110702) 2023-10-12 14:52:14 +00:00
Irit Katriel
7dd3c2b800
gh-109094: remove redundant arg to _PyFrame_PushTrampolineUnchecked (GH-110759) 2023-10-12 11:02:42 +01:00
Mark Shannon
19b7ead5eb
GH-109214: Convert _SAVE_CURRENT_IP to _SET_IP in tier 2 trace creation. (GH-110755) 2023-10-12 10:34:32 +01:00
Donghee Na
5bc1b7f08d
gh-109693: Update pycore_interp.h to use pyatomic.h (#110604) 2023-10-10 23:17:08 +09:00
Donghee Na
67e8d416cc
gh-109693: Use pyatomic.h for signal module (gh-110480) 2023-10-10 08:26:29 +09:00
Eric Snow
7bd560ce8d
gh-76785: Add SendChannel.send_buffer() (#110246)
(This is still a test module.)
2023-10-09 07:39:51 -06:00
Masaru Tsuchiyama
de2a4036cb
gh-108277: Add os.timerfd_create() function (#108382)
Add wrapper for timerfd_create, timerfd_settime, and timerfd_gettime to os module.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
2023-10-07 19:33:22 +02:00
Sam Gross
6e97a9647a
gh-109549: Add new states to PyThreadState to support PEP 703 (gh-109915)
This adds a new field 'state' to PyThreadState that can take on one of three values: _Py_THREAD_ATTACHED, _Py_THREAD_DETACHED, or _Py_THREAD_GC.  The "attached" and "detached" states correspond closely to acquiring and releasing the GIL.  The "gc" state is current unused, but will be used to implement stop-the-world GC for --disable-gil builds in the near future.
2023-10-05 09:46:33 -06:00
Sam Gross
cf6f23b0e3
gh-88402: Add new sysconfig variables on Windows (GH-110049)
Co-authored-by: Filipe Laíns <filipe.lains@gmail.com>
2023-10-04 22:50:29 +00:00
Eric Snow
80dc39e1dc
gh-110310: Add a Per-Interpreter XID Registry for Heap Types (gh-110311)
We do the following:

* add a per-interpreter XID registry (PyInterpreterState.xidregistry)
* put heap types there (keep static types in _PyRuntimeState.xidregistry)
* clear the registries during interpreter/runtime finalization
* avoid duplicate entries in the registry (when _PyCrossInterpreterData_RegisterClass() is called more than once for a type)
* use Py_TYPE() instead of PyObject_Type() in _PyCrossInterpreterData_Lookup()

The per-interpreter registry helps preserve isolation between interpreters.  This is important when heap types are registered, which is something we haven't been doing yet but I will likely do soon.
2023-10-04 16:35:27 -06:00
Michael Droettboom
e561e98058
GH-109329: Add tier 2 stats (GH-109913) 2023-10-04 14:52:28 -07:00
Mark Shannon
bf4bc36069
GH-109369: Merge all eval-breaker flags and monitoring version into one word. (GH-109846) 2023-10-04 16:09:48 +01:00
Guido van Rossum
7c149a76b2
gh-104909: Split more LOAD_ATTR specializations (GH-110317)
* Split LOAD_ATTR_MODULE

* Split LOAD_ATTR_WITH_HINT

* Split _GUARD_TYPE_VERSION out of the latter

* Split LOAD_ATTR_CLASS

* Split LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES

* Fix indent of DEOPT_IF in macros

* Split LOAD_ATTR_METHOD_LAZY_DICT

* Split LOAD_ATTR_NONDESCRIPTOR_NO_DICT

* Fix omission of _CHECK_ATTR_METHOD_LAZY_DICT
2023-10-04 16:08:02 +01:00
Guido van Rossum
625ecbe92e
gh-109979: Unify _GUARD_TYPE_VERSION{,_STORE} (#110301)
Now the target for `DEOPT_IF()` is auto-filled,
we don't need a separate `_GUARD_TYPE_VERSION_STORE` uop.
2023-10-03 22:37:21 +00:00
Victor Stinner
d73501602f
gh-108867: Add PyThreadState_GetUnchecked() function (#108870)
Add PyThreadState_GetUnchecked() function: similar to
PyThreadState_Get(), but don't issue a fatal error if it is NULL. The
caller is responsible to check if the result is NULL. Previously,
this function was private and known as _PyThreadState_UncheckedGet().
2023-10-03 16:53:51 +00:00
Eric Snow
f5198b09e1
gh-109860: Use a New Thread State When Switching Interpreters, When Necessary (gh-110245)
In a few places we switch to another interpreter without knowing if it has a thread state associated with the current thread.  For the main interpreter there wasn't much of a problem, but for subinterpreters we were *mostly* okay re-using the tstate created with the interpreter (located via PyInterpreterState_ThreadHead()).  There was a good chance that tstate wasn't actually in use by another thread.

However, there are no guarantees of that.  Furthermore, re-using an already used tstate is currently fragile.  To address this, now we create a new thread state in each of those places and use it.

One consequence of this change is that PyInterpreterState_ThreadHead() may not return NULL (though that won't happen for the main interpreter).
2023-10-03 09:20:48 -06:00
Eric Snow
1dd9dee45d
gh-105716: Support Background Threads in Subinterpreters Consistently (gh-109921)
The existence of background threads running on a subinterpreter was preventing interpreters from getting properly destroyed, as well as impacting the ability to run the interpreter again. It also affected how we wait for non-daemon threads to finish.

We add PyInterpreterState.threads.main, with some internal C-API functions.
2023-10-02 20:12:12 +00:00
Victor Stinner
7513994c92
gh-110014: Include explicitly <unistd.h> header (#110155)
* Remove unused <locale.h> includes.
* Remove unused <fcntl.h> include in traceback.h.
* Remove redundant <assert.h> and <stddef.h> includes. They  are already
  included by "Python.h".
* Remove <object.h> include in faulthandler.c. Python.h already includes it.
* Add missing <stdbool.h> in pycore_pythread.h if HAVE_PTHREAD_STUBS
  is defined.
* Fix also warnings in pthread_stubs.h: don't redefine macros if they
  are already defined, like the __NEED_pthread_t macro.
2023-09-30 20:06:45 +00:00
Victor Stinner
74e425ec18
gh-110014: Fix _POSIX_THREADS and _POSIX_SEMAPHORES usage (#110139)
* pycore_pythread.h is now the central place to make sure that
  _POSIX_THREADS and _POSIX_SEMAPHORES macros are defined if
  available.
* Make sure that pycore_pythread.h is included when _POSIX_THREADS
  and _POSIX_SEMAPHORES macros are tested.
* PY_TIMEOUT_MAX is now defined as a constant, since its value
  depends on _POSIX_THREADS, instead of being defined as a macro.
* Prevent integer overflow in the preprocessor when computing
  PY_TIMEOUT_MAX_VALUE on Windows:
  replace "0xFFFFFFFELL * 1000 < LLONG_MAX"
  with "0xFFFFFFFELL < LLONG_MAX / 1000".
* Document the change and give hints how to fix affected code.
* Add an exception for PY_TIMEOUT_MAX  name to smelly.py
* Add PY_TIMEOUT_MAX to the stable ABI
2023-09-30 19:25:54 +02:00
Guido van Rossum
5bb6f0fcba
gh-104909: Split some more insts into ops (#109943)
These are the most popular specializations of `LOAD_ATTR` and `STORE_ATTR`
that weren't already viable uops:

* Split LOAD_ATTR_METHOD_WITH_VALUES
* Split LOAD_ATTR_METHOD_NO_DICT
* Split LOAD_ATTR_SLOT
* Split STORE_ATTR_SLOT
* Split STORE_ATTR_INSTANCE_VALUE

Also:

* Add `-v` flag to code generator which prints a list of non-viable uops
  (easter-egg: it can print execution counts -- see source)
* Double _Py_UOP_MAX_TRACE_LENGTH to 128



I had dropped one of the DEOPT_IF() calls! :-(
2023-09-27 15:27:44 -07:00
Eric Snow
32466c97c0
gh-109793: Allow Switching Interpreters During Finalization (gh-109794)
Essentially, we should check the thread ID rather than the thread state pointer.
2023-09-27 13:41:06 -06:00
Serhiy Storchaka
b8d1744e7b
gh-109611: Add convenient C API function _PyFile_Flush() (GH-109612) 2023-09-23 09:35:30 +03:00
Sam Gross
2aceb21ae6
gh-109693: Remove pycore_atomic_funcs.h (#109694)
_PyUnicode_FromId() now uses pyatomic.h functions instead.
2023-09-21 22:57:20 +02:00
Eric Snow
fd7e08a6f3
gh-76785: Use Pending Calls When Releasing Cross-Interpreter Data (gh-109556)
This fixes some crashes in the _xxinterpchannels module, due to a race between interpreters.
2023-09-19 15:01:34 -06:00
Sam Gross
9df6712c12
gh-108724: Fix _PySemaphore compile error on WASM (gh-109583)
Some WASM platforms have POSIX semaphores, but not sem_timedwait.
2023-09-19 17:35:11 +00:00
Sam Gross
0c89056fe5
gh-108724: Add PyMutex and _PyParkingLot APIs (gh-109344)
PyMutex is a one byte lock with fast, inlineable lock and unlock functions for the common uncontended case.  The design is based on WebKit's WTF::Lock.

PyMutex is built using the _PyParkingLot APIs, which provides a cross-platform futex-like API (based on WebKit's WTF::ParkingLot).  This internal API will be used for building other synchronization primitives used to implement PEP 703, such as one-time initialization and events.

This also includes tests and a mini benchmark in Tools/lockbench/lockbench.py to compare with the existing PyThread_type_lock.

Uncontended acquisition + release:
* Linux (x86-64): PyMutex: 11 ns, PyThread_type_lock: 44 ns
* macOS (arm64): PyMutex: 13 ns, PyThread_type_lock: 18 ns
* Windows (x86-64): PyMutex: 13 ns, PyThread_type_lock: 38 ns

PR Overview:

The primary purpose of this PR is to implement PyMutex, but there are a number of support pieces (described below).

* PyMutex:  A 1-byte lock that doesn't require memory allocation to initialize and is generally faster than the existing PyThread_type_lock.  The API is internal only for now.
* _PyParking_Lot:  A futex-like API based on the API of the same name in WebKit.  Used to implement PyMutex.
* _PyRawMutex:  A word sized lock used to implement _PyParking_Lot.
* PyEvent:  A one time event.  This was used a bunch in the "nogil" fork and is useful for testing the PyMutex implementation, so I've included it as part of the PR.
* pycore_llist.h:  Defines common operations on doubly-linked list.  Not strictly necessary (could do the list operations manually), but they come up frequently in the "nogil" fork. ( Similar to https://man.freebsd.org/cgi/man.cgi?queue)

---------

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2023-09-19 09:54:29 -06:00