Commit graph

544 commits

Author SHA1 Message Date
Tim Peters
2fc68e180f
gh-135551: Change how sorting picks minimum run length (#135553)
New scheme from Stefan Pochmann for picking minimum run lengths.

By allowing them to change a little from one run to the next, it's possible to
arrange for that all merges, at all levels, strongly tend to be as evenly balanced
as possible, for randomly ordered data. Meaning the number of initial runs is a
power of 2, and all merges involve runs whose lengths differ by no more than 1.
2025-06-26 23:48:05 -05:00
Neil Schemenauer
113de8545f
GH-133136: Revise QSBR to reduce excess memory held (gh-135473)
The free threading build uses QSBR to delay the freeing of dictionary
keys and list arrays when the objects are accessed by multiple threads
in order to allow concurrent reads to proceed with holding the object
lock. The requests are processed in batches to reduce execution
overhead, but for large memory blocks this can lead to excess memory
usage.

Take into account the size of the memory block when deciding when to
process QSBR requests.

Also track the amount of memory being held by QSBR for mimalloc pages.  Advance the write sequence if this memory exceeds a limit.  Advancing the sequence will allow it to be freed more quickly.

Process the held QSBR items from the "eval breaker", rather than from `_PyMem_FreeDelayed()`.  This gives a higher chance that the global read sequence has advanced enough so that items can be freed.

Co-authored-by: Sam Gross <colesbury@gmail.com>
2025-06-25 00:06:32 -07:00
Irit Katriel
5529213d4e
gh-100239: specialize BINARY_OP/SUBSCR for list-slice (#132626) 2025-05-01 10:28:52 +00:00
Mark Shannon
44e4c479fb
GH-124715: Move trashcan mechanism into Py_Dealloc (GH-132280) 2025-04-30 11:37:53 +01:00
Bénédikt Tran
a99bfaa53c
gh-133073: avoid NULL + 0 arithmetic in list_extend_* functions (#133074) 2025-04-28 15:59:09 +02:00
Victor Stinner
de9deb7ca7
gh-132713: Simplify list_repr_impl() (#132811) 2025-04-23 08:59:30 +02:00
Victor Stinner
a4ea80d523
gh-132713: Fix repr(list) race condition (#132801)
Hold a strong reference to the item while calling repr(item).
2025-04-22 22:09:35 +02:00
Bénédikt Tran
edbf7fb129
gh-111178: remove redundant casts for functions with correct signatures (#131673) 2025-04-01 17:18:11 +02:00
Mark Shannon
a45f25361d
GH-131238: More refactoring of core header files (GH-131351)
Adds new pycore_stats.h header file to help break dependencies involving the pycore_code.h header.
2025-03-17 14:41:05 +00:00
Mark Shannon
a1aeec61c4
GH-131238: Core header refactor (GH-131250)
* Moves most structs in pycore_ header files into pycore_structs.h and pycore_runtime_structs.h

* Removes many cross-header dependencies
2025-03-17 09:19:04 +00:00
T. Wouters
de2f7da77d
gh-115999: Add free-threaded specialization for FOR_ITER (#128798)
Add free-threaded versions of existing specialization for FOR_ITER (list, tuples, fast range iterators and generators), without significantly affecting their thread-safety. (Iterating over shared lists/tuples/ranges should be fine like before. Reusing iterators between threads is not fine, like before. Sharing generators between threads is a recipe for significant crashes, like before.)
2025-03-12 16:21:46 +01:00
Victor Stinner
9d759b63d8
gh-111178: Change Argument Clinic signature for METH_O (#130682)
Use "PyObject*" for METH_O functions to fix an undefined behavior.
2025-03-11 16:33:36 +01:00
mpage
d7bb7c7817
gh-118331: Fix a couple of issues when list allocation fails (#130811)
* Fix use after free in list objects

Set the items pointer in the list object to NULL after the items array
is freed during list deallocation. Otherwise, we can end up with a list
object added to the free list that contains a pointer to an already-freed
items array.

* Mark `_PyList_FromStackRefStealOnSuccess` as escaping

I think technically it's not escaping, because the only object that
can be decrefed if allocation fails is an exact list, which cannot
execute arbitrary code when it is destroyed. However, this seems less
intrusive than trying to special cases objects in the assert in `_Py_Dealloc`
that checks for non-null stackpointers and shouldn't matter for performance.
2025-03-05 10:42:09 -08:00
T. Wouters
388e1ca9f0
gh-115999: Make list and tuple iteration more thread-safe. (#128637)
Make tuple iteration more thread-safe, and actually test concurrent iteration of tuple, range and list. (This is prep work for enabling specialization of FOR_ITER in free-threaded builds.) The basic premise is:

Iterating over a shared iterable (list, tuple or range) should be safe, not involve data races, and behave like iteration normally does.

Using a shared iterator should not crash or involve data races, and should only produce items regular iteration would produce. It is not guaranteed to produce all items, or produce each item only once. (This is not the case for range iteration even after this PR.)

Providing stronger guarantees is possible for some of these iterators, but it's not always straight-forward and can significantly hamper the common case. Since iterators in general aren't shared between threads, and it's simply impossible to concurrently use many iterators (like generators), better to make sharing iterators without explicit synchronization clearly wrong.

Specific issues fixed in order to make the tests pass:

 - List iteration could occasionally fail an assertion when a shared list was shrunk and an item past the new end was retrieved concurrently. There's still some unsafety when deleting/inserting multiple items through for example slice assignment, which uses memmove/memcpy.

 - Tuple iteration could occasionally crash when the iterator's reference to the tuple was cleared on exhaustion. Like with list iteration, in free-threaded builds we can't safely and efficiently clear the iterator's reference to the iterable (doing it safely would mean extra, slow refcount operations), so just keep the iterable reference around.
2025-02-18 16:52:46 -08:00
sobolevn
63f0406d5a
gh-129643: Fix PyList_Insert in free-threading builds (#129680) 2025-02-06 15:54:40 +03:00
Kumar Aditya
fb5d1c9236
gh-129643: fix thread safety of PyList_SetItem (#129644) 2025-02-05 13:08:02 +05:30
Pieter Eendebak
1a80214f11
gh-126703: Add freelists for list and tuple iterators (GH-128592) 2025-01-29 09:15:24 +00:00
Mark Shannon
470a0a68eb
GH-128682: Change a couple of functions to only steal references on success. (GH-129132)
Change PyTuple_FromStackRefSteal and PyList_FromStackRefSteal to only steal on success to avoid escaping
2025-01-22 10:51:37 +00:00
da-woods
42f7a00ae8
Clean up redundant ifdef in list getitem (#128257)
It's already inside a `Py_GIL_DISABLED` block so the `#else` clause is always unused.
2024-12-26 14:40:48 +00:00
Mark Shannon
5a23994a3d
GH-127058: Make PySequence_Tuple safer and probably faster. (#127758)
* Use a small buffer, then list when constructing a tuple from an arbitrary sequence.
2024-12-11 14:02:59 +00:00
Sam Gross
e51da64ac3
gh-127536: Add missing locks in listobject.c (GH-127580)
We were missing locks around some list operations in the free threading
build.
2024-12-04 14:12:15 -05:00
Sam Gross
c7dec02de2
gh-127521: Mark list as "shared" before resizing if necessary (#127524)
In the free threading build, if a non-owning thread resizes a list,
it must use QSBR to free the old list array because there may be a
concurrent access (without a lock) from the owning thread.

To match the pattern in dictobject.c, we just mark the list as "shared"
before resizing if it's from a non-owning thread and not already marked
as shared.
2024-12-02 14:38:26 -05:00
Donghee Na
e2713409cf
gh-115999: Add partial free-thread specialization for BINARY_SUBSCR (gh-127227) 2024-12-02 10:38:17 +09:00
Mark Shannon
fa40922597
GH-126547: Pre-assign version numbers for a few common classes (GH-126551) 2024-11-08 16:44:44 +00:00
Victor Stinner
1b2a5485f9
gh-125196: PyUnicodeWriter_Discard(NULL) does nothing (#125222) 2024-10-09 23:32:02 +00:00
Victor Stinner
52f70da19c
gh-125196: Use PyUnicodeWriter for repr(list) (#125202)
Replace the private _PyUnicodeWriter with the public PyUnicodeWriter.

Replace PyObject_Repr() + _PyUnicodeWriter_WriteStr()
with PyUnicodeWriter_WriteRepr().
2024-10-09 23:56:30 +02:00
Donghee Na
ad7c778546
gh-123990: Good bye WITH_FREELISTS macro (gh-124358) 2024-09-24 01:28:59 +00:00
Sam Gross
ab094d1b2b
gh-117139: Replace _PyList_FromArraySteal with stack ref variant (#122830)
This replaces `_PyList_FromArraySteal` with `_PyList_FromStackRefSteal`.
It's functionally equivalent, but takes a `_PyStackRef` array instead of
an array of `PyObject` pointers.

Co-authored-by: Ken Jin <kenjin@python.org>
2024-08-12 14:49:49 -04:00
Sam Gross
5716cc3529
gh-100240: Use a consistent implementation for freelists (#121934)
This combines and updates our freelist handling to use a consistent
implementation. Objects in the freelist are linked together using the
first word of memory block.

If configured with freelists disabled, these operations are essentially
no-ops.
2024-07-22 12:08:27 -04:00
Serhiy Storchaka
8ecb8962e3
gh-121288: Make error message for index() methods consistent (GH-121395)
Make error message for index() methods consistent

Remove the repr of the searched value (which can be arbitrary large)
from ValueError messages for list.index(), range.index(), deque.index(),
deque.remove() and ShareableList.index().  Make the error messages
consistent with error messages for other index() and remove()
methods.
2024-07-05 10:50:45 -07:00
Xie Yanbo
0153fd0940
Fix typos in comments (#120821) 2024-06-24 19:47:00 +02:00
Sam Gross
8f17d69b7b
gh-119344: Make critical section API public (#119353)
This makes the following macros public as part of the non-limited C-API for
locking a single object or two objects at once.

* `Py_BEGIN_CRITICAL_SECTION(op)` / `Py_END_CRITICAL_SECTION()`
* `Py_BEGIN_CRITICAL_SECTION2(a, b)` / `Py_END_CRITICAL_SECTION2()`

The supporting functions and structs used by the macros are also exposed for
cases where C macros are not available.
2024-06-21 15:50:18 -04:00
Nikita Sobolev
8334a1b55c
gh-120384: Fix array-out-of-bounds crash in list_ass_subscript (#120442) 2024-06-21 13:48:38 +03:00
Nikita Sobolev
141babad9b
gh-120298: Fix use-after-free in list_richcompare_impl (#120303)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2024-06-11 10:04:27 +03:00
Donghee Na
ab4263a82a
gh-119053: Implement the fast path for list.__getitem__ (gh-119112) 2024-05-21 09:49:18 -04:00
Sam Gross
2402715e10
gh-118561: Fix crash involving list.extend in free-threaded build (#118723)
The `list_preallocate_exact` function did not zero initialize array
contents. In the free-threaded build, this could expose uninitialized
memory to concurrent readers between the call to
`list_preallocate_exact` and the filling of the array contents with
items.
2024-05-09 18:52:27 +00:00
Dino Viehland
1e67b9207c
gh-117657: Fix TSAN list set failure (#118260)
* Fix TSAN list set failure

* Relaxed atomic is sufficient, add targetted test

* More list

* Remove atomic assign in list

* Fixup white space
2024-05-02 13:03:05 -07:00
Donghee Na
94444ea45a
gh-112069: Add _PySet_NextEntryRef to be thread-safe. (gh-117990) 2024-04-19 00:18:22 +09:00
Sam Gross
027fa2eccf
gh-112087: Make list.extend(dict) behave atomically (#117438)
Add a special case for `list.extend(dict)` and `list(dict)` so that those
patterns behave atomically with respect to modifications to the list or
dictionary.

This is required by multiprocessing, which assumes that
`list(_finalizer_registry)` is atomic.
2024-04-02 10:45:00 -04:00
Tim Peters
8383915031
GH-116939: Rewrite binarysort() (#116940)
Rewrote binarysort() for clarity.

Also changed the signature to be more coherent (it was mixing sortslice with raw pointers).

No change in method or functionality. However, I left some experiments in, disabled for now
via `#if` tricks. Since this code was first written, some kinds of comparisons have gotten
enormously faster (like for lists of floats), which changes the tradeoffs.

For example, plain insertion sort's simpler innermost loop and highly predictable branches
leave it very competitive (even beating, by a bit) binary insertion when comparisons are
very cheap, despite that it can do many more compares. And it wins big on runs that
are already sorted (moving the next one in takes only 1 compare then).

So I left code for a plain insertion sort, to make future experimenting easier.

Also made the maximum value of minrun a `#define` (``MAX_MINRUN`) to make
experimenting with that easier too.

And another bit of `#if``-disabled code rewrites binary insertion's innermost loop to
remove its unpredictable branch. Surprisingly, this doesn't really seem to help
overall. I'm unclear on why not. It certainly adds more instructions, but they're very
simple, and it's hard to be believe they cost as much as a branch miss.
2024-03-21 22:27:25 -05:00
Donghee Na
a3cf0fada0
gh-116621: Specialize list.extend for dict items (gh-116888) 2024-03-19 12:18:07 +09:00
Victor Stinner
f6cdc6b4a1
Revert "gh-96844: Improve error message of list.remove (gh-106455)" (#116956)
This reverts commit 217f47d6e5.
2024-03-18 13:54:45 +00:00
Donghee Na
8da83f3386
gh-116621: Specialize list.extend for dict keys/values (gh-116816) 2024-03-15 23:48:34 +09:00
Tim Peters
bf121d6a69
GH-116554: Relax list.sort()'s notion of "descending" runs (#116578)
* GH-116554: Relax list.sort()'s notion of "descending" run

Rewrote `count_run()` so that sub-runs of equal elements no longer end a descending run. Both ascending and descending runs can have arbitrarily many sub-runs of arbitrarily many equal elements now. This is tricky, because we only use ``<`` comparisons, so checking for equality doesn't come "for free". Surprisingly, it turned out there's a very cheap (one comparison) way to determine whether an ascending run consisted of all-equal elements. That sealed the deal.

In addition, after a descending run is reversed in-place, we now go on to see whether it can be extended by an ascending run that just happens to be adjacent. This succeeds in finding at least one additional element to append about half the time, and so appears to more than repay its cost (the savings come from getting to skip a binary search, when a short run is artificially forced to length MIINRUN later, for each new element `count_run()` can add to the initial run).

While these have been in the back of my mind for years, a question on StackOverflow pushed it to action:

https://stackoverflow.com/questions/78108792/

They were wondering why it took about 4x longer to sort a list like:

[999_999, 999_999, ..., 2, 2, 1, 1, 0, 0]

than "similar" lists. Of course that runs very much faster after this patch.

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
2024-03-12 19:59:42 -05:00
Donghee Na
3325699ffa
gh-116621: Set manual critical section for list.extend (gh-116657) 2024-03-13 07:28:23 +09:00
Donghee Na
5b2f21faf3
gh-112087: Make list.sort to be thread-safe for PEP 703. (gh-116553) 2024-03-10 00:45:42 +00:00
Donghee Na
17d31bf384
gh-112087: Store memory allocation information into _PyListArray (gh-116529) 2024-03-09 23:50:28 +00:00
Ken Jin
41457c7fdb
gh-116381: Remove bad specializations, add fail stats (GH-116464)
* Remove bad specializations, add fail stats
2024-03-08 00:21:21 +08:00
Ken Jin
7114cf20c0
gh-116381: Specialize CONTAINS_OP (GH-116385)
* Specialize CONTAINS_OP

* 📜🤖 Added by blurb_it.

* Add PyAPI_FUNC for JIT

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2024-03-07 03:30:11 +08:00
Donghee Na
d2f1b0eb49
gh-112087: Update list_get_item_ref to optimistically avoid locking (gh-116353)
Co-authored-by: Sam Gross <colesbury@gmail.com>
2024-03-06 08:21:33 +09:00