Commit graph

571 commits

Author SHA1 Message Date
Miss Islington (bot)
320a1dcd97
[3.12] gh-127563: use dk_log2_index_bytes=3 in empty dicts (GH-127568) (GH-127813)
This fixes a UBSan failure (unaligned zero-size memcpy) in `dictobject.c`.
(cherry picked from commit 9af96f4406)

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2024-12-12 12:50:51 +01:00
Miss Islington (bot)
1fc1a185ed
[3.12] gh-116938: Fix dict.update docstring and remove erraneous full stop from dict documentation (GH-125421) (#126151)
gh-116938: Fix `dict.update` docstring and remove erraneous full stop from `dict` documentation (GH-125421)
(cherry picked from commit 5527c4051c)

Co-authored-by: Prometheus3375 <35541026+Prometheus3375@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2024-10-29 23:22:20 +00:00
Petr Viktorin
49f6beb56a
[3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065)
This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.

* Allow interned strings to be mortal, and fix related issues (GH-120520)

  * Add an InternalDocs file describing how interning should work and how to use it.

  * Add internal functions to *explicitly* request what kind of interning is done:
    - `_PyUnicode_InternMortal`
    - `_PyUnicode_InternImmortal`
    - `_PyUnicode_InternStatic`

  * Switch uses of `PyUnicode_InternInPlace` to those.

  * Disallow using `_Py_SetImmortal` on strings directly.
    You should use `_PyUnicode_InternImmortal` instead:
    - Strings should be interned before immortalization, otherwise you're possibly
      interning a immortalizing copy.
    - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
      `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
      backports, as they are now part of public API and version-specific ABI.

  * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

   Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
    - `_Py_ID`
    - `_Py_STR` (including the empty string)
    - one-character latin-1 singletons

    Now, when you intern a singleton, that exact singleton will be interned.

  * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

  * Intern `_Py_STR` singletons at startup.

  * Beef up the tests. Cover internal details (marked with `@cpython_only`).

  * Add lots of assertions

* Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364)

  * Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

  * Document immortality in some functions that take `const char *`

  This is PyUnicode_InternFromString;
  PyDict_SetItemString, PyObject_SetAttrString;
  PyObject_DelAttrString; PyUnicode_InternFromString;
  and the PyModule_Add convenience functions.

  Always point out a non-immortalizing alternative.

  * Don't immortalize user-provided attr names in _ctypes

* Immortalize names in code objects to avoid crash (GH-121903)

* Intern latin-1 one-byte strings at startup (GH-122303)

There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2024-09-27 13:28:48 -07:00
Dino Viehland
c8f5ca6810
[3.12] gh-122208: Don't delivery PyDict_EVENT_ADDED until it can't fail (#122327)
Don't delivery PyDict_EVENT_ADDED until it can't fail
2024-07-30 09:13:40 -07:00
Dong-hee Na
b9fcfa6024
gh-104717: Add comment about manual loop unrolling (gh-104718) 2023-05-21 21:08:28 +09:00
Eric Snow
b8f7ab5783
gh-104252: Immortalize Py_EMPTY_KEYS (gh-104253)
This was missed in gh-19474.  It matters for with a per-interpreter GIL since PyDictKeysObject.dk_refcnt breaks isolation and leads to races.
2023-05-10 07:28:40 -06:00
Eric Snow
743687434c
gh-102304: Move the Total Refcount to PyInterpreterState (gh-102545)
Moving it valuable with a per-interpreter GIL.  However, it is also useful without one, since it allows us to identify refleaks within a single interpreter or where references are escaping an interpreter.  This becomes more important as we move the obmalloc state to PyInterpreterState.

https://github.com/python/cpython/issues/102304
2023-03-21 11:46:09 -06:00
Inada Naoki
65fb7c4055
gh-102701: Fix overflow in dictobject.c (GH-102750) 2023-03-17 22:39:09 +09:00
Eric Snow
b45d14b886
gh-100227: Move dict_state.global_version to PyInterpreterState (gh-102338)
https://github.com/python/cpython/issues/100227
2023-03-09 08:16:30 -07:00
Eric Snow
5e5acd291f
gh-100227: Move next_keys_version to PyInterpreterState (gh-102335)
https://github.com/python/cpython/issues/100227
2023-03-08 18:04:16 -07:00
Eric Snow
cbb0aa71d0
gh-102304: Consolidate Direct Usage of _Py_RefTotal (gh-102514)
This simplifies further changes to _Py_RefTotal (e.g. make it atomic or move it to PyInterpreterState).

https://github.com/python/cpython/issues/102304
2023-03-08 12:03:50 -07:00
Irit Katriel
11a2c6ce51
gh-102192: Replace PyErr_Fetch/Restore etc by more efficient alternatives (in Objects/) (#102218) 2023-03-08 17:03:18 +00:00
Carl Meyer
1e703a4733
gh-102381: don't call watcher callback with dead object (#102382)
Co-authored-by: T. Wouters <thomas@python.org>
2023-03-07 17:10:58 -07:00
Mark Shannon
feec49c407
GH-101578: Normalize the current exception (GH-101607)
* Make sure that the current exception is always normalized.

* Remove redundant type and traceback fields for the current exception.

* Add new API functions: PyErr_GetRaisedException, PyErr_SetRaisedException

* Add new API functions: PyException_GetArgs, PyException_SetArgs
2023-02-08 09:31:12 +00:00
Victor Stinner
4246fe977d
gh-99845: Change _PyDict_KeysSize() return type to size_t (#99848)
* Change _PyDict_KeysSize() and shared_keys_usable_size() return type
  from signed (Py_ssize_t) to unsigned (size_t) type.
* new_values() argument type is now unsigned (size_t).
* init_inline_values() now uses size_t rather than int for the 'i'
  iterator variable.
* type.__sizeof__() implementation now uses unsigned (size_t) type.
2022-11-29 12:12:17 +01:00
Eric Snow
9db1e17c80
gh-81057: Move the global Dict-Related Versions to _PyRuntimeState (gh-99497)
We also move the global func version.

https://github.com/python/cpython/issues/81057
2022-11-16 10:37:29 -07:00
Victor Stinner
8211cf5d28
gh-99300: Replace Py_INCREF() with Py_NewRef() (#99530)
Replace Py_INCREF() and Py_XINCREF() using a cast with Py_NewRef()
and Py_XNewRef().
2022-11-16 18:34:24 +01:00
Victor Stinner
6dedf42527
gh-99300: Use Py_NewRef() in Objects/dictobject.c (#99333)
Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and
Py_XNewRef() in Objects/dictobject.c.
2022-11-10 16:27:53 +01:00
Carl Meyer
e82d977eb0
gh-91052: Add PyDict_Unwatch for unwatching a dictionary (#98055) 2022-10-07 17:37:46 -07:00
Carl Meyer
a4b7794887
GH-91052: Add C API for watching dictionaries (GH-31787) 2022-10-07 01:08:00 +01:00
Samuel
8563966be4
Fix minor comment typo in dictobject.c (GH-96960)
Fix a minor comment typo in the Objects/dictobject.c file.

Automerge-Triggered-By: GH:methane
2022-09-20 05:17:40 -07:00
Nikita Sobolev
30cc1901ef
gh-96364: Fix text signatures of __getitem__ for list and dict (GH-96365) 2022-09-09 17:37:02 +09:00
Matthias Görgens
4a6fa89465
Remove dead code in _PyDict_GetItemHint and rename to _PyDict_LookupIndex (GH-95948) 2022-08-18 10:19:21 +01:00
Mark Shannon
de388c0a7b
GH-95245: Store object values and dict pointers in single tagged pointer. (GH-95278) 2022-08-01 14:34:54 +01:00
Mark Shannon
27055d766a
GH-92678: Expose managed dict clear and visit functions (#95246) 2022-07-25 22:30:53 +01:00
Géry Ogam
a95138b2c5
bpo-43857: Improve the AttributeError message when deleting a missing attribute (#25424)
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
2022-05-05 06:37:26 -07:00
Mark Shannon
836b17c9c3
Add more stats for freelist use and allocations. (GH-92211) 2022-05-03 16:40:24 -06:00
Victor Stinner
804f2529d8
gh-91320: Use _PyCFunction_CAST() (#92251)
Replace "(PyCFunction)(void(*)(void))func" cast with
_PyCFunction_CAST(func).

Change generated by the command:

sed -i -e \
  's!(PyCFunction)(void(\*)(void)) *\([A-Za-z0-9_]\+\)!_PyCFunction_CAST(\1)!g' \
  $(find -name "*.c")
2022-05-03 21:42:14 +02:00
Victor Stinner
c14d7e4b81
bpo-47164: Add _PyASCIIObject_CAST() macro (GH-32191)
Add macros to cast objects to PyASCIIObject*, PyCompactUnicodeObject*
and PyUnicodeObject*: _PyASCIIObject_CAST(),
_PyCompactUnicodeObject_CAST() and _PyUnicodeObject_CAST(). Using
these new macros make the code more readable and check their argument
with: assert(PyUnicode_Check(op)).

Remove redundant assert(PyUnicode_Check(op)) in macros using directly
or indirectly these new CAST macros.

Replacing existing casts with these macros.
2022-03-31 09:59:27 +02:00
Mark Shannon
03c2a36b2b
bpo-46903: Handle str-subclasses in virtual instance dictionaries. (GH-31658) 2022-03-04 11:31:29 +00:00
Inada Naoki
3241cba4ec
dict: Fix refleak (GH-31650) 2022-03-03 14:30:58 +09:00
Inada Naoki
03642df1a1
dict: Internal cleanup (GH-31641)
* Make empty_key from split table to combined table.
* Use unicode_get_hash() when possible.
2022-03-02 19:05:12 +09:00
Inada Naoki
9833bb91e4
bpo-46845: Reduce dict size when all keys are Unicode (GH-31564) 2022-03-02 08:09:28 +09:00
Victor Stinner
042f31da55
bpo-45459: C API uses type names rather than structure names (GH-31528)
Thanks to the new pytypedefs.h, it becomes to use type names like
PyObject rather like structure names like "struct _object".
2022-02-24 17:51:59 +01:00
Inada Naoki
1e344684d8
dict: Add dk_log2_index_bytes (GH-31439) 2022-02-22 11:03:15 +00:00
Inada Naoki
5543d9c559
dict: Use DK_LOG_SIZE in hot loop. (GH-31405)
DK_LOG_SIZE(key) < 8 is faster than DK_SIZE(key) <= 0xff, at least on GCC.
2022-02-19 13:15:20 +09:00
Eric Snow
81c72044a1
bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. (gh-30928)
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code.  It is still used in a number of non-builtin stdlib modules.

The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime.  A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).

https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.

The core of the change is in:

* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers

I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings.  That check is added to the PR CI config.

The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()).  This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.

The following are not changed (yet):

* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init

https://bugs.python.org/issue46541
2022-02-08 13:39:07 -07:00
Mark Shannon
25db2b361b
bpo-46675: Allow object value arrays and split key dictionaries larger than 16 (GH-31191) 2022-02-08 11:50:38 +00:00
Victor Stinner
760349198d
bpo-46670: Remove unused macros in the Objects directory (GH-31193) 2022-02-07 16:21:41 +01:00
Mark Shannon
48be46ec1f
bpo-46072: Add some object layout and allocation stats (GH-31051) 2022-02-01 15:05:18 +00:00
Victor Stinner
ac1f152421
bpo-46417: Use _PyType_CAST() in Objects directory (GH-30764) 2022-01-21 23:33:43 +01:00
Mark Shannon
8319114fee
bpo-45947: Place dict and values pointer at fixed (negative) offset just before GC header. (GH-29879)
* Place __dict__ immediately before GC header for plain Python objects.

* Fix up lazy dict creation logic to use managed dict pointers.

* Manage values pointer, placing them directly before managed dict pointers.

* Convert hint-based load/store attr specialization target managed dict classes.

* Specialize LOAD_METHOD for managed dict objects.

* Remove unsafe _PyObject_GC_Calloc function.

* Remove unsafe _PyObject_GC_Malloc() function.

* Add comment explaning use of Py_TPFLAGS_MANAGED_DICT.
2021-12-07 16:02:53 +00:00
Dennis Sweeney
036fead695
bpo-45609: Specialize STORE_SUBSCR (GH-29242)
* Specialize STORE_SUBSCR for list[int], and dict[object]

* Adds _PyDict_SetItem_Take2 which consumes references to the key and values.
2021-11-19 10:30:37 +00:00
Christian Heimes
9942f42a93
bpo-45522: Allow to disable freelists on build time (GH-29056)
Freelists for object structs can now be disabled. A new ``configure``
option ``--without-freelists`` can be used to disable all freelists
except empty tuple singleton. Internal Py*_MAXFREELIST macros can now
be defined as 0 without causing compiler warnings and segfaults.

Signed-off-by: Christian Heimes <christian@python.org>
2021-10-21 06:12:20 -07:00
Mark Shannon
a8b9350964
bpo-45340: Don't create object dictionaries unless actually needed (GH-28802)
* Never change types' cached keys. It could invalidate inline attribute objects.

* Lazily create object dictionaries.

* Update specialization of LOAD/STORE_ATTR.

* Don't update shared keys version for deletion of value.

* Update gdb support to handle instance values.

* Rename SPLIT_KEYS opcodes to INSTANCE_VALUE.
2021-10-13 14:19:34 +01:00
Victor Stinner
d943d19172
bpo-45439: Move _PyObject_CallNoArgs() to pycore_call.h (GH-28895)
* Move _PyObject_CallNoArgs() to pycore_call.h (internal C API).
* _ssl, _sqlite and _testcapi extensions now call the public
  PyObject_CallNoArgs() function, rather than _PyObject_CallNoArgs().
* _lsprof extension is now built with Py_BUILD_CORE_MODULE macro
  defined to get access to internal _PyObject_CallNoArgs().
2021-10-12 08:38:19 +02:00
Victor Stinner
ce3489cfdb
bpo-45439: Rename _PyObject_CallNoArg() to _PyObject_CallNoArgs() (GH-28891)
Fix typo in the private _PyObject_CallNoArg() function name: rename
it to _PyObject_CallNoArgs() to be consistent with the public
function PyObject_CallNoArgs().
2021-10-12 00:42:23 +02:00
Mark Shannon
a7252f88d3
bpo-40116: Add insertion order bit-vector to dict values to allow dicts to share keys more freely. (GH-28520) 2021-10-06 13:19:53 +01:00
Serhiy Storchaka
f25f2e2e8c
Clean up initialization __class_getitem__ with Py_GenericAlias. (GH-28450)
The cast to PyCFunction is redundant. Overuse of redundant casts
can hide actual bugs.
2021-09-19 18:05:30 +03:00
Mark Shannon
064464fc38
bpo-45219: Factor dictkey indexing (GH-28389) 2021-09-17 12:20:51 +01:00