cpython/Python
Emma Smith 3b4333583f
gh-132983: Introduce _zstd bindings module (GH-133027)
* Add _zstd module for https://peps.python.org/pep-0784/

This commit introduces the `_zstd` module, with bindings to libzstd from
the pyzstd project. It also includes the Unix build system configuration.
Windows build system support will be integrated independently as it
depends on integration with cpython-source-deps.

* Add _zstd to modules

* Fix path for compression.zstd module

* Ignore _zstd module like _io

* Expand module state macros to improve code quality

Also removes module state references from the classes in the _zstd
module and instead uses PyType_GetModuleState()

* Remove backticks suggested in review

Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>

* Use critical sections to lock object state

This should avoid races and deadlocks.

* Remove compress/decompress and mark module as not reliant on the GIL

The `compress`/`decompress` functions will be moved to Python code for simplicity.
C implementations can always be re-added in the future.

Also, mark _zstd as not requiring the GIL.

* Lift critical section to avoid clang warning

* Respond to comments by picnixz

* Call out pyzstd explicitly in license description

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>

* Use a much more robust implementation for `get_zstd_state_from_type`

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

* Use PyList_GetItemRef for thread safety purposes

* Use a macro for the minimum supported version

* Remove const from primitive types

* Use PyMem_New in another spot

* Simplify error handling in _get_frame_size

* Another simplification of error handling in get_frame_info

* Rename _module_state to mod_state

* Rewrite comment explaining the context of the code

* Add link to pyzstd

* Add TODO about refactoring dict training code

* Use PyModule_AddObjectRef over PyModule_AddObject

PyModule_AddObject is soft-deprecated, so we should use PyModule_AddObjectRef

* Check result of OutputBufferGrow

* Simplify return logic in `add_constant_to_type`

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

* Ignore return value of _zstd_clear()

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

* Remove redundant comments

* Remove __reduce__ from ZstdDict

We should instead document that, to pickle a dictionary, a user should use
the `.dict_content` attribute.

* Use PyUnicode_FromFormat instead of a buffer

* Don't use C constants/types in error messages

* Make error messages easier to understand for Python users

* Lower minimum required version to 1.4.0

* Use casts and make slot function signatures correct

* Be consistent with CPython on const usage

* Make else clauses in line with PEP 7

* Fix over-indented blocks in argument clinic

* Add critical section around ZSTD_DCtx_setParameter

* Add a TODO about refactoring critical sections

* Use Py_UNREACHABLE

* Move bytes operations out of Py_BEGIN_ALLOW_THREADS

* Add TODO about ensuring a lock is held

* Remove asserts that may not be correct

* Add TODO to make ZstdDict and others GC objects

* Make objects GC tracked

* Remove unused include

* Fix some memory issues

* Fix refleaks on module and in ZstdDict

* Update configure to check for ZDICT_finalizeDictionary

* Properly check version in configure

* exit(1) if check fails

* Use AC_RUN_IFELSE

* Use a define() to re-use version check

* Actually properly set _zstd module status based on version
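
The last few items describe a build-time probe that compares the installed libzstd version against a minimum and fails if the library is too old. A minimal sketch of that comparison follows; it is an illustration, not the code from this commit. The `sketch_*` names and `SKETCH_MIN_VERSION` are hypothetical, and the minimum of 1.4.0 is taken from the bullet above. The packing scheme (major * 10000 + minor * 100 + release) matches how libzstd defines `ZSTD_VERSION_NUMBER`.

```c
#include <assert.h>

/* Hypothetical minimum, encoding the "1.4.0" named in the commit message. */
#define SKETCH_MIN_VERSION 10400

/* Pack major.minor.release into a single integer using the same scheme
 * libzstd uses for ZSTD_VERSION_NUMBER: major*100*100 + minor*100 + release.
 * The asserts guard the assumption that minor and release fit in two digits. */
int
sketch_version_number(int major, int minor, int release)
{
    assert(minor >= 0 && minor < 100);
    assert(release >= 0 && release < 100);
    return major * 100 * 100 + minor * 100 + release;
}

/* Return an exit-status style result: 0 if the version is new enough,
 * 1 otherwise, so a probe program can return it straight from main(). */
int
sketch_check_version(int version_number, int minimum)
{
    return version_number >= minimum ? 0 : 1;
}
```

In a real probe, `main()` would include `zstd.h` and return `sketch_check_version(ZSTD_VERSION_NUMBER, SKETCH_MIN_VERSION)`. A configure check built on `AC_RUN_IFELSE` can then treat a non-zero exit status as "library too old".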

---------

Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2025-05-04 01:29:55 +00:00
clinic gh-128384: Use a context variable for warnings.catch_warnings (gh-130010) 2025-04-09 16:18:54 -07:00
frozen_modules
_contextvars.c gh-128384: Use a context variable for warnings.catch_warnings (gh-130010) 2025-04-09 16:18:54 -07:00
_warnings.c gh-131927: Prevent emitting optimizer warnings twice in the REPL (#131993) 2025-04-12 11:34:36 +01:00
asdl.c
asm_trampoline.S gh-120400: Support Linux perf profile to see Python calls on RISC-V architecture (#120089) 2024-06-12 14:24:46 +01:00
assemble.c gh-87859: Track Code Object Local Kinds For Arguments (gh-132980) 2025-04-29 02:21:47 +00:00
ast.c gh-132661: Implement PEP 750 (#132662) 2025-04-30 11:46:41 +02:00
ast_opt.c gh-132661: Implement PEP 750 (#132662) 2025-04-30 11:46:41 +02:00
ast_unparse.c gh-132661: Implement PEP 750 (#132662) 2025-04-30 11:46:41 +02:00
bltinmodule.c GH-124715: Move trashcan mechanism into Py_Dealloc (GH-132280) 2025-04-30 11:37:53 +01:00
bootstrap_hash.c GH-131238: Core header refactor (GH-131250) 2025-03-17 09:19:04 +00:00
brc.c Fix typos in documentation and comments (#119763) 2024-06-04 10:22:22 +00:00
bytecodes.c gh-132744: Check recursion limit in CALL_PY_GENERAL (GH-132746) 2025-05-02 17:36:29 +01:00
ceval.c Remove duplicate includes: Python/{bytecodes,ceval,optimizer_analysis}.c (#132622) 2025-05-01 12:07:53 +01:00
ceval_gil.c gh-132859: Run debugger scripts in their own namespaces (#132860) 2025-04-23 23:40:24 +00:00
ceval_macros.h gh-132758: Fix tail call and pystats builds (GH-132759) 2025-04-23 18:17:35 +08:00
codecs.c gh-133036: Deprecate codecs.open (#133038) 2025-04-30 10:11:09 +09:00
codegen.c gh-133279: Assert with HAS_TARGET in the codegen_addop_j function (#133280) 2025-05-02 13:52:48 +01:00
compile.c gh-130907: Treat all module-level annotations as conditional (#131550) 2025-04-28 06:10:28 -07:00
condvar.h gh-104530: Enable native Win32 condition variables by default (GH-104531) 2024-02-02 13:50:51 +00:00
config_common.h gh-76785: Add PyInterpreterConfig Helpers (gh-117170) 2024-04-02 20:35:52 +00:00
context.c gh-132002: Fix crash of ContextVar on unhashable str subtype (#132003) 2025-04-02 14:48:47 +03:00
critical_section.c gh-114203: Optimise simple recursive critical sections (#128126) 2024-12-23 13:31:33 +01:00
crossinterp.c gh-132775: Add _PyPickle_GetXIData() (gh-133107) 2025-04-30 17:34:05 -06:00
crossinterp_data_lookup.h gh-132775: Add _PyBytes_GetXIData() (gh-133101) 2025-04-28 12:52:36 -06:00
crossinterp_exceptions.h gh-132781: fix refleaks in crossinterp_exceptions.h post gh-132782 (#132989) 2025-04-26 12:14:14 +02:00
dtoa.c gh-131238: Add explicit includes to pycore headers (#131257) 2025-03-17 12:32:43 +01:00
dup2.c gh-108765: Python.h no longer includes <unistd.h> (#108783) 2023-09-02 16:50:18 +02:00
dynamic_annotations.c
dynload_hpux.c gh-88402: Add new sysconfig variables on Windows (GH-110049) 2023-10-04 22:50:29 +00:00
dynload_shlib.c gh-131238: Remove more includes from pycore_interp.h (#131480) 2025-03-19 23:01:32 +01:00
dynload_stub.c gh-88402: Add new sysconfig variables on Windows (GH-110049) 2023-10-04 22:50:29 +00:00
dynload_win.c gh-131238: Remove pycore_runtime.h from pycore_pystate.h (#131356) 2025-03-19 17:33:24 +01:00
emscripten_signal.c GH-108614: Unbreak emscripten build (GH-109132) 2023-09-08 17:54:45 +01:00
emscripten_trampoline.c gh-128627: Skip wasm-gc on iOS Safari where it's broken (#130418) 2025-02-24 07:26:04 +08:00
errors.c gh-132781: Cleanup Code Related to NotShareableError (gh-132782) 2025-04-25 14:43:38 -06:00
executor_cases.c.h gh-132744: Check recursion limit in CALL_PY_GENERAL (GH-132746) 2025-05-02 17:36:29 +01:00
fileutils.c gh-124476: Fix decoding from the locale encoding in the C.UTF-8 locale (GH-132477) 2025-04-14 21:32:41 +03:00
flowgraph.c gh-132775: Add _PyCode_ReturnsOnlyNone() (gh-132981) 2025-04-28 20:12:52 -06:00
formatter_unicode.c GH-131238: Core header refactor (GH-131250) 2025-03-17 09:19:04 +00:00
frame.c gh-131173: Improve exception handling during take_ownership processing (#132620) 2025-04-17 13:38:34 -07:00
frozen.c GH-89435: os.path should not be a frozen module (#126924) 2024-11-22 18:50:30 +00:00
frozenmain.c gh-105716: Fix _PyInterpreterState_IsRunningMain() For Embedders (gh-117140) 2024-03-21 18:20:20 -06:00
future.c gh-126139: Improve error message location for future statement with unknown feature (#126140) 2024-10-29 23:57:59 +00:00
gc.c GH-124715: Move trashcan mechanism into Py_Dealloc (GH-132280) 2025-04-30 11:37:53 +01:00
gc_free_threading.c GH-124715: Move trashcan mechanism into Py_Dealloc (GH-132280) 2025-04-30 11:37:53 +01:00
gc_gil.c gh-100240: Use a consistent implementation for freelists (#121934) 2024-07-22 12:08:27 -04:00
generated_cases.c.h gh-132744: Check recursion limit in CALL_PY_GENERAL (GH-132746) 2025-05-02 17:36:29 +01:00
getargs.c gh-132987: Support __index__() for "k" and "K" formats in PyArg_Parse (GH-132988) 2025-04-26 17:14:18 +03:00
getcompiler.c
getcopyright.c gh-126133: Only use start year in PSF copyright, remove end years (#126236) 2024-11-12 15:59:19 +02:00
getopt.c Make the Python CLI error message style more consistent (GH-128129) 2025-01-11 11:17:35 +02:00
getplatform.c
getversion.c gh-119132: Update sys.version to identify free-threaded or not. (gh-119134) 2024-05-18 19:44:40 +00:00
hamt.c GH-124715: Move trashcan mechanism into Py_Dealloc (GH-132280) 2025-04-30 11:37:53 +01:00
hashtable.c gh-111545: Add Py_HashPointer() function (#112096) 2023-12-06 15:09:22 +01:00
import.c gh-132775: Drop PyUnstable_InterpreterState_GetMainModule() (gh-132978) 2025-04-28 12:46:22 -06:00
importdl.c Remove duplicate includes: Python/importdl.c (#132623) 2025-04-18 02:49:19 +01:00
index_pool.c gh-130740: Move some stdbool.h includes after Python.h (#130738) 2025-03-02 09:56:49 +00:00
initconfig.c gh-107954: Add audit event to PyConfig_Set() (#132958) 2025-04-25 18:30:39 +02:00
instruction_sequence.c GH-124715: Move trashcan mechanism into Py_Dealloc (GH-132280) 2025-04-30 11:37:53 +01:00
instrumentation.c gh-132336: Mark a few "slow path" functions used by the interpreter loop as noinline (#132337) 2025-04-10 10:41:15 +02:00
interpconfig.c GH-131238: Core header refactor (GH-131250) 2025-03-17 09:19:04 +00:00
intrinsics.c Get rid of ERROR_IF's "label" parameter (GH-132654) 2025-04-29 17:21:53 -07:00
jit.c gh-132661: Implement PEP 750 (#132662) 2025-04-30 11:46:41 +02:00
legacy_tracing.c gh-111178: fix UBSan failures for Python/legacy_tracing.c (#131611) 2025-03-24 11:00:32 +01:00
lock.c gh-111178: Fix function signatures to fix undefined behavior (#131191) 2025-03-14 09:52:15 +00:00
marshal.c gh-131238: Add explicit includes to pycore headers (#131257) 2025-03-17 12:32:43 +01:00
modsupport.c gh-132909: handle overflow for 'K' format in do_mkvalue (#132911) 2025-04-25 11:02:57 +00:00
mysnprintf.c
mystrtoul.c gh-108765: Python.h no longer includes <ctype.h> (#108831) 2023-09-03 18:54:27 +02:00
object_stack.c gh-100240: Use a consistent implementation for freelists (#121934) 2024-07-22 12:08:27 -04:00
opcode_targets.h gh-100239: specialize BINARY_OP/SUBSCR for list-slice (#132626) 2025-05-01 10:28:52 +00:00
optimizer.c GH-131726: Split up _CHECK_VALIDITY_AND_SET_IP (GH-131810) 2025-04-01 16:55:05 -07:00
optimizer_analysis.c Remove duplicate includes: Python/{bytecodes,ceval,optimizer_analysis}.c (#132622) 2025-05-01 12:07:53 +01:00
optimizer_bytecodes.c gh-131798: JIT - Use sym_new_type instead of sym_new_not_null for _BUILD_STRING, _BUILD_SET (GH-132564) 2025-04-27 20:30:28 +08:00
optimizer_cases.c.h gh-132744: Check recursion limit in CALL_PY_GENERAL (GH-132746) 2025-05-02 17:36:29 +01:00
optimizer_symbols.c GH-131331: Rename "not" to "invert" (GH-131334) 2025-03-20 16:59:41 -07:00
parking_lot.c gh-76785: Improved Subinterpreters Compatibility with 3.12 (1/2) (gh-126704) 2024-11-11 15:58:46 -07:00
pathconfig.c gh-111924: Fix data races when swapping allocators (gh-130287) 2025-02-20 11:31:15 -05:00
perf_jit_trampoline.c gh-131238: Add explicit includes to pycore headers (#131257) 2025-03-17 12:32:43 +01:00
perf_trampoline.c gh-131238: Remove includes from pycore_interp.h (#131495) 2025-03-20 11:35:23 +00:00
preconfig.c gh-106320: Remove private pylifecycle.h functions (#106400) 2023-07-04 09:41:43 +00:00
pyarena.c Chore: Fix typo in pyarena.c (#126527) 2024-11-07 16:37:41 +01:00
pyctype.c
pyfpe.c
pyhash.c gh-122854: Add Py_HashBuffer() function (#122855) 2024-08-30 15:42:27 +00:00
pylifecycle.c GH-124715: Move trashcan mechanism into Py_Dealloc (GH-132280) 2025-04-30 11:37:53 +01:00
pymath.c
pystate.c GH-124715: Move trashcan mechanism into Py_Dealloc (GH-132280) 2025-04-30 11:37:53 +01:00
pystrcmp.c gh-108767: Replace ctype.h functions with pyctype.h functions (#108772) 2023-09-01 18:36:53 +02:00
pystrhex.c gh-108765: pystrhex: Replace stdlib.h abs() with Py_ABS() (#108830) 2023-09-02 23:15:54 +02:00
pystrtod.c gh-120026: soft deprecate Py_HUGE_VAL macro (#120027) 2024-11-01 22:04:31 +00:00
Python-ast.c gh-132661: Implement PEP 750 (#132662) 2025-04-30 11:46:41 +02:00
Python-tokenize.c gh-111178: Fix function signatures for test_types (#131455) 2025-03-19 13:46:17 +00:00
pythonrun.c gh-131238: Remove more includes from pycore_interp.h (#131480) 2025-03-19 23:01:32 +01:00
pytime.c gh-131238: Remove pycore_runtime.h from pycore_pystate.h (#131356) 2025-03-19 17:33:24 +01:00
qsbr.c gh-131238: Add explicit includes to pycore headers (#131257) 2025-03-17 12:32:43 +01:00
README
remote_debug.h gh-91048: Chain some exceptions in _testexternalinspection.c (#132970) 2025-05-03 01:35:30 +02:00
remote_debugging.c gh-91048: Refactor _testexternalinspection and add Windows support (#132852) 2025-04-25 14:12:16 +01:00
specialize.c gh-100239: specialize BINARY_OP/SUBSCR for list-slice (#132626) 2025-05-01 10:28:52 +00:00
stackrefs.c GH-132508: Use tagged integers on the evaluation stack for the last instruction offset (GH-132545) 2025-04-29 18:00:35 +01:00
stdlib_module_names.h gh-132983: Introduce _zstd bindings module (GH-133027) 2025-05-04 01:29:55 +00:00
structmember.c gh-132685: fix thread safety of PyMember_GetOne with _Py_T_OBJECT (#132690) 2025-04-18 21:03:42 +05:30
suggestions.c GH-131238: Core header refactor (GH-131250) 2025-03-17 09:19:04 +00:00
symtable.c gh-132661: Implement PEP 750 (#132662) 2025-04-30 11:46:41 +02:00
sysmodule.c gh-132950: Check for Py_SUPPORTS_REMOTE_DEBUG in sys.is_remote_debug_enabled (#132959) 2025-04-25 16:38:48 +00:00
thread.c gh-131238: Add explicit includes to pycore headers (#131257) 2025-03-17 12:32:43 +01:00
thread_nt.h GH-131296: Add missing UNREACHABLE mark in thread_nt.h (GH-131589) 2025-03-31 20:28:35 +01:00
thread_pthread.h gh-130115: fix thread identifiers for 32-bit musl (#130391) 2025-04-04 16:31:37 +02:00
thread_pthread_stubs.h gh-125161: return non zero value in pthread_self on wasi (#125303) 2024-10-13 20:59:41 +05:30
tier2_engine.md Docs: fix spelling of the word 'transferring' (#116641) 2024-03-13 23:53:32 +01:00
traceback.c GH-124715: Move trashcan mechanism into Py_Dealloc (GH-132280) 2025-04-30 11:37:53 +01:00
tracemalloc.c gh-131296: fix clang-cl warning in tracemalloc.c (#131514) 2025-03-22 10:38:47 +01:00
uniqueid.c gh-128923: Use zero to indicate unassigned unique id (#128925) 2025-01-17 16:42:27 +01:00
vm-state.md gh-133079: Remove Py_C_RECURSION_LIMIT & PyThreadState.c_recursion_remaining (GH-133080) 2025-04-29 12:56:20 +02:00

Miscellaneous source files for the main Python shared library