Commit graph

57 commits

Author SHA1 Message Date
Petr Viktorin
49f6beb56a
[3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) (#123065)
This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.

* Allow interned strings to be mortal, and fix related issues (GH-120520)

  * Add an InternalDocs file describing how interning should work and how to use it.

  * Add internal functions to *explicitly* request what kind of interning is done:
    - `_PyUnicode_InternMortal`
    - `_PyUnicode_InternImmortal`
    - `_PyUnicode_InternStatic`

  * Switch uses of `PyUnicode_InternInPlace` to those.

  * Disallow using `_Py_SetImmortal` on strings directly.
    You should use `_PyUnicode_InternImmortal` instead:
    - Strings should be interned before immortalization, otherwise you're possibly
      interning a immortalizing copy.
    - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
      `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
      backports, as they are now part of public API and version-specific ABI.

  * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

   Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
    - `_Py_ID`
    - `_Py_STR` (including the empty string)
    - one-character latin-1 singletons

    Now, when you intern a singleton, that exact singleton will be interned.

  * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

  * Intern `_Py_STR` singletons at startup.

  * Beef up the tests. Cover internal details (marked with `@cpython_only`).

  * Add lots of assertions

* Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364)

  * Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

  * Document immortality in some functions that take `const char *`

  This is PyUnicode_InternFromString;
  PyDict_SetItemString, PyObject_SetAttrString;
  PyObject_DelAttrString; PyUnicode_InternFromString;
  and the PyModule_Add convenience functions.

  Always point out a non-immortalizing alternative.

  * Don't immortalize user-provided attr names in _ctypes

* Immortalize names in code objects to avoid crash (GH-121903)

* Intern latin-1 one-byte strings at startup (GH-122303)

There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2024-09-27 13:28:48 -07:00
Eric Snow
0d5fe2c7b4
[3.12] gh-119213: Be More Careful About _PyArg_Parser.kwtuple Across Interpreters (gh-119331) (gh-119425)
_PyArg_Parser holds static global data generated for modules by Argument Clinic.  The _PyArg_Parser.kwtuple field is a tuple object, even though it's stored within a static global.  In some cases the tuple is statically allocated and thus it's okay that it gets shared by multiple interpreters.  However, in other cases the tuple is set lazily, allocated from the heap using the active interprepreter at the point the tuple is needed.

This is a problem once that interpreter is destroyed since _PyArg_Parser.kwtuple becomes at dangling pointer, leading to crashes.  It isn't a problem if the tuple is allocated under the main interpreter, since its lifetime is bound to the lifetime of the runtime.  The solution here is to temporarily switch to the main interpreter.  The alternative would be to always statically allocate the tuple.

This change also fixes a bug where only the most recent parser was added to the global linked list.

(cherry picked from commit 81865002ae)
2024-05-22 22:26:58 +00:00
Miss Islington (bot)
9ae49e3f3b
gh-88745: Add _winapi.CopyFile2 and update shutil.copy2 to use it (GH-105055)
(cherry picked from commit cda1bd3c9d)

Co-authored-by: Steve Dower <steve.dower@python.org>
2023-05-30 20:33:17 +01:00
Marta Gómez Macías
6715f91edc
gh-102856: Python tokenizer implementation for PEP 701 (#104323)
This commit replaces the Python implementation of the tokenize module with an implementation
that reuses the real C tokenizer via a private extension module. The tokenize module now implements
a compatibility layer that transforms tokens from the C tokenizer into Python tokenize tokens for backward
compatibility.

As the C tokenizer does not emit some tokens that the Python tokenizer provides (such as comments and non-semantic newlines), a new special mode has been added to the C tokenizer mode that currently is only used via
the extension module that exposes it to the Python layer. This new mode forces the C tokenizer to emit these new extra tokens and add the appropriate metadata that is needed to match the old Python implementation.

Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2023-05-21 01:03:02 +01:00
Matthias Görgens
6e39fa1955
gh-94906: Support multiple steps in math.nextafter (#103881)
This PR updates `math.nextafter` to add a new `steps` argument. The behaviour is as though `math.nextafter` had been called `steps` times in succession.

---------

Co-authored-by: Mark Dickinson <mdickinson@enthought.com>
2023-05-19 21:03:49 +01:00
Carl Meyer
0589c6a4d3
gh-104615: don't make unsafe swaps in apply_static_swaps (#104620) 2023-05-18 21:22:03 +00:00
Jelle Zijlstra
24d8b88420
gh-103763: Implement PEP 695 (#103764)
This implements PEP 695, Type Parameter Syntax. It adds support for:

- Generic functions (def func[T](): ...)
- Generic classes (class X[T](): ...)
- Type aliases (type X = ...)
- New scoping when the new syntax is used within a class body
- Compiler and interpreter changes to support the new syntax and scoping rules 

Co-authored-by: Marc Mueller <30130371+cdce8p@users.noreply.github.com>
Co-authored-by: Eric Traut <eric@traut.com>
Co-authored-by: Larry Hastings <larry@hastings.org>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2023-05-15 20:36:23 -07:00
Irit Katriel
2c2dc61e8d
gh-104240: make _PyCompile_CodeGen support different compilation modes (#104241) 2023-05-07 18:47:28 +01:00
Jelle Zijlstra
04f6733275
gh-102500: Implement PEP 688 (#102521)
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
2023-05-04 07:59:46 -07:00
Irit Katriel
80b714835d
gh-87092: Expose assembler to unit tests (#103988) 2023-05-01 22:29:30 +01:00
Itamar Ostricher
a474e04388
gh-97696: asyncio eager tasks factory (#102853)
Co-authored-by: Jacob Bower <jbower@meta.com>
Co-authored-by: Carol Willing <carolcode@willingconsulting.com>
2023-05-01 15:10:13 -06:00
Erlend E. Aasland
222c63fc6b
gh-103015: Add entrypoint keyword param to sqlite3.Connection.load_extension (#103073) 2023-04-26 21:22:03 +02:00
Irit Katriel
e1e9bab006
gh-102778: Add sys.last_exc, deprecate sys.last_type, sys.last_value,sys.last_traceback (#102779) 2023-03-18 11:47:11 +00:00
Steve Dower
cb35882773
gh-102519: Add os.listdrives, os.listvolumes and os.listmounts on Windows (GH-102544) 2023-03-10 12:21:37 +00:00
Erlend E. Aasland
eb0c485b6c
gh-101819: Remove _PyWindowsConsoleIO_Type from the Windows DLL (GH-101904)
Automerge-Triggered-By: GH:erlend-aasland
2023-02-15 05:07:59 -08:00
Gregory P. Smith
052f53d65d
gh-39615: Add warnings.warn() skip_file_prefixes support (#100840)
`warnings.warn()` gains the ability to skip stack frames based on code
filename prefix rather than only a numeric `stacklevel=` via a new
`skip_file_prefixes=` keyword argument.
2023-01-27 18:35:14 -08:00
Gregory P. Smith
894f2c3c16
gh-100228: Warn from os.fork() if other threads exist. (#100229)
Not comprehensive, best effort warning. There are cases when threads exist on some platforms that this code cannot detect. macOS when API permissions allow and Linux with a readable /proc procfs present are the currently supported cases where a warning should show up reliably.

Starting with a DeprecationWarning for now, it is less disruptive than something like RuntimeWarning and most likely to only be seen in people's CI tests - a good place to start with this messaging.
2022-12-29 14:41:39 -08:00
Eric Snow
cda9f0236f
gh-81057: Move OS-Related Globals to _PyRuntimeState (gh-100082)
https://github.com/python/cpython/issues/81057
2022-12-08 15:38:06 -07:00
colorfulappl
0da728387c
gh-64490: Fix bugs in argument clinic varargs processing (#32092) 2022-11-24 20:56:50 +01:00
colorfulappl
c450c8c9ed
gh-96002: Add functional test for Argument Clinic (#96178)
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
2022-11-21 15:08:45 +01:00
Nikita Sobolev
a3360facba
gh-99284: [ctypes] remove _use_broken_old_ctypes_structure_semantics_ (GH-99285)
It was untested and undocumented. No code has been found in the wild that ever used it.
2022-11-18 22:25:32 -08:00
Irit Katriel
a3ac9232f8
gh-87092: expose the compiler's codegen to python for unit tests (GH-99111) 2022-11-14 13:56:40 +00:00
Erlend E. Aasland
c95f554a40
gh-83638: Add sqlite3.Connection.autocommit for PEP 249 compliant behaviour (#93823)
Introduce the autocommit attribute to Connection and the autocommit
parameter to connect() for PEP 249-compliant transaction handling.

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Géry Ogam <gery.ogam@gmail.com>
2022-11-12 23:44:41 +01:00
Mark Shannon
1e197e63e2
GH-96421: Insert shim frame on entry to interpreter (GH-96319)
* Adds EXIT_INTERPRETER instruction to exit PyEval_EvalDefault()

* Simplifies RETURN_VALUE, YIELD_VALUE and RETURN_GENERATOR instructions as they no longer need to check for entry frames.
2022-11-10 12:34:57 +00:00
Kumar Aditya
be0d5008b3
GH-90699: Remove remaining _Py_IDENTIFIER stdlib usage (GH-99067) 2022-11-07 12:06:23 -08:00
Pablo Galindo Salgado
99e2e60cb2
gh-99139: Improve NameError error suggestion for instances (#99140) 2022-11-06 13:52:06 +00:00
Kaushik Kulkarni
67ade403a2
gh-98284: better error message for undefined abstractmethod (#97971) 2022-11-05 09:31:57 -07:00
Kumar Aditya
0ee59a9ca3
GH-90699: Remove _Py_IDENTIFIER usage from _ctypes (GH-99054) 2022-11-03 13:20:10 -07:00
Kumar Aditya
18fc232e07
GH-90699: Remove _Py_IDENTIFIER usage from _asyncio module (#99010) 2022-11-02 10:16:06 -07:00
Kumar Aditya
780757ac58
GH-90699: Remove _Py_IDENTIFIER usage from _json module (GH-98956) 2022-11-02 09:03:38 -07:00
Pablo Galindo Salgado
7cfbb49fcd
gh-91058: Add error suggestions to 'import from' import errors (#98305) 2022-10-25 23:56:59 +01:00
Noam Cohen
a371a7e03e
gh-95023: Added os.setns and os.unshare functions (#95046)
Added os.setns and os.unshare to easily switch between namespaces
on Linux.

Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: CAM Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Victor Stinner <vstinner@python.org>
2022-10-20 11:08:54 +02:00
Victor Stinner
1863302d61
gh-97669: Create Tools/build/ directory (#97963)
Create Tools/build/ directory. Move the following scripts from
Tools/scripts/ to Tools/build/:

* check_extension_modules.py
* deepfreeze.py
* freeze_modules.py
* generate_global_objects.py
* generate_levenshtein_examples.py
* generate_opcode_h.py
* generate_re_casefix.py
* generate_sre_constants.py
* generate_stdlib_module_names.py
* generate_token.py
* parse_html5_entities.py
* smelly.py
* stable_abi.py
* umarshal.py
* update_file.py
* verify_ensurepip_wheels.py

Update references to these scripts.
2022-10-17 12:01:00 +02:00
Noam Cohen
5405537813
gh-95011: Migrate syslog module to Argument Clinic (GH-95012) 2022-10-08 21:31:57 +03:00
Nikita Sobolev
24a6645894
gh-97955: Migrate zoneinfo to Argument Clinic (#97958) 2022-10-07 11:06:23 -07:00
Nikita Sobolev
83cbe84dc2
gh-64373: Convert _functools to Argument Clinic (#96640) 2022-10-07 10:36:40 -07:00
Steve Dower
de33df27aa
gh-89545: Updates platform module to use new internal _wmi module on Windows to directly query OS properties (GH-96289) 2022-09-07 21:09:20 +01:00
Gregory P. Smith
511ca94520
gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96499)
Integer to and from text conversions via CPython's bignum `int` type is not safe against denial of service attacks due to malicious input. Very large input strings with hundred thousands of digits can consume several CPU seconds.

This PR comes fresh from a pile of work done in our private PSRT security response team repo.

Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews via the private PSRT repo via many others (see the NEWS entry in the PR).

<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->

I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backports PRs already exist. See the issue for links.
2022-09-02 09:35:08 -07:00
Irit Katriel
420f39f457
gh-93678: add _testinternalcapi.optimize_cfg() and test utils for compiler optimization unit tests (GH-96007) 2022-08-24 11:02:53 +01:00
Eric Snow
6f6a4e6cc5
gh-90928: Statically Initialize the Keywords Tuple in Clinic-Generated Code (gh-95860)
We only statically initialize for core code and builtin modules.  Extension modules still create
the tuple at runtime.  We'll solve that part of interpreter isolation separately.

This change includes generated code. The non-generated changes are in:

* Tools/clinic/clinic.py
* Python/getargs.c
* Include/cpython/modsupport.h
* Makefile.pre.in (re-generate global strings after running clinic)
* very minor tweaks to Modules/_codecsmodule.c and Python/Python-tokenize.c

All other changes are generated code (clinic, global strings).
2022-08-11 15:25:49 -06:00
Serhiy Storchaka
6fd4c8ec77
gh-93741: Add private C API _PyImport_GetModuleAttrString() (GH-93742)
It combines PyImport_ImportModule() and PyObject_GetAttrString()
and saves 4-6 lines of code on every use.

Add also _PyImport_GetModuleAttr() which takes Python strings as arguments.
2022-06-14 07:15:26 +03:00
Kumar Aditya
9331087966
GH-90699: use statically allocated strings in typeobject.c (gh-93751) 2022-06-13 01:38:18 +09:00
Serhiy Storchaka
3473817106
gh-91162: Support splitting of unpacked arbitrary-length tuple over TypeVar and TypeVarTuple parameters (alt) (GH-93412)
For example:

  A[T, *Ts][*tuple[int, ...]] -> A[int, *tuple[int, ...]]
  A[*Ts, T][*tuple[int, ...]] -> A[*tuple[int, ...], int]
2022-06-12 16:22:01 +03:00
Serhiy Storchaka
9d25db9db1
gh-91162: Fix substitution of unpacked tuples in generic aliases (GH-92335) 2022-05-08 18:32:32 +03:00
Serhiy Storchaka
1ed8d035f1
gh-87390: Fix starred tuple equality and pickling (GH-92337) 2022-05-05 20:16:06 +03:00
Serhiy Storchaka
e8c2f72b94
bpo-43224: Implement substitution of unpacked TypeVarTuple in C (GH-31828)
Co-authored-by: Matthew Rahtz <mrahtz@gmail.com>
2022-04-30 08:22:46 +03:00
Dennis Sweeney
37965d2fb4
gh-78607: Replace __ltrace__ with __lltrace__ (GH-91619) 2022-04-16 18:57:00 -04:00
Irit Katriel
d4c4a76ed1
gh-89770: Implement PEP-678 - Exception notes (GH-31317) 2022-04-16 19:59:52 +01:00
Inada Naoki
4216dce04b
bpo-47000: Make io.text_encoding() respects UTF-8 mode (GH-32003)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2022-04-04 11:46:57 +09:00
Eric Snow
21412d037b
bpo-46541: Add a Comment About When to Use _Py_DECLARE_STR(). (gh-32063)
In a gh-32003 comment, I realized it wasn't very clear how _Py_DECLARE_STR() should be used.  This changes adds a comment to clarify.

https://bugs.python.org/issue46541
2022-03-23 09:52:50 -06:00