* [3.9] gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648) (GH-133944)
If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().
_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58623)
(cherry picked from commit 6279eb8c07)
(cherry picked from commit a75953b347)
(cherry picked from commit 0c33e5baed)
(cherry picked from commit 8b528cacbb)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
When ValueError is raised if an integer is larger than the limit,
mention sys.set_int_max_str_digits() in the error message.
(cherry picked from commit e841ffc915)
Co-authored-by: Ned Deily <nad@python.org>
gh-97616: list_resize() checks for integer overflow (GH-97617)
Fix multiplying a list by an integer (list *= int): detect the
integer overflow when the new allocated length is close to the
maximum size. Issue reported by Jordan Limor.
list_resize() now checks for integer overflow before multiplying the
new allocated length by the list item size (sizeof(PyObject*)).
(cherry picked from commit a5f092f3c4)
Co-authored-by: Victor Stinner <vstinner@python.org>
* Correctly pre-check for int-to-str conversion (#96537)
Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)
The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.
The justification for the current check. The C code check is:
```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```
In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$
From this it follows that
$$\frac{M}{3L} < \frac{s-1}{10}$$
hence that
$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$
So
$$2^{L(s-1)} > 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
If the error handler returns position less or equal than the starting
position of non-encodable characters, most of built-in encoders didn't
properly re-size the output buffer. This led to out-of-bounds writes,
and segfaults.
(cherry picked from commit 18b07d773e)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
The left-hand side expression of the if-check can be converted to a
constant by the compiler, but the addition on the right-hand side is
performed during runtime.
Move the addition from the right-hand side to the left-hand side by
turning it into a subtraction there. Since the values are known to
be large enough to not turn negative, this is a safe operation.
Prevents a very unlikely integer overflow on 32 bit systems.
Fixes GH-91421.
(cherry picked from commit 0859368335)
Co-authored-by: Tobias Stoeckmann <stoeckmann@users.noreply.github.com>
Co-authored-by: Andrew Svetlov <andrew.svetlov@gmail.com>
(cherry picked from commit 8be7c2bc5a)
Co-authored-by: Dave Goncalves <davegoncalves@gmail.com>
Rename the private undocumented float.__set_format__() method to
float.__setformat__() to fix a typo introduced in Python 3.7. The
method is only used by test_float.
The change enables again test_float tests on the float format which
were previously skipped because of the typo.
The typo was introduced in Python 3.7 by bpo-20185
in commit b5c51d3dd9.
(cherry picked from commit 7d03c8be5a)
Ensure strong references are acquired whenever using `set_next()`. Added randomized test cases for `__eq__` methods that sometimes mutate sets when called.
(cherry picked from commit 4a66615ba7)
Fix a race condition on setting a type __bases__ attribute: the
internal function add_subclass() now gets the
PyTypeObject.tp_subclasses member after calling PyWeakref_NewRef()
which can trigger a garbage collection which can indirectly modify
PyTypeObject.tp_subclasses.
(cherry picked from commit f1c6ae3270)
Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Victor Stinner <vstinner@python.org>
The docstrings for MappingProxyType's keys(), values(), and items()
methods were never updated to reflect the changes that Python 3 brought
to these APIs, namely returning views rather than lists.
(cherry picked from commit 2d10fa9bc4)
Co-authored-by: Joshua Bronson <jabronson@gmail.com>
* Use Py_EnterRecursiveCall() in issubclass()
Reviewed-by: Gregory P. Smith <greg@krypto.org> [Google]
(cherry picked from commit 423fa1c181)
Co-authored-by: Dennis Sweeney <36520290+sweeneyde@users.noreply.github.com>
They support now splitting escape sequences between input chunks.
Add the third parameter "final" in codecs.raw_unicode_escape_decode().
It is True by default to match the former behavior.
(cherry picked from commit 39aa98346d)
They support now splitting escape sequences between input chunks.
Add the third parameter "final" in codecs.unicode_escape_decode().
It is True by default to match the former behavior.
(cherry picked from commit c96d1546b1)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
It happened with fast range iterator when the calculated stop = start + step * len
was out of the C long range.
(cherry picked from commit 936f6a16b9)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* bpo-45083: Include the exception class qualname when formatting an exception (GH-28119)
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>
(cherry picked from commit b4b6342848)
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Fix a crash at Python exit when a deallocator function removes the
last strong reference to a heap type.
Don't read type memory after calling basedealloc() since
basedealloc() can deallocate the type and free its memory.
_PyMem_IsPtrFreed() argument is now constant.
(cherry picked from commit 615069eb08)
Co-authored-by: Victor Stinner <vstinner@python.org>
The non-GC-type branch of subtype_dealloc is using the type of an object after freeing in the same unsafe way as GH-26274 fixes. (I believe the old news entry covers this change well enough.)
https://bugs.python.org/issue44184
(cherry picked from commit 074e7659f2)
Co-authored-by: T. Wouters <thomas@python.org>
(cherry picked from commit d33943a6c3)
Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
(cherry picked from commit 5924243199)
Co-authored-by: Mark Dickinson <mdickinson@enthought.com>
Co-authored-by: Mark Dickinson <mdickinson@enthought.com>
These are passed and called as PyCFunction, however they are defined here without the (ignored) args parameter.
This works fine in some C compilers, but fails in webassembly or anything else that has strict function pointer call type checking.
(cherry picked from commit ab383eb6f0)
Co-authored-by: Joe Marshall <joe.marshall@nottingham.ac.uk>
Co-authored-by: Joe Marshall <joe.marshall@nottingham.ac.uk>