gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96499)

Integer to and from text conversions via CPython's bignum `int` type are not safe against denial of service attacks due to malicious input. Very large input strings with hundreds of thousands of digits can consume several CPU seconds.
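
As a rough illustration of the behavior from user code, here is a minimal sketch. It assumes an interpreter that already includes this change and therefore exposes `sys.set_int_max_str_digits()` and `sys.get_int_max_str_digits()`. The helper name `parse_digits` is hypothetical and exists only for this example; 4300 is the default defined further down in this diff.

```python
import sys


def parse_digits(text):
    """Return int(text), or None if int() raises ValueError.

    Hypothetical helper for this sketch. ValueError covers both an
    over-limit digit count and an invalid literal.
    """
    try:
        return int(text)
    except ValueError:
        return None


# The limit API is only present on interpreters that include this change.
HAS_LIMIT = hasattr(sys, "set_int_max_str_digits")

within = beyond = hex_value = None
if HAS_LIMIT:
    previous = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(4300)  # the default from this commit
    try:
        within = parse_digits("1" * 4300)  # exactly at the limit: converts
        beyond = parse_digits("1" * 4301)  # one digit over: rejected
        hex_value = int("f" * 5000, 16)    # power-of-two bases are not limited
    finally:
        # Restore whatever limit the process had before the demo.
        sys.set_int_max_str_digits(previous)
```

The limit applies only to bases that are not a power of two, because those are the conversions with superlinear cost; hexadecimal, octal, and binary text convert in linear time and are left alone.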

This PR comes fresh from a pile of work done in our private PSRT security response team repo.

Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews in the private PSRT repo by many others (see the NEWS entry in the PR).

<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->

I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backport PRs already exist. See the issue for links.
Gregory P. Smith 2022-09-02 09:35:08 -07:00 committed by GitHub
parent 656167db81
commit 511ca94520
28 changed files with 803 additions and 20 deletions


@@ -11,6 +11,41 @@ extern "C" {
#include "pycore_global_objects.h" // _PY_NSMALLNEGINTS
#include "pycore_runtime.h" // _PyRuntime
/*
* Default int base conversion size limitation: Denial of Service prevention.
*
* Chosen such that this isn't wildly slow on modern hardware and so that
* everyone's existing deployed numpy test suite passes before
* https://github.com/numpy/numpy/issues/22098 is widely available.
*
* $ python -m timeit -s 's = "1"*4300' 'int(s)'
* 2000 loops, best of 5: 125 usec per loop
* $ python -m timeit -s 's = "1"*4300; v = int(s)' 'str(v)'
* 1000 loops, best of 5: 311 usec per loop
* (zen2 cloud VM)
*
* A 4300 digit decimal number fits in ~14284 bits.
*/
#define _PY_LONG_DEFAULT_MAX_STR_DIGITS 4300
/*
* Threshold for max digits check. For performance reasons int() and
* int.__str__() don't check values that are smaller than this
* threshold. Acts as a guaranteed minimum size limit for bignums that
* applications can expect from CPython.
*
* % python -m timeit -s 's = "1"*640; v = int(s)' 'str(int(s))'
* 20000 loops, best of 5: 12 usec per loop
*
* "640 digits should be enough for anyone." - gps
* A 640 digit decimal number fits in ~2126 bits.
*/
#define _PY_LONG_MAX_STR_DIGITS_THRESHOLD 640
#if ((_PY_LONG_DEFAULT_MAX_STR_DIGITS != 0) && \
(_PY_LONG_DEFAULT_MAX_STR_DIGITS < _PY_LONG_MAX_STR_DIGITS_THRESHOLD))
# error "_PY_LONG_DEFAULT_MAX_STR_DIGITS smaller than threshold."
#endif
/* runtime lifecycle */