gh-102153: Start stripping C0 control and space chars in `urlsplit` (GH-102508)
`urllib.parse.urlsplit` has already been respecting the WHATWG spec a bit GH-25595.
This adds more sanitizing to respect the "Remove any leading C0 control or space from input" [rule](https://url.spec.whatwg.org/GH-url-parsing:~:text=Remove%20any%20leading%20and%20trailing%20C0%20control%20or%20space%20from%20input.) in response to [CVE-2023-24329](https://nvd.nist.gov/vuln/detail/CVE-2023-24329).
I simplified the docs by eliding the state of the world explanatory
paragraph in this security release only backport. (people will see
that in the mainline /3/ docs)
(cherry picked from commit 2f630e1ce1)
(cherry picked from commit 610cc0ab1b)
(cherry picked from commit f48a96a280)
Co-authored-by: Illia Volochii <illia.volochii@gmail.com>
Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>
Do not expose the local server's on-disk location from `SimpleHTTPRequestHandler` when generating a directory index. (unnecessary information disclosure)
(cherry picked from commit c7c3a60c88)
Co-authored-by: Ethan Furman <ethan@stoneleaf.us>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Fixes CVE-2023-0286 (High) and a couple of Medium security issues.
https://www.openssl.org/news/secadv/20230207.txt
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Ned Deily <nad@python.org>
(cherry picked from commit 1cf3d78c92)
(cherry picked from commit 88fe8d701a)
Co-authored-by: Jeremy Paige <ucodery@gmail.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
* [3.9] Update copyright years to 2023. (gh-100848).
(cherry picked from commit 11f99323c2)
Co-authored-by: Benjamin Peterson <benjamin@python.org>
* Update additional copyright years to 2023.
Co-authored-by: Ned Deily <nad@python.org>
* gh-100001: Omit control characters in http.server stderr logs. (GH-100002)
Replace control characters in http.server.BaseHTTPRequestHandler.log_message with an escaped \xHH sequence to avoid causing problems for the terminal the output is printed to.
(cherry picked from commit d8ab0a4dfa)
Co-authored-by: Gregory P. Smith <greg@krypto.org>
* also escape \s (backport of PR #100038).
* add versionadded and remove extra 'to'
Co-authored-by: Gregory P. Smith <greg@krypto.org>
There was an unnecessary quadratic loop in idna decoding. This restores
the behavior to linear.
(cherry picked from commit d315722564)
(cherry picked from commit a6f6c3a3d6)
Co-authored-by: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Linux abstract sockets are insecure as they lack any form of filesystem
permissions so their use allows anyone on the system to inject code into
the process.
This removes the default preference for abstract sockets in
multiprocessing introduced in Python 3.9+ via
https://github.com/python/cpython/pull/18866 while fixing
https://github.com/python/cpython/issues/84031.
Explicit use of an abstract socket by a user now generates a
RuntimeWarning. If we choose to keep this warning, it should be
backported to the 3.7 and 3.8 branches.
(cherry picked from commit 49f61068f4)
Co-authored-by: Gregory P. Smith <greg@krypto.org>
This is a port of the applicable part of XKCP's fix [1] for
CVE-2022-37454 and avoids the segmentation fault and the infinite
loop in the test cases published in [2].
[1]: fdc6fef075
[2]: https://mouha.be/sha-3-buffer-overflow/
Regression test added by: Gregory P. Smith [Google LLC] <greg@krypto.org>
(cherry picked from commit 0e4e058602)
Co-authored-by: Theo Buehler <botovq@users.noreply.github.com>
Update libexpat from 2.4.9 to 2.5.0 to address CVE-2022-43680.
Co-authored-by: Shaun Walbridge <shaun.walbridge@gmail.com>
(cherry picked from commit 3e07f827b3)
gh-96710: Make the test timing more lenient for the int/str DoS regression test. (GH-96717)
A regression would still absolutely fail and even a flaky pass isn't
harmful as it'd fail most of the time across our N system test runs.
Windows has a low resolution timer and CI systems are prone to odd
timing so this just gives more leeway to avoid flakiness.
(cherry picked from commit 11e3548fd1)
Co-authored-by: Gregory P. Smith <greg@krypto.org>
gh-68966: Make mailcap refuse to match unsafe filenames/types/params (GH-91993)
(cherry picked from commit b9509ba7a9)
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Revert params note in urllib.parse.urlparse table
(cherry picked from commit eed80458e8)
Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>
The macOS 13 SDK includes support for the `mkfifoat` and `mknodat` system calls.
Using the `dir_fd` option with either `os.mkfifo` or `os.mknod` could result in a
segfault if cpython is built with the macOS 13 SDK but run on an earlier
version of macOS. Prevent this by adding runtime support for detection of
these system calls ("weaklinking") as is done for other newer syscalls on
macOS.
(cherry picked from commit 6d0a0191a4)
Co-authored-by: Ned Deily <nad@python.org>
gh-96848: Fix -X int_max_str_digits option parsing (GH-96988)
Fix command line parsing: reject "-X int_max_str_digits" option with
no value (invalid) when the PYTHONINTMAXSTRDIGITS environment
variable is set to a valid limit.
(cherry picked from commit 41351662bc)
Co-authored-by: Victor Stinner <vstinner@python.org>
When ValueError is raised if an integer is larger than the limit,
mention sys.set_int_max_str_digits() in the error message.
(cherry picked from commit e841ffc915)
Co-authored-by: Ned Deily <nad@python.org>
gh-97005: Update libexpat from 2.4.7 to 2.4.9 (gh-97006)
Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>
(cherry picked from commit 10e3d398c3)
Co-authored-by: Dong-hee Na <donghee.na@python.org>
Co-authored-by: Ned Deily <nad@python.org>
gh-97616: list_resize() checks for integer overflow (GH-97617)
Fix multiplying a list by an integer (list *= int): detect the
integer overflow when the new allocated length is close to the
maximum size. Issue reported by Jordan Limor.
list_resize() now checks for integer overflow before multiplying the
new allocated length by the list item size (sizeof(PyObject*)).
(cherry picked from commit a5f092f3c4)
Co-authored-by: Victor Stinner <vstinner@python.org>
gh-97612: Fix shell injection in get-remote-certificate.py (GH-97613)
Fix a shell code injection vulnerability in the
get-remote-certificate.py example script. The script no longer uses a
shell to run "openssl" commands. Issue reported and initial fix by
Caleb Shortt.
Remove the Windows code path to send "quit" on stdin to the "openssl
s_client" command: use DEVNULL on all platforms instead.
Co-authored-by: Caleb Shortt <caleb@rgauge.com>
(cherry picked from commit 83a0f44ffd)
Co-authored-by: Victor Stinner <vstinner@python.org>
This documents the behavior that has always been the case since timeout
support was introduced in Python 3.3.
(cherry picked from commit b05dd79649)
Co-authored-by: Gregory P. Smith <greg@krypto.org>
* Correctly pre-check for int-to-str conversion (#96537)
Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)
The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.
The justification for the current check. The C code check is:
```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```
In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$
From this it follows that
$$\frac{M}{3L} < \frac{s-1}{10}$$
hence that
$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$
So
$$2^{L(s-1)} > 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>