cpython

mirror of https://github.com/python/cpython.git synced 2025-12-23 09:19:18 +00:00

Author	SHA1	Message	Date
Gregory P. Smith	cec1e9dfd7	[3.9] gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96502) * Correctly pre-check for int-to-str conversion (#96537) Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DoS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =) The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact. The justification for the current check is as follows. The C code check is: ```c max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10 ``` In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is: $$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$ From this it follows that $$\frac{M}{3L} < \frac{s-1}{10}$$ hence that $$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$ So $$2^{L(s-1)} > 10^M.$$ But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check. * Issue: gh-95778 Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org> Co-authored-by: Christian Heimes <christian@python.org> Co-authored-by: Mark Dickinson <dickinsm@gmail.com>	2022-09-05 11:21:03 +02:00
Miss Islington (bot)	a657bff349	bpo-46762: Fix an assert failure in f-strings where > or < is the last character if the f-string is missing a trailing right brace. (GH-31365) (cherry picked from commit `ffd9f8ff84`) Co-authored-by: Eric V. Smith <ericvsmith@users.noreply.github.com>	2022-02-16 03:18:16 -08:00
Miss Islington (bot)	c314e3e829	bpo-46503: Prevent an assert from firing when parsing some invalid \N sequences in f-strings. (GH-30865) (30867) * bpo-46503: Prevent an assert from firing. Also fix one nearby tiny PEP-7 nit. * Added blurb. (cherry picked from commit `0daf72194b`) Co-authored-by: Eric V. Smith <ericvsmith@users.noreply.github.com>	2022-01-24 22:08:42 -05:00
Pablo Galindo Salgado	e5cf31d3c2	[3.9] bpo-46110: Add a recursion check to avoid stack overflow in the PEG parser (GH-30177) (#30215) Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>. (cherry picked from commit `e9898bf153`) Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>	2021-12-20 17:18:13 +00:00
Victor Stinner	93a540d74c	bpo-45866: pegen strips directory of "generated from" header (GH-29777) (GH-29792) (GH-29797) "make regen-all" now produces the same output when run from a directory other than the source tree: when building Python out of the source tree. (cherry picked from commit `253b7a0a9f`) (cherry picked from commit `b6defde2af`)	2021-11-26 17:23:41 +01:00
Miss Islington (bot)	00ee14e814	[3.9] bpo-45820: Fix a segfault when the parser fails without reading any input (GH-29580) (GH-29584) Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com> Co-authored-by: Łukasz Langa <lukasz@langa.pl>	2021-11-18 01:24:43 +01:00
Pablo Galindo Salgado	0ef308a289	bpo-45822: Respect PEP 263's coding cookies in the parser even if flags are not provided (GH-29582) (GH-29585) (cherry picked from commit `da20d7401d`)	2021-11-18 00:18:16 +01:00
Pablo Galindo Salgado	142fcb40b6	bpo-45738: Fix computation of error location for invalid continuation characters in the parser (GH-29550) (GH-29552) (cherry picked from commit `25835c518a`)	2021-11-14 01:47:27 +00:00
Łukasz Langa	88f4ec88e2	[3.9] bpo-45494: Fix parser crash when reporting errors involving invalid continuation characters (GH-28993) (#29071) There are two errors that this commit fixes: * The parser was not correctly computing the offset and the string source for E_LINECONT errors due to the incorrect usage of strtok(). * The parser was not correctly unwinding the call stack when a tokenizer exception happened in rules involving optionals ('?', [...]) as we always make them return valid results by using the comma operator. We need to check first if we don't have an error before continuing. (cherry picked from commit `a106343f63`) Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com> NOTE: unlike the cherry-picked original, this commit points at a crazy location due to a bug in the tokenizer that required a big refactor in 3.10 to fix. We are leaving as-is for 3.9.	2021-10-20 18:51:13 +02:00
Serhiy Storchaka	7c722e32bf	[3.9] bpo-45461: Fix IncrementalDecoder and StreamReader in the "unicode-escape" codec (GH-28939) (GH-28945) They support now splitting escape sequences between input chunks. Add the third parameter "final" in codecs.unicode_escape_decode(). It is True by default to match the former behavior. (cherry picked from commit `c96d1546b1`) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2021-10-14 20:03:29 +03:00
Łukasz Langa	4e4d35d332	[3.9] bpo-44947: Refine the syntax error for trailing commas in import statements (GH-27814) (GH-27817) (cherry picked from commit `b2f68b1900`) Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>	2021-08-18 23:03:59 +02:00
Pablo Galindo Salgado	4b86c9c514	[3.9] bpo-44885: Correct the ast locations of f-strings with format specs and repeated expressions (GH-27729) (GH-27744) (cherry picked from commit `8e832fb2a2`) Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>	2021-08-12 18:46:35 +01:00
Pablo Galindo	0d0a9eaa82	[3.9] bpo-44409: Fix error location in tokenizer errors that happen during initialization (GH-26712). (GH-26723) (cherry picked from commit `507ed6fa1d`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2021-06-14 18:07:51 +01:00
Lysandros Nikolaou	3ce35bfbbe	[3.9] bpo-44385: Remove unused grammar rules (GH-26655) (GH-26659) (cherry picked from commit `e7b4644607`)	2021-06-10 15:52:49 -07:00
Pablo Galindo	d4a9264ab8	[3.9] bpo-44168: Fix error message in the parser for keyword arguments for invalid expressions (GH-26210) (GH-26250) (cherry picked from commit `33c0c90dea`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2021-05-19 19:26:59 +01:00
Erlend Egeberg Aasland	76d270ec2b	[3.9] bpo-43779: Fix possible refleak involving _PyArena_AddPyObject (GH-25289). (GH-25294) * [3.9] Fix possible refleak involving _PyArena_AddPyObject (GH-25289). (cherry picked from commit `c0e11a3ceb`) Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no> * Update Parser/pegen/pegen.c Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2021-04-09 18:46:32 +01:00
Miss Islington (bot)	994a519915	bpo-43555: Report the column offset for invalid line continuation character (GH-24939) (#24975) (cherry picked from commit `96eeff5162`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2021-03-22 19:07:05 +00:00
Pablo Galindo	bfc413ce4f	[3.9] bpo-42806: Fix ast locations of f-strings inside parentheses (GH-24067) (GH-24069) (cherry picked from commit `bd2728b1e8`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2021-01-03 01:32:43 +00:00
Lysandros Nikolaou	9a608ac17c	[3.9] bpo-40631: Disallow single parenthesized star target (GH-24027) (GH-24068) (cherry picked from commit `2ea320dddd`) Automerge-Triggered-By: GH:pablogsal	2021-01-02 16:59:39 -08:00
Pablo Galindo	87c87b5bd6	[3.9] bpo-42381: Allow walrus in set literals and set comprehensions (GH-23332) (GH-23333) Currently walruses are not allowed in set literals and set comprehensions: entering `{y := 4, 42, 33}` at the `>>>` prompt fails with `File "<stdin>", line 1` and `SyntaxError: invalid syntax`. They should be allowed as well, per PEP 572. (cherry picked from commit `b0aba1fcdc`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-11-18 23:44:30 +00:00
Miss Islington (bot)	994c68f586	bpo-40998: Address compiler warnings found by ubsan (GH-20929) Signed-off-by: Christian Heimes <christian@python.org> Automerge-Triggered-By: GH:tiran (cherry picked from commit `07f2adedf0`) Co-authored-by: Christian Heimes <christian@python.org>	2020-11-18 08:01:48 -08:00
Lysandros Nikolaou	2b800ef809	bpo-42374: Allow unparenthesized walrus in genexps (GH-23319) (GH-23329) This fixes a regression that was introduced by the new parser. (cherry picked from commit `cb3e5ed071`)	2020-11-17 01:38:58 +02:00
Lysandros Nikolaou	cfcb952e30	[3.9] bpo-42218: Correctly handle errors in left-recursive rules (GH-23065) (GH-23066) Left-recursive rules need to check for errors explicitly, since even if the rule returns NULL, the parsing might continue and lead to long-distance failures. Co-authored-by: Pablo Galindo <Pablogsal@gmail.com> (cherry picked from commit `02cdfc93f8`) Automerge-Triggered-By: GH:lysnikolaou	2020-10-31 12:06:03 -07:00
Pablo Galindo	ddcd57e3ea	[3.9] bpo-42214: Fix check for NOTEQUAL token in the PEG parser for the barry_as_flufl rule (GH-23048) (GH-23051) (cherry picked from commit `06f8c3328d`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-10-31 00:40:42 +00:00
Lysandros Nikolaou	24a7c298d4	[3.9] bpo-42123: Run the parser two times and only enable invalid rules on the second run (GH-22111) (GH-23011) * Implement running the parser a second time for the error messages The first parser run is only responsible for detecting whether there is a `SyntaxError` or not. If there isn't, the AST gets returned. Otherwise, the parser is run a second time with all the `invalid_*` rules enabled so that all the customized error messages get produced. (cherry picked from commit `bca7014032`)	2020-10-28 02:14:15 +02:00
Lysandros Nikolaou	c4b58cea47	[3.9] bpo-41659: Disallow curly brace directly after primary (GH-22996) (#23006) (cherry picked from commit `15acc4eaba`)	2020-10-28 00:38:42 +02:00
Miss Skeleton (bot)	0b290dd217	bpo-42150: Avoid buffer overflow in the new parser (GH-22978) (cherry picked from commit `e68c67805e`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-10-25 16:24:56 -07:00
Batuhan Taskaya	42157b9eaa	[3.9] bpo-41979: Accept star-unpacking on with-item targets (GH-22611) (GH-22612) Co-authored-by: Batuhan Taskaya <batuhanosmantaskaya@gmail.com> Automerge-Triggered-By: @pablogsal	2020-10-09 03:31:07 -07:00
Pablo Galindo	be17295280	[3.9] bpo-41697: Correctly handle KeywordOrStarred when parsing arguments in the parser (GH-22077) (GH-22079) (cherry picked from commit `315a61f7a9`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-09-03 16:35:17 +01:00
Pablo Galindo	8de34cdb95	[3.9] bpo-41690: Use a loop to collect args in the parser instead of recursion (GH-22053) (GH-22067) This program can segfault the parser by stack overflow: ``` import ast code = "f(" + ",".join(['a' for _ in range(100000)]) + ")" print("Ready!") ast.parse(code) ``` The reason is that the rule for arguments has a simple recursion when collecting args: args[expr_ty]: [...] | a=named_expression b=[',' c=args { c }] { [...] } (cherry picked from commit `4a97b1517a`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-09-02 21:30:51 +01:00
Pablo Galindo	bc2c0e9a57	[3.9] Validate the AST produced by the parser in debug mode (GH-21643) (GH-21646) This will improve the debug experience if something fails in the produced AST. Previously, errors in the produced AST could surface much later, for example in the garbage collector or the compiler, making them much more difficult to debug. (cherry picked from commit `1332226b32`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-07-28 00:12:31 +01:00
Miss Islington (bot)	9d8b8c3ed2	Fix trivial typo in the PEG string parser (GH-21508) (cherry picked from commit `0275e0452a`) Co-authored-by: Eric V. Smith <ericvsmith@users.noreply.github.com>	2020-07-16 09:30:19 -07:00
Miss Islington (bot)	961703cdc8	Fix possibly-uninitialized warning in string_parser.c. (GH-21503) GCC says ``` ../cpython/Parser/string_parser.c: In function ‘fstring_find_expr’: ../cpython/Parser/string_parser.c:404:93: warning: ‘cols’ may be used uninitialized in this function [-Wmaybe-uninitialized] 404 | p2->starting_col_offset = p->tok->first_lineno == p->tok->lineno ? t->col_offset + cols : cols; | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ ../cpython/Parser/string_parser.c:384:16: note: ‘cols’ was declared here 384 | int lines, cols; | ^~~~ ../cpython/Parser/string_parser.c:403:45: warning: ‘lines’ may be used uninitialized in this function [-Wmaybe-uninitialized] 403 | p2->starting_lineno = t->lineno + lines - 1; | ~~~~~~~~~~~~~~~~~~^~~ ../cpython/Parser/string_parser.c:384:9: note: ‘lines’ was declared here 384 | int lines, cols; | ^~~~~ ``` and, indeed, if `PyBytes_AsString` somehow fails, lines & cols will not be initialized. (cherry picked from commit `2ad7e9c011`) Co-authored-by: Benjamin Peterson <benjamin@python.org>	2020-07-16 06:25:31 -07:00
Miss Islington (bot)	edeaf61b68	bpo-41215: Make assertion in the new parser more strict (GH-21364) (cherry picked from commit `782f44b8fb`) Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>	2020-07-06 16:35:10 -07:00
Pablo Galindo	54f115dd53	[3.9] bpo-41215: Don't use NULL by default in the PEG parser keyword list (GH-21355) (GH-21356) (cherry picked from commit `39e76c0fb0`) Co-authored-by: Pablo Galindo <pablogsal@gmail.com> Automerge-Triggered-By: @lysnikolaou	2020-07-06 12:29:59 -07:00
Guido van Rossum	2a1ee1d970	[3.9] bpo-35975: Only use cf_feature_version if PyCF_ONLY_AST in cf_flags (#21022)	2020-06-27 17:34:30 -07:00
Pablo Galindo	dab533d0ee	[3.9] bpo-41076: Pre-feed the parser with the f-string expression location (GH-21054) (GH-21190) This commit changes the parsing of f-string expressions with the new parser. The parser gets pre-fed with the location of the expression itself (not the f-string, which was what we were doing before). This allows us to completely skip the shifting of the AST nodes after the parsing is completed. (cherry picked from commit `1f0f4abb11`)	2020-06-28 01:15:28 +01:00
Pablo Galindo	102ca529ef	[3.9] bpo-40769: Allow extra surrounding parentheses for invalid annotated assignment rule (GH-20387) (GH-21186) (cherry picked from commit `c8f29ad986`)	2020-06-28 00:40:41 +01:00
Miss Islington (bot)	cb0dc52d37	bpo-41084: Adjust message when an f-string expression causes a SyntaxError (GH-21084) Prefix the error message with `fstring: `, when parsing an f-string expression throws a `SyntaxError`. (cherry picked from commit `2e0a920e9e`) Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>	2020-06-27 12:43:49 -07:00
Lysandros Nikolaou	5193d0a665	[3.9] bpo-41132: Use pymalloc allocator in the f-string parser (GH-21173) (GH-21183) (cherry picked from commit `6dcbc2422d`) Automerge-Triggered-By: @pablogsal	2020-06-27 11:35:18 -07:00
Lysandros Nikolaou	d01a3e76ee	[3.9] bpo-41119: Output correct error message for list/tuple followed by colon (GH-21160) (GH-21172) (cherry picked from commit `4b85e60601`)	2020-06-27 00:14:12 +01:00
Lysandros Nikolaou	71bb921829	[3.9] bpo-41060: Avoid SEGFAULT when calling GET_INVALID_TARGET in the grammar (GH-21020) (GH-21024) `GET_INVALID_TARGET` might unexpectedly return `NULL`, which if not caught will cause a SEGFAULT. Therefore, this commit introduces a new inline function `RAISE_SYNTAX_ERROR_INVALID_TARGET` that always checks for `GET_INVALID_TARGET` returning NULL and can be used in the grammar, replacing the long C ternary operation used till now. (cherry picked from commit `6c4e0bd974`) Automerge-Triggered-By: @pablogsal	2020-06-20 19:47:22 -07:00
Miss Islington (bot)	c9f83c173b	bpo-40958: Avoid 'possible loss of data' warning on Windows (GH-20970) (cherry picked from commit `861efc6e8f`) Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>	2020-06-20 10:35:03 -07:00
Lysandros Nikolaou	a5442b26f4	[3.9] bpo-40334: Produce better error messages on invalid targets (GH-20106) (GH-20973) * bpo-40334: Produce better error messages on invalid targets (GH-20106) The following error messages get produced: - `cannot delete ...` for invalid `del` targets - `... is an illegal 'for' target` for invalid targets in for statements - `... is an illegal 'with' target` for invalid targets in with statements Additionally, a few `cut`s were added in various places before the invocation of the `invalid_*` rule, in order to speed things up. Co-authored-by: Pablo Galindo <Pablogsal@gmail.com> (cherry picked from commit `01ece63d42`)	2020-06-19 01:03:58 +01:00
Miss Islington (bot)	7795ae8f05	bpo-40958: Avoid buffer overflow in the parser when indexing the current line (GH-20875) (GH-20919) (cherry picked from commit `51c5896b62`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-06-16 18:36:59 +01:00
Pablo Galindo	30b59fd7cf	[3.9] Improve readability and style in parser files (GH-20884) (GH-20885) (cherry picked from commit `fb61c42`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-06-15 15:08:00 +01:00
Pablo Galindo	3782497cc2	[3.9] bpo-40939: Fix test_keyword for the old parser (GH-20814)	2020-06-11 19:29:13 +01:00
Miss Islington (bot)	d55ed7b107	Raise specialised syntax error for invalid lambda parameters (GH-20776) (cherry picked from commit `c6483c9896`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-06-10 06:24:41 -07:00
Miss Islington (bot)	8df4f3942f	bpo-40903: Handle multiple '=' in invalid assignment rules in the PEG parser (GH-20697) Automerge-Triggered-By: @pablogsal (cherry picked from commit `9f495908c5`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-06-08 02:22:06 -07:00
Miss Islington (bot)	6440911736	bpo-40904: Fix segfault in the new parser with f-string containing yield statements with no value (GH-20701) (cherry picked from commit `972ab03276`) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-06-07 18:08:53 -07:00
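
The pre-check justified in commit cec1e9dfd7 above can be spot-checked numerically. The following is a minimal Python sketch, not the C implementation. It assumes 30-bit internal digits (`PyLong_SHIFT = 30`), and the helper names `precheck_trips`, `smallest_magnitude` and `no_false_positives` are hypothetical, introduced only for illustration. It confirms that whenever the inequality holds, even the smallest integer with `size_a` internal digits already exceeds `10**max_str_digits`.

```python
PYLONG_SHIFT = 30  # assumption: 30-bit internal digits


def precheck_trips(max_str_digits: int, size_a: int) -> bool:
    """Mirror the C pre-check with floor division (hypothetical helper).

    C integer division truncates toward zero, so callers should keep
    size_a >= 11, where floor division gives the same result.
    """
    return max_str_digits // (3 * PYLONG_SHIFT) <= (size_a - 11) // 10


def smallest_magnitude(size_a: int) -> int:
    """Smallest |a| whose internal representation uses size_a digits."""
    return 2 ** (PYLONG_SHIFT * (size_a - 1))


def no_false_positives(max_str_digits: int, sizes: range) -> bool:
    """True if every size that trips the pre-check is really over the limit."""
    return all(
        smallest_magnitude(s) > 10 ** max_str_digits
        for s in sizes
        if precheck_trips(max_str_digits, s)
    )
```

For example, `no_false_positives(4300, range(11, 600))` checks every digit count in that range against a sample limit of 4300 decimal digits. Sizes that do not trip the pre-check are left to the exact check that runs afterwards, as the commit message describes.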


107 commits