81 commits

Author SHA1 Message Date
Miss Islington (bot)
4513e4aba8
gh-95355: Check tokens[0] after allocating memory (GH-95356)
GH-95355

Automerge-Triggered-By: GH:pablogsal
(cherry picked from commit b946f529ef)

Co-authored-by: Honglin Zhu <zhuhonglin.zhl@alibaba-inc.com>
2022-07-28 03:45:01 -07:00
Pablo Galindo Salgado
697e78ca05
[3.10] gh-94360: Fix a tokenizer crash when reading encoded files with syntax errors from stdin (GH-94386) (GH-94574)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>

(cherry picked from commit 36fcde61ba)
2022-07-05 20:14:28 +02:00
Matthieu Dartiailh
94609e3192
[3.10] Backport bpo-47212 (GH-32302) to Python 3.10 (GH-32334)
(cherry picked from commit aa0f056a00)

# Conflicts:
#	Grammar/python.gram
#	Parser/action_helpers.c

Automerge-Triggered-By: GH:pablogsal
2022-04-05 09:21:49 -07:00
Pablo Galindo Salgado
27ee431834
[3.10] bpo-47117: Don't crash if we fail to decode characters when the tokenizer buffers are uninitialized (GH-32129) (GH-32130)
Automerge-Triggered-By: GH:pablogsal.
(cherry picked from commit 26cca8067b)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-03-26 18:26:05 +00:00
Pablo Galindo Salgado
14284b0e71
[3.10] Allow the parser to avoid nested processing of invalid rules (GH-31252). (GH-31257)
(cherry picked from commit 390459de6d)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-02-10 14:38:31 +00:00
Pablo Galindo Salgado
5b58db7529
[3.10] bpo-46521: Fix codeop to use a new partial-input mode of the parser (GH-31010). (GH-31213)
(cherry picked from commit 69e10976b2)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-02-08 12:25:15 +00:00
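A minimal sketch of the behaviour that codeop exposes to REPL-like tools, as referenced by the bpo-46521 commit above. It uses only the public `codeop.compile_command()` function; the `classify` helper is a hypothetical name introduced here for illustration and is not part of the commit.

```python
import codeop


def classify(source):
    """Classify a chunk of interactive input (hypothetical helper).

    codeop.compile_command() returns a code object for complete input,
    None when more input is needed, and raises for invalid input.
    """
    try:
        code = codeop.compile_command(source)
    except (SyntaxError, OverflowError, ValueError):
        return "invalid"
    return "incomplete" if code is None else "complete"


if __name__ == "__main__":
    for chunk in ("x = 1", "if x:", "x = = 1"):
        print(repr(chunk), "->", classify(chunk))
```

An interactive front end would typically keep reading lines while the result is "incomplete" and report the error as soon as it is "invalid".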
Pablo Galindo Salgado
633db1c4eb
[3.10] bpo-46240: Correct the error for unclosed parentheses when the tokenizer is not finished (GH-30378). (GH-30819)
(cherry picked from commit 70f415fb8b)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-01-23 03:10:37 +00:00
Miss Islington (bot)
1fb1f5d8bd
[3.10] bpo-46339: Fix crash in the parser when computing error text for multi-line f-strings (GH-30529) (GH-30542)
* bpo-46339: Fix crash in the parser when computing error text for multi-line f-strings (GH-30529)

Automerge-Triggered-By: GH:pablogsal
(cherry picked from commit cedec19be8)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>

* Fix interactive mode

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-01-20 13:05:10 +00:00
Miss Islington (bot)
19a85501ce
bpo-46237: Fix the line number of tokenizer errors inside f-strings (GH-30463)
(cherry picked from commit 6fa8b2ceee)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2022-01-11 08:33:08 -08:00
Miss Islington (bot)
576e38f9db
bpo-42918: Improve built-in function compile() in mode 'single' (GH-29934) (GH-30040)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
(cherry picked from commit 28179aac79)

Co-authored-by: Weipeng Hong <hongweichen8888@sina.com>
2021-12-27 17:15:44 +01:00
Pablo Galindo Salgado
dc73199a21
[3.10] bpo-46110: Add a recursion check to avoid stack overflow in the PEG parser (GH-30177) (GH-30214)
Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>.
(cherry picked from commit e9898bf153)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-12-20 16:23:37 +00:00
Pablo Galindo Salgado
c72311d917
[3.10] bpo-45727: Only trigger the 'did you forget a comma' error suggestion if inside parentheses. (GH-29767)
Backport of GH-29757

Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2021-11-25 01:01:40 +00:00
Pablo Galindo Salgado
07cf66fd03
[3.10] Ensure the str member of the tokenizer is always initialised (GH-29681). (GH-29683)
(cherry picked from commit 4f006a789a)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-11-21 04:15:22 +00:00
Miss Islington (bot)
a427eb862f
bpo-45494: Fix error location in EOF tokenizer errors (GH-29108)
(cherry picked from commit 79ff0d1687)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-11-20 09:59:34 -08:00
Pablo Galindo Salgado
511ee1c0fa
[3.10] bpo-45727: Make the syntax error for missing comma more consistent (GH-29427) (GH-29647)
(cherry picked from commit 546cefcda7)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-11-20 17:39:17 +00:00
Łukasz Langa
904af3de2b
[3.10] bpo-45848: Allow the parser to get error lines from encoded files (GH-29646) (GH-29661)
(cherry picked from commit fdcc46d955)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-11-20 16:34:56 +01:00
Miss Islington (bot)
b455df59a8
bpo-45820: Fix a segfault when the parser fails without reading any input (GH-29580)
(cherry picked from commit df4ae55e66)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-11-17 15:43:14 -08:00
Pablo Galindo Salgado
e3aa9fd77b
[3.10] bpo-45822: Respect PEP 263's coding cookies in the parser even if flags are not provided (GH-29582) (GH-29586)
(cherry picked from commit da20d7401d)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-11-18 00:17:18 +01:00
Miss Islington (bot)
bf26a6da7a
bpo-45738: Fix computation of error location for invalid continuation characters in the parser (GH-29550)
(cherry picked from commit 25835c518a)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-11-13 17:30:03 -08:00
Łukasz Langa
5c9cab595e
[3.10] bpo-45494: Fix parser crash when reporting errors involving invalid continuation characters (GH-28993) (GH-29070)
There are two errors that this commit fixes:

* The parser was not correctly computing the offset and the string
  source for E_LINECONT errors due to the incorrect usage of strtok().
* The parser was not correctly unwinding the call stack when a tokenizer
  exception happened in rules involving optionals ('?', [...]) as we
  always make them return valid results by using the comma operator. We
  need to check first if we don't have an error before continuing.
(cherry picked from commit a106343f63)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-10-19 22:31:18 +02:00
Pablo Galindo Salgado
4ce55a2353
[3.10] bpo-45408: Don't override previous tokenizer errors in the second parser pass (GH-28812). (GH-28813)
(cherry picked from commit 0219017df7)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-10-08 00:50:10 +01:00
Miss Islington (bot)
9e209d48ca
bpo-43914: Correctly highlight SyntaxError exceptions for invalid generator expression in function calls (GH-28576)
(cherry picked from commit e5f13ce5b4)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-09-27 07:05:20 -07:00
Pablo Galindo Salgado
b977f8510e
[3.10] bpo-34013: Generalize the invalid legacy statement error message (GH-27389). (GH-27391)
(cherry picked from commit 6948964ecf)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-07-27 18:52:32 +01:00
Miss Islington (bot)
11f1a30cdb
bpo-44456: Improve the syntax error when mixing keyword and positional patterns (GH-26793)
(cherry picked from commit 0acc258fe6)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-06-24 08:34:28 -07:00
Miss Islington (bot)
133cddf76e
bpo-44409: Fix error location in tokenizer errors that happen during initialization (GH-26712)
(cherry picked from commit 507ed6fa1d)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-06-14 10:07:52 -07:00
Serhiy Storchaka
c43317d41e
[3.10] Add more const modifiers. (GH-26691). (GH-26692)
(cherry picked from commit be8b631b7a)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2021-06-12 18:44:32 +01:00
Miss Islington (bot)
f807a4fad4
bpo-44368: Ensure we don't raise incorrect custom syntax errors with soft keywords (GH-26630)
(cherry picked from commit 457ce60fc7)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-06-09 14:45:43 -07:00
Miss Islington (bot)
c0496093e5
bpo-44349: Fix edge case when displaying text from files with encoding in syntax errors (GH-26611) (GH-26616)
(cherry picked from commit 9fd21f649d)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-06-09 01:29:21 +01:00
Miss Islington (bot)
2a8d7122e0
bpo-44335: Ensure the tokenizer doesn't go into Python with the error set (GH-26608)
(cherry picked from commit bafe0aade5)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-06-08 12:25:17 -07:00
Miss Islington (bot)
933b5b6359
bpo-44335: Fix a regression when identifying invalid characters in syntax errors (GH-26589)
(cherry picked from commit d334c73b56)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-06-08 04:46:56 -07:00
Pablo Galindo
3283bf4519
[3.10] bpo-44273: Improve syntax error message for assigning to "..." (GH-26477) (GH-26478)
Use "ellipsis" instead of "Ellipsis" in syntax error messages to eliminate confusion with built-in variable Ellipsis.
(cherry picked from commit 39dd141)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2021-06-03 22:22:28 +01:00
Miss Islington (bot)
1fb6b9e91d
bpo-44201: Avoid side effects of "invalid_*" rules in the REPL (GH-26298) (GH-26313)
When the parser does a second pass to check for errors, these rules can
have some small side-effects as they may advance the parser more than
the point reached in the first pass. This can cause the tokenizer to ask
for extra tokens in interactive mode causing the tokenizer to show the
prompt instead of failing instantly.

To avoid this, add a new mode to the tokenizer that is activated in the
second pass and deactivates asking for new tokens when the interactive
line is finished. As the parsing should have reached the last line in
the first pass, the second pass should not need to ask for more tokens.

(cherry picked from commit bd7476dae3)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-05-22 23:23:26 +01:00
Miss Islington (bot)
ae1732d461
bpo-44180: Fix edge cases in invalid assignment rules in the parser (GH-26283)
The invalid assignment rules are very delicate since the parser can
easily raise an invalid assignment error when a keyword argument is
provided. As they are very deep into the grammar tree, it is very
difficult to specify in which contexts these rules can be used and in
which they cannot. For that reason, we need to use a different version
of the rule that doesn't do error checking in those situations where we
don't want the rule to raise (keyword arguments and generator
expressions).

We also need to check if we are in a left-recursive rule, as those can
try to eagerly advance the parser even if the parse will fail at the end
of the expression. Failing to do this allows the parser to start parsing
a call as a tuple and incorrectly identify a keyword argument as an
invalid assignment, before it realizes that it was not a tuple after all.
(cherry picked from commit c878a97968)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-05-21 11:20:43 -07:00
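The distinction described in the bpo-44180 commit above can be observed from Python with the built-in `compile()`. This is only a sketch of the visible behaviour, not of the grammar change itself; `parses` is a hypothetical helper name.

```python
def parses(source):
    """Return True if `source` compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    # A keyword argument looks like an assignment but is valid syntax.
    print(parses("f(x=1)"))
    print(parses("f(a, b=2)"))
    # Assigning to an expression is invalid, inside or outside a call.
    print(parses("x + 1 = 2"))
    print(parses("f(x + 1 = 2)"))
```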
Miss Islington (bot)
07dba474c5
bpo-44180: Report generic syntax errors in the furthest position reached in the first parser pass (GH-26253) (GH-26281)
(cherry picked from commit b51081c1a8)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-05-21 16:29:58 +01:00
Miss Islington (bot)
1afaaf5a2d
bpo-44143: Fix crash in the parser when raising tokenizer errors with an exception set (GH-26144) (GH-26148)
(cherry picked from commit 80b089179f)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-05-15 18:39:18 +01:00
Miss Islington (bot)
756b7b9248
bpo-43822: Prioritize tokenizer errors over custom syntax errors when raising parser exceptions (GH-25866)
(cherry picked from commit 9142088e74)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-05-03 18:06:45 -07:00
Brandt Bucher
dbe60ee09d
bpo-43892: Validate the first term of complex literal value patterns (GH-25735)
2021-04-29 17:19:28 -07:00
Nick Coghlan
1e7b858575
bpo-43892: Make match patterns explicit in the AST (GH-25585)
Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
2021-04-28 22:58:44 -07:00
Pablo Galindo
a77aac4fca
bpo-43914: Highlight invalid ranges in SyntaxErrors (#25525)
To help users understand which part of the code a SyntaxError refers to, we can highlight the whole error range instead of only placing the caret at the first character. In this way:

>>> foo(x, z for z in range(10), t, w)
  File "<stdin>", line 1
    foo(x, z for z in range(10), t, w)
           ^
SyntaxError: Generator expression must be parenthesized

becomes

>>> foo(x, z for z in range(10), t, w)
  File "<stdin>", line 1
    foo(x, z for z in range(10), t, w)
           ^^^^^^^^^^^^^^^^^^^^
SyntaxError: Generator expression must be parenthesized
2021-04-23 14:27:05 +01:00
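The highlighted range from the bpo-43914 commit above is also available programmatically. The sketch below reads the `end_lineno` and `end_offset` attributes that `SyntaxError` gained in Python 3.10; `syntax_error_span` is a hypothetical helper, and `getattr()` is used so that the code still runs on older interpreters, where the end position is reported as `None`.

```python
def syntax_error_span(source):
    """Return (lineno, offset, end_lineno, end_offset) for a syntax error.

    Returns None if `source` compiles without a SyntaxError.
    """
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return (
            exc.lineno,
            exc.offset,
            getattr(exc, "end_lineno", None),   # Python 3.10+
            getattr(exc, "end_offset", None),   # Python 3.10+
        )
    return None


if __name__ == "__main__":
    print(syntax_error_span("foo(x, z for z in range(10), t, w)\n"))
```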
Pablo Galindo
b280248be8
bpo-43822: Improve syntax errors for missing commas (GH-25377)
2021-04-15 21:38:45 +01:00
Pablo Galindo
b86ed8e3bb
bpo-43797: Improve syntax error for invalid comparisons (#25317)
* bpo-43797: Improve syntax error for invalid comparisons

* Update Lib/test/test_fstring.py

Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>

* Apply review comments

* can't -> cannot

Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
2021-04-12 16:59:30 +01:00
Matthew Suozzo
75a06f067b
bpo-43798: Add source location attributes to alias (GH-25324)
* Add source location attributes to alias.
* Move alias star construction to pegen helper.

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-04-10 22:56:28 +02:00
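As an illustration of the bpo-43798 commit above, the location attributes on `alias` nodes can be read through the `ast` module. This is a sketch that assumes Python 3.10 or later for the attributes to be present; on older versions `getattr()` falls back to `None`. The `alias_locations` helper is a hypothetical name.

```python
import ast


def alias_locations(source):
    """List (name, lineno, col_offset) for every imported name in `source`."""
    result = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.alias):
            result.append(
                (
                    node.name,
                    getattr(node, "lineno", None),      # Python 3.10+
                    getattr(node, "col_offset", None),  # Python 3.10+
                )
            )
    return result


if __name__ == "__main__":
    print(alias_locations("import os as operating_system\n"))
```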
Pablo Galindo
d00a449d6d
Simplify _PyPegen_fill_token in pegen.c (GH-25295)
2021-04-09 01:32:25 +01:00
Pablo Galindo
58bafe42ab
Sanitize macros and debug functions in pegen.c (GH-25291)
2021-04-09 01:17:31 +01:00
Pablo Galindo
4f642dae4e
Break down some complex functions in pegen.c for readability (GH-25292)
2021-04-09 00:48:53 +01:00
Erlend Egeberg Aasland
c0e11a3ceb
Fix possible refleak involving _PyArena_AddPyObject (GH-25289)
2021-04-09 00:05:44 +01:00
Victor Stinner
d27f8d2e07
bpo-43244: Rename pycore_ast.h functions to _PyAST_xxx() (GH-25252)
Rename AST functions of pycore_ast.h to use the "_PyAST_" prefix.
Remove macros creating aliases without prefix. For example, Module()
becomes _PyAST_Module(). Update Grammar/python.gram to use
_PyAST_xxx() functions.
2021-04-07 21:34:22 +02:00
Victor Stinner
8370e07e1e
bpo-43244: Remove the pyarena.h header (GH-25007)
Remove the pyarena.h header file with functions:

* PyArena_New()
* PyArena_Free()
* PyArena_Malloc()
* PyArena_AddPyObject()

These functions were undocumented, excluded from the limited C API,
and were only used internally by the compiler.

Add pycore_pyarena.h header. Rename functions:

* PyArena_New() => _PyArena_New()
* PyArena_Free() => _PyArena_Free()
* PyArena_Malloc() => _PyArena_Malloc()
* PyArena_AddPyObject() => _PyArena_AddPyObject()
2021-03-24 02:23:01 +01:00
Victor Stinner
57364ce34e
bpo-43244: Remove parser_interface.h header file (GH-25001)
Remove parser functions using the "struct _mod" type, because the
AST C API was removed:

* PyParser_ASTFromFile()
* PyParser_ASTFromFileObject()
* PyParser_ASTFromFilename()
* PyParser_ASTFromString()
* PyParser_ASTFromStringObject()

These functions were undocumented and excluded from the limited C
API.

Add pycore_parser.h internal header file. Rename functions:

* PyParser_ASTFromFileObject() => _PyParser_ASTFromFile()
* PyParser_ASTFromStringObject() => _PyParser_ASTFromString()

These functions are no longer exported (replace PyAPI_FUNC() with
extern).

Also remove the _PyPegen_run_parser_from_file() function. Update
test_peg_generator to use _PyPegen_run_parser_from_file_pointer()
instead.
2021-03-24 01:29:09 +01:00
Victor Stinner
94faa0724f
bpo-43244: Remove ast.h, asdl.h, Python-ast.h headers (GH-24933)
These functions were undocumented and excluded from the limited C
API.

Most names defined by these header files were not prefixed by "Py"
and so could create name conflicts. For example, Python-ast.h
defined a "Yield" macro which conflicted with the "Yield" name used
by the Windows <winbase.h> header.

Use the Python ast module instead.

* Move Include/asdl.h to Include/internal/pycore_asdl.h.
* Move Include/Python-ast.h to Include/internal/pycore_ast.h.
* Remove ast.h header file.
* pycore_symtable.h no longer includes Python-ast.h.
2021-03-23 20:47:40 +01:00
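The bpo-43244 commit above recommends the Python `ast` module as the replacement for the removed C headers. A minimal sketch of that route follows; it uses only `ast.parse()` and `ast.dump()`, and `top_level_node_types` is a hypothetical helper name.

```python
import ast


def top_level_node_types(source):
    """Return the AST node class names of the top-level statements."""
    return [type(node).__name__ for node in ast.parse(source).body]


if __name__ == "__main__":
    source = "x = 1 + 2\nprint(x)\n"
    print(top_level_node_types(source))
    # ast.dump() gives a textual view of the whole tree.
    print(ast.dump(ast.parse(source)))
```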