"make regen-all" now produces the same output when run from a
directory other than the source tree: when building Python out of the
source tree.
(cherry picked from commit 253b7a0a9f)
(cherry picked from commit b6defde2af)
There are two errors that this commit fixes:
* The parser was not correctly computing the offset and the string
source for E_LINECONT errors due to the incorrect usage of strtok().
* The parser was not correctly unwinding the call stack when a tokenizer
exception happened in rules involving optionals ('?', [...]) as we
always make them return valid results by using the comma operator. We
need to check first if we don't have an error before continuing..
(cherry picked from commit a106343f63)
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
NOTE: unlike the cherry-picked original, this commit points at a crazy location
due to a bug in the tokenizer that required a big refactor in 3.10 to fix.
We are leaving as-is for 3.9.
Currently walruses are not allowerd in set literals and set comprehensions:
>>> {y := 4, 4**2, 3**3}
File "<stdin>", line 1
{y := 4, 4**2, 3**3}
^
SyntaxError: invalid syntax
but they should be allowed as well per PEP 572.
(cherry picked from commit b0aba1fcdc)
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
Left-recursive rules need to check for errors explicitly, since
even if the rule returns NULL, the parsing might continue and lead
to long-distance failures.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
(cherry picked from commit 02cdfc93f8)
Automerge-Triggered-By: GH:lysnikolaou
* Implement running the parser a second time for the errors messages
The first parser run is only responsible for detecting whether
there is a `SyntaxError` or not. If there isn't the AST gets returned.
Otherwise, the parser is run a second time with all the `invalid_*`
rules enabled so that all the customized error messages get produced.
(cherry picked from commit bca7014032)
This program can segfault the parser by stack overflow:
```
import ast
code = "f(" + ",".join(['a' for _ in range(100000)]) + ")"
print("Ready!")
ast.parse(code)
```
the reason is that the rule for arguments has a simple recursion when collecting args:
args[expr_ty]:
[...]
| a=named_expression b=[',' c=args { c }] {
[...] }.
(cherry picked from commit 4a97b1517a)
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
`GET_INVALID_TARGET` might unexpectedly return `NULL`, which if not
caught will cause a SEGFAULT. Therefore, this commit introduces a new
inline function `RAISE_SYNTAX_ERROR_INVALID_TARGET` that always
checks for `GET_INVALID_TARGET` returning NULL and can be used in
the grammar, replacing the long C ternary operation used till now.
(cherry picked from commit 6c4e0bd974)
Automerge-Triggered-By: @pablogsal
* bpo-40334: Produce better error messages on invalid targets (GH-20106)
The following error messages get produced:
- `cannot delete ...` for invalid `del` targets
- `... is an illegal 'for' target` for invalid targets in for
statements
- `... is an illegal 'with' target` for invalid targets in
with statements
Additionally, a few `cut`s were added in various places before the
invocation of the `invalid_*` rule, in order to speed things
up.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
(cherry picked from commit 01ece63d42)
The error message, generated for a non-parenthesized generator expression
in function calls, was still the generic `invalid syntax`, when the generator expression wasn't appearing as the first argument in the call. With this patch, even on input like `f(a, b, c for c in d, e)`, the correct error message gets produced.
(cherry picked from commit ae14583302)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
The following improvements are implemented in this commit:
- `p->error_indicator` is set, in case malloc or realloc fail.
- Avoid memory leaks in the case that realloc fails.
- Call `PyErr_NoMemory()` instead of `PyErr_Format()`, because it requires no memory.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
This commit fixes the new parser to disallow invalid targets in the
following scenarios:
- Augmented assignments must only accept a single target (Name,
Attribute or Subscript), but no tuples or lists.
- `except` clauses should only accept a single `Name` as a target.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
This commit fixes SyntaxError locations when the caret is not displayed,
by doing the following:
- `col_number` always gets set to the location of the offending
node/expr. When no caret is to be displayed, this gets achieved
by setting the object holding the error line to None.
- Introduce a new function `_PyPegen_raise_error_known_location`,
which can be called, when an arbitrary `lineno`/`col_offset`
needs to be passed. This function then gets used in the grammar
(through some new macros and inline functions) so that SyntaxError
locations of the new parser match that of the old.
This is for the C generator:
- Disallow rule and variable names starting with `_`
- Rename most local variable names generated by the parser to start with `_`
Exceptions:
- Renaming `p` to `_p` will be a separate PR
- There are still some names that might clash, e.g.
- anything starting with `Py`
- C reserved words (`if` etc.)
- Macros like `EXTRA` and `CHECK`
When parsing something like `f(g()=2)`, where the name of a default arg
is not a NAME, but an arbitrary expression, a specialised error message
is emitted.
When parsing things like `def f(*): pass` the old parser used to output `SyntaxError: named arguments must follow bare *`, which the new parser wasn't able to do.
`ast.parse` and `compile` support a `feature_version` parameter that
tells the parser to parse the input string, as if it were written in
an older Python version.
The `feature_version` is propagated to the tokenizer, which uses it
to handle the three different stages of support for `async` and
`await`. Additionally, it disallows the following at parser level:
- The '@' operator in < 3.5
- Async functions in < 3.5
- Async comprehensions in < 3.6
- Underscores in numeric literals in < 3.6
- Await expression in < 3.5
- Variable annotations in < 3.6
- Async for-loops in < 3.5
- Async with-statements in < 3.5
- F-strings in < 3.6
Closeswe-like-parsers/cpython#124.