PEP 758 removes the requirement for parentheses around multiple exception types
in `except` and `except*` clauses when `as` is not present.
This PR implements support for parsing these types of statements.
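As a rough illustration of the syntax in question, the sketch below asks the running CPython interpreter (not LibCST) whether it compiles each form. The helper name `accepts` and the sample sources are made up for this example; the unparenthesized form is only expected to compile on Python 3.14 or newer.

```python
import sys

# PEP 758 form: several exception types, no parentheses, no `as`.
UNPARENTHESIZED = "try:\n    pass\nexcept ValueError, TypeError:\n    pass\n"
# With `as`, the parentheses stay mandatory even under PEP 758.
UNPARENTHESIZED_AS = "try:\n    pass\nexcept ValueError, TypeError as exc:\n    pass\n"
PARENTHESIZED_AS = "try:\n    pass\nexcept (ValueError, TypeError) as exc:\n    pass\n"


def accepts(source: str) -> bool:
    """Return True if the running interpreter compiles ``source``."""
    try:
        compile(source, "<pep758>", "exec")
    except SyntaxError:
        return False
    return True


for label, source in [
    ("no parens, no as", UNPARENTHESIZED),
    ("no parens, with as", UNPARENTHESIZED_AS),
    ("parens, with as", PARENTHESIZED_AS),
]:
    print(sys.version_info[:2], label, accepts(source))
```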
#1343
Adds support to parse t-strings
A couple of things of note:
1. `TemplatedString*` is largely a copy of `FormattedString*`.
   Since clients operate on libcst objects, I consider this part of the public API. Following the Python grammar (where TStrings are distinct from FStrings) seems like a good way to avoid API changes in the future.
2. Within the tokenizer we reuse the f-string machinery.
   I consider this an implementation detail: f-strings and t-strings are (for now) identical, and we can change this later without touching the public API.
3. Because of 2, we have a new `FTStringType` enum.
   We need to discriminate between f- and t-strings to know which token to return. It is a bit clumsy to use in my opinion, so I'm looking for feedback on how to improve this.
* add failing test
* fix issue
* fixes an issue with PositionProvider not working with `case` statements
* remove comments
---------
Co-authored-by: steve <steve@patreon.com>
* Keep old exception messages (avoid breaking-changes for users relying on exception messages)
* Move ``get_expected_str`` out of _exceptions.py, where it does not belong, to its own file in _parser/_parsing_check.py
Fixes #1160.
This PR also:
- fixes `whitespace_before_colon` being swallowed during visitation on `MatchCase`s
- adds a new type of roundtrip test that catches issues of this class: the test applies a noop transformer to exercise the visitation API and compares the result with the original source.
- adds a few more cases to the match fixture
Summary:
This PR removes the `typing_extensions` and `typing_inspect` dependencies, since from Python 3.9 onward we can rely on the built-in `typing` module.
Test Plan:
existing tests
```
match a:
    case 1, 2: pass
```
This is parsed correctly by the grammar, but the default values of `MatchList.lbracket` and `MatchList.rbracket` are inconsistent between Python and Rust, causing the above snippet to round-trip (from Python) to:
```
match a:
    case [1, 2]: pass
```
Fixes #1096.
This PR adds support for parsing and representing Type Parameters and Type Aliases as specified by PEP 695. What's missing are the scope rules, to be implemented in a future PR.
Notable (user visible) changes:
- new `TypeAlias` CST node, which is a `SmallStatement`
- new CST nodes to represent TypeVarLikes: `TypeVar`, `TypeVarTuple`, `ParamSpec`
- new helper CST nodes: `TypeParameters` to serve as a container for multiple TypeVarLikes, and `TypeParam` which is a single item in a `TypeParameters` (owning the separating comma)
- extended `FunctionDef` and `ClassDef` with an optional `type_parameters` field, as well as `whitespace_after_type_parameters` to own the extra whitespace between type parameters and the following token
- these new fields are added after all others to avoid breaking callers passing in fields as positional arguments
- in `FunctionDef` and `ClassDef`, `whitespace_after_name` now owns the whitespace before the type parameters if they exist
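For orientation, here is the kind of source the new nodes have to represent. The sketch uses the standard-library `ast` module as a stand-in rather than LibCST itself, so the node names printed are CPython's, not the CST classes listed above. The helper `describe` and the sample source are made up for this example, and it only does real work on Python 3.12 or newer.

```python
import ast
import sys
from typing import List, Optional

# One type alias statement and one generic function using all three
# kinds of type parameter from PEP 695.
SOURCE = (
    "type Pair[T] = tuple[T, T]\n"
    "def first[T, *Ts, **P](x: T) -> T: ...\n"
)


def describe(source: str) -> Optional[List[str]]:
    """Name the alias node and each type parameter node, or None before 3.12."""
    if sys.version_info < (3, 12):
        return None  # PEP 695 syntax is a SyntaxError on older interpreters
    alias, func = ast.parse(source).body
    return [type(alias).__name__] + [type(p).__name__ for p in func.type_params]


print(describe(SOURCE))
```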
* Allow walrus in slices
See https://github.com/python/cpython/pull/23317
Raised in #930.
* Fix parsing of nested f-string specifiers
For an expression like `f"{one:{two:}{three}}"`, `three` is not in an f-string spec, and should be tokenized accordingly.
This PR fixes the `format_spec_count` bookkeeping in the tokenizer, so it properly decrements it when a closing `}` is encountered but only if the `}` closes a format_spec.
Reported in #930.
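To make the nesting concrete, this small sketch evaluates the same f-string at runtime. The values bound to `one`, `two` and `three` are arbitrary choices for the example. `{two:}` carries its own (empty) format spec, and once its `}` closes that spec, `{three}` is an ordinary expression inside the outer spec of `one`.

```python
one = 3.14159
two = "."
three = "2f"

# The spec for `one` is assembled from two nested replacement fields:
# `{two:}` contributes "." and `{three}` contributes "2f", giving ".2f".
result = f"{one:{two:}{three}}"
print(result)  # 3.14
```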
* Fix tokenizing `0else`
This is an obscure one.
`_ if 0else _` failed to parse with some very weird errors. It turns out that the tokenizer tries to parse `0else` as a single number, but when it encounters `l` it realizes it can't be a single number and it backtracks.
Unfortunately the backtracking logic was broken, and it failed to correctly backtrack one of the offsets used for whitespace parsing (the byte offset since the start of the line). This caused whitespace nodes to refer to incorrect parts of the input text, eventually resulting in the above behavior.
This PR fixes the bookkeeping when the tokenizer backtracks.
Reported in #930.
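The class of bug is easier to see in a toy model. The sketch below is not the LibCST tokenizer: `Cursor` and `Scanner` are hypothetical names, and the number grammar is reduced to digits plus an optional exponent. It only shows the general idea of snapshotting every offset before a speculative scan and restoring all of them together on backtrack.

```python
from dataclasses import dataclass, replace


@dataclass
class Cursor:
    """Hypothetical tokenizer position; both fields must move together."""

    char_index: int = 0  # offset in characters from the start of the text
    byte_col: int = 0    # offset in bytes from the start of the line


class Scanner:
    def __init__(self, text: str) -> None:
        self.text = text
        self.cursor = Cursor()

    def _peek(self) -> str:
        i = self.cursor.char_index
        return self.text[i] if i < len(self.text) else ""

    def _advance(self) -> None:
        ch = self.text[self.cursor.char_index]
        self.cursor.char_index += 1
        self.cursor.byte_col += len(ch.encode("utf-8"))

    def scan_number(self) -> str:
        start = self.cursor.char_index
        while self._peek().isdigit():
            self._advance()
        if self._peek() in ("e", "E"):
            saved = replace(self.cursor)  # snapshot *all* offsets
            self._advance()               # speculatively take the "e"
            if self._peek() in ("+", "-"):
                self._advance()
            if self._peek().isdigit():
                while self._peek().isdigit():
                    self._advance()
            else:
                # Not an exponent after all: restore every offset at once.
                self.cursor = saved
        return self.text[start : self.cursor.char_index]


scanner = Scanner("0else")
print(scanner.scan_number(), scanner.cursor)
```

Restoring only `char_index` and forgetting `byte_col` would reproduce the symptom described above: later consumers of the byte offset would point at the wrong part of the line.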
* Allow no whitespace between lambda keyword and params in certain cases
Python accepts code where `lambda` follows a `*`, so this PR relaxes validation rules for Lambdas.
Raised in #930.
* Allow any expression in comprehensions' evaluated expression
This PR relaxes the accepted types for the `elt` field in `ListComp`, `SetComp`, and `GenExp`, as well as the `key` and `value` fields in `DictComp`.
Fixes #500.
* Allow no space around an ifexp in certain cases
For example in `_ if _ else""if _ else _`.
Raised in #930. Also fixes #854.
* Allow no spaces after `as` in a contextmanager in certain cases
Like in `with foo()as():pass`
Raised in #930.
* Allow no spaces around walrus in certain cases
Like in `[_:=''for _ in _]`
Raised in #930.
* Allow no whitespace after lambda body in certain cases
Like in `[lambda:()for _ in _]`
Reported in #930.
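As a sanity check that the whitespace-free snippets quoted in the entries above are accepted by CPython's own parser, the sketch below feeds them to `ast.parse` (Python 3.8 or newer, because of the walrus). The helper `parses` is made up for this example, and the `f(*lambda:0)` snippet is my own illustration of a `lambda` directly after `*`, not taken from the entries. Only parsing is checked: some of these snippets would fail at a later compile stage or at runtime.

```python
import ast

SNIPPETS = [
    '_ if _ else""if _ else _',  # no space around an ifexp
    "with foo()as():pass",       # no space after `as`
    "[_:=''for _ in _]",         # no space around the walrus
    "[lambda:()for _ in _]",     # no space after the lambda body
    "f(*lambda:0)",              # lambda directly after `*` (own illustration)
]


def parses(source: str) -> bool:
    """Return True if CPython's parser accepts ``source``."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


for snippet in SNIPPETS:
    print(repr(snippet), parses(snippet))
```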
* Fix type of evaluated_value on string
This can return bytes if the string is a bytestring, e.g.:
In [1]: import libcst as cst
In [2]: cst.parse_expression('b"foo"').evaluated_value
Out[2]: b'foo'
* Fix type errors from changed signature
* Fixes an issue where ApplyTypeAnnotationsVisitor would crash on code
like `SomeClass.some_attribute = 42` with a "Name is not a valid
identifier" error message.
* Changes the above-mentioned error message to include the bad name in
the message, for easier debugging.
* Adds tests for all valid assignment targets, as described here:
https://libcst.readthedocs.io/en/latest/nodes.html#libcst.BaseAssignTargetExpression.
On the Python side, we can add parentheses from MaybeSentinel.DEFAULT if the whitespace requires it.
On the Rust side, we support the new grammar, but codegen will only add explicitly included parentheses for now. It should be possible to match the Python behavior, but it's not urgent, so I've left a TODO.
It took me some time to track down the root cause of `children`
not matching codegen. Having the error message directly hint that
visit and codegen are probably mismatched (my visit was running
out-of-order) will likely help newbies get going faster.
* ParenthesizedNode implementation for Box
* match statement rust CST and grammar
* match statement python CST and docs
* run rust unit tests in release mode for now
This massive PR implements an alternative Python parser that will allow LibCST to parse Python 3.10's new grammar features. The parser is implemented in Rust, but it's turned off by default through the `LIBCST_PARSER_TYPE` environment variable. Set it to `native` to enable. The PR also enables new CI steps that test just the Rust parser, as well as steps that produce binary wheels for a variety of CPython versions and platforms.
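A minimal sketch of opting in from Python, assuming the variable is read when LibCST selects its parser. Setting it before the import is the conservative choice. The `import libcst` line is left commented out so the snippet has no third-party dependency.

```python
import os

# Opt in to the Rust-based parser described above.
os.environ["LIBCST_PARSER_TYPE"] = "native"

# import libcst as cst  # import only after the variable is set
print(os.environ["LIBCST_PARSER_TYPE"])
```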
Note: this PR aims to be roughly feature-equivalent to the main branch, so it doesn't include new 3.10 syntax features. That will be addressed as a follow-up PR.
The new parser is implemented in the `native/` directory, and is organized into two Rust crates: `libcst_derive` contains some macros to facilitate various features of CST nodes, and `libcst` contains the `parser` itself (including the Python grammar), a `tokenizer` implementation by @bgw, and a very basic representation of CST `nodes`. Parsing is done by:
1. **tokenizing** the input UTF-8 string (bytes are not supported at the Rust layer; they are converted to UTF-8 strings by the Python wrapper)
2. running the **PEG parser** on the tokenized input, which also captures certain anchor tokens in the resulting syntax tree
3. using the anchor tokens to **inflate** the syntax tree into a proper CST
Co-authored-by: Benjamin Woodruff <github@benjam.info>
which ensures we won't have inconsistent black-vs-isort errors
going forward. We can always format by running `ufmt format .`
at the root, and check with `ufmt check .` in our CI actions.
* Add flatten_sentinel
* Add FlattenSentinel to __all__
* Fix lint errors
* Fix type errors
* Update test to use leave_Return
* Update and run codegen
* Add empty test
* Update docs
* autofix
Previous behavior treated it as identical to equal, making a kwarg; it should
instead be a positional arg. Includes several tests to make sure that
whitespace handling is correct.
Fixes #416
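The entry above does not name the construct. Assuming it refers to `:=` used as a call argument, the standard-library `ast` module (used here as a stand-in for LibCST) shows the intended classification: the walrus expression is a positional argument, while `=` produces a keyword argument. The sample call is made up for this example.

```python
import ast

# `x := 1` is a positional argument; `y=2` is a keyword argument.
call = ast.parse("f(x := 1, y=2)", mode="eval").body

positional = [type(arg).__name__ for arg in call.args]
keywords = [kw.arg for kw in call.keywords]
print(positional, keywords)  # ['NamedExpr'] ['y']
```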
* Read install requirements from requirements.txt
* read extras_require from requirements-dev.txt
* add requirements-dev.txt to MANIFEST.in
* apply fixes for new version of Black and Flake8
* don't upgrade Pyre
* re-format
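One way such a setup could look, as a sketch only: the helper `read_requirements` is a hypothetical name, and its filtering rules (skip blanks, comments and pip option lines) are assumptions rather than what the repository's `setup.py` necessarily does.

```python
from pathlib import Path
from typing import List


def read_requirements(path: str) -> List[str]:
    """Return requirement specifiers from a pip-style requirements file."""
    requirements = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and pip options such as "-r other.txt".
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements


# Possible use in setup.py (not executed here):
# setuptools.setup(
#     install_requires=read_requirements("requirements.txt"),
#     extras_require={"dev": read_requirements("requirements-dev.txt")},
# )
```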
## Summary
The pyre stub for the tokenizer module had a syntax error.
Fixing it removes other pyre errors.
## Test Plan
```
pyre check
```
Co-authored-by: Germán Méndez Bravo <kronuz@fb.com>
* fix: improve validation for ImportAlias and Try statements
For `Try` statements we ensure that the bare except, if present, is at the last position.
For ImportAlias we ensure that the imported name is valid.
Fixes #287
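The `Try` rule mirrors what CPython itself enforces. The sketch below checks it with the built-in `compile` rather than with LibCST's validation; the helper `compiles` and the two sample sources are made up for this example.

```python
BARE_EXCEPT_FIRST = (
    "try:\n    pass\n"
    "except:\n    pass\n"
    "except ValueError:\n    pass\n"
)
BARE_EXCEPT_LAST = (
    "try:\n    pass\n"
    "except ValueError:\n    pass\n"
    "except:\n    pass\n"
)


def compiles(source: str) -> bool:
    """Return True if CPython compiles ``source`` without a SyntaxError."""
    try:
        compile(source, "<try-validation>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles(BARE_EXCEPT_FIRST))  # False: the bare except must come last
print(compiles(BARE_EXCEPT_LAST))   # True
```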
* Apply suggestions from code review
Add missing periods.
* Update libcst/_nodes/tests/test_import.py
Co-authored-by: jimmylai <yurinai@gmail.com>