23 commits

Author SHA1 Message Date
martin
3b5329aa20
feat: add support for PEP758 (#1401)
PEP 758 removes the requirement for parentheses around multiple exception
types in `except` and `except*` clauses when `as` is not present.

This PR implements support for parsing these types of statements.
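
As a rough illustration of the syntax in question (not code from this PR), the sketch below asks the running interpreter to compile both spellings. It uses only the standard library, and the helper name `compiles` is made up for this example. The unparenthesized form is accepted only by interpreters that implement PEP 758, so the second result depends on the Python version in use.

```python
# Classic spelling: the exception types are wrapped in parentheses.
PARENTHESIZED = """\
try:
    pass
except (ValueError, TypeError):
    pass
"""

# PEP 758 spelling: no parentheses, and no `as` clause.
UNPARENTHESIZED = """\
try:
    pass
except ValueError, TypeError:
    pass
"""


def compiles(source: str) -> bool:
    """Return True if the running interpreter accepts `source`."""
    try:
        compile(source, "<pep758-sketch>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles(PARENTHESIZED))    # accepted by every Python 3 interpreter
print(compiles(UNPARENTHESIZED))  # result depends on PEP 758 support
```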
2025-09-09 11:16:49 -04:00
martin
48668dfabb
Support parsing of t-strings #1374 (#1398)
#1343
Adds support for parsing t-strings.

A couple of things of note:

1. `TemplatedString*` is largely a copy of `FormattedString*`.
   Since clients operate on libcst objects, I consider this part of the public API. Following the Python grammar (where TStrings are distinct from FStrings) seems like a good way to avoid changes to the API in the future.
2. Within the tokenizer we reuse the fstring machinery.
   I consider this an implementation detail: fstrings and tstrings are (for now) identical, and we can change this later without changes to the public API.
3. Because of 2, we have a new `FTStringType` enum.
   We need to discriminate between f-strings and t-strings to know which token to return. It is a bit clumsy to use in my opinion, so I am looking for feedback on how to improve this.
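
The `FTStringType` discriminator described above can be pictured with a small Python sketch. The real change lives in the Rust tokenizer; the Python enum below, its member names, and the token names `FStringStart` and `TStringStart` are hypothetical stand-ins, chosen only to show how one shared code path might use a discriminator to decide which token to return.

```python
from enum import Enum


class FTStringType(Enum):
    """Hypothetical Python mirror of the discriminator named in the commit."""

    F_STRING = "f"
    T_STRING = "t"


# Made-up token names, one per string kind.
_START_TOKEN = {
    FTStringType.F_STRING: "FStringStart",
    FTStringType.T_STRING: "TStringStart",
}


def classify(prefix: str) -> FTStringType:
    """Map a string prefix such as "f", "rt" or "F" to its kind."""
    lowered = prefix.lower()
    if "f" in lowered:
        return FTStringType.F_STRING
    if "t" in lowered:
        return FTStringType.T_STRING
    raise ValueError(f"not an f-string or t-string prefix: {prefix!r}")


def start_token_name(prefix: str) -> str:
    """Shared machinery, with the enum selecting the token to emit."""
    return _START_TOKEN[classify(prefix)]


print(start_token_name("f"))
print(start_token_name("rt"))
```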
2025-09-09 11:16:20 -04:00
Zsolt Dollenstein
8b97600fb3
fix various Match statement visitation errors (#1161)
Fixes #1160.

This PR also

- fixes `whitespace_before_colon` being swallowed during visitation on `MatchCase`s
- adds a new type of roundtrip test that catches issues of this class: the test applies a noop transformer to exercise the visitation API and compares the result with the original source.
- adds a few more cases to the match fixture
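
The round-trip idea in the second bullet might look roughly like this when written against LibCST's public Python API. This is a sketch, not the test added by the PR: the helper name `roundtrips_through_noop_transformer` is invented here, and the import is guarded so the snippet still loads where LibCST is not installed.

```python
try:
    import libcst as cst
except ImportError:  # LibCST missing: the helper below refuses to run
    cst = None


def roundtrips_through_noop_transformer(source: str) -> bool:
    """Parse, visit with a transformer that changes nothing, and compare."""
    if cst is None:
        raise RuntimeError("libcst is required for this check")
    module = cst.parse_module(source)
    # A bare CSTTransformer makes no edits, but every node still passes
    # through the visitation API, so a swallowed field shows up as a diff.
    rebuilt = module.visit(cst.CSTTransformer())
    return rebuilt.code == source
```

A fixture-driven test could call this helper on each fixture file and fail on the first `False`.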
2024-06-12 17:29:25 +01:00
martin
71b0a1288b
Implement Type Defaults for Type Parameters (PEP 696) (#1141)
Co-authored-by: thereversiblewheel <martin.li@uwaterloo.ca>
2024-05-20 11:26:38 -04:00
Zsolt Dollenstein
c854c986b6
Fix parsing list matchers without explicit brackets (#1097)
```
match a:
  case 1, 2: pass
```

This is parsed correctly by the grammar, but the default values of `MatchList.lbracket` and `MatchList.rbracket` are inconsistent between Python and Rust, causing the above snippet to round-trip (from Python) to:
```
match a:
  case [1, 2]: pass
```

Fixes #1096.
2024-02-02 20:49:25 +00:00
Zsolt Dollenstein
5df1569a40
Parse multiline expressions in f-strings (#1027) 2023-10-02 10:33:29 -07:00
Zsolt Dollenstein
03179b55eb
Parse arbitrarily nested f-strings (#1026) 2023-10-01 20:58:40 +01:00
Micha Reiser
9c263aa897
Support files with mixed newlines (#1007)
* Add test case with mixed newlines

* Split lines by any newline character and not just by default

* Add unit test, remove copied
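
A minimal Python sketch of the splitting rule in the second bullet, assuming the three conventional terminators `\r\n`, `\n` and `\r` (the actual change is in the Rust code and may differ in detail). The function name `split_lines_keepends` is made up for this example.

```python
import re

# One line is either text up to and including a terminator, or a final
# run of text with no terminator. "\r\n" is listed first so that a CRLF
# pair is consumed as a single terminator.
_LINE = re.compile(r"[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+")


def split_lines_keepends(text: str) -> list:
    """Split on any of \\r\\n, \\n or \\r, keeping each terminator."""
    return _LINE.findall(text)


mixed = "a = 1\r\nb = 2\nc = 3\rd = 4"
lines = split_lines_keepends(mixed)
print(lines)

# Nothing is lost: joining the pieces reproduces the input exactly.
assert "".join(lines) == mixed
```

`str.splitlines` is deliberately avoided here because it also breaks on characters such as form feed, which are not line terminators for this purpose.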
2023-09-02 09:56:20 +01:00
Zsolt Dollenstein
9286446f88
PEP 695 - Type Parameter Syntax (#1004)
This PR adds support for parsing and representing Type Parameters and Type Aliases as specified by PEP 695. What's missing are the scope rules, to be implemented in a future PR.

Notable (user visible) changes:

- new `TypeAlias` CST node, which is a `SmallStatement`
- new CST nodes to represent TypeVarLikes: `TypeVar`, `TypeVarTuple`, `ParamSpec`
- new helper CST nodes: `TypeParameters` to serve as a container for multiple TypeVarLikes, and `TypeParam` which is a single item in a `TypeParameters` (owning the separating comma)
- extended `FunctionDef` and `ClassDef` with an optional `type_parameters` field, as well as `whitespace_after_type_parameters` to own the extra whitespace between type parameters and the following token
  - these new fields are added after all others to avoid breaking callers passing in fields as positional arguments
- in `FunctionDef` and `ClassDef`, `whitespace_after_name` now owns the whitespace before the type parameters if they exist
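
To make the whitespace ownership described in the list concrete, here is a toy Python sketch that splits a `def` header with a regular expression. It is not LibCST's parser: the helper `header_parts` is hypothetical, the group names simply reuse the field names from the list, and the pattern does not handle nested brackets inside the type parameters.

```python
import re

# Toy pattern for `def name [T] (...)`; each named group mirrors one field.
_HEADER = re.compile(
    r"def\s+"
    r"(?P<name>\w+)"
    r"(?P<whitespace_after_name>\s*)"
    r"(?P<type_parameters>\[[^\]]*\])"
    r"(?P<whitespace_after_type_parameters>\s*)"
    r"\("
)


def header_parts(line: str) -> dict:
    """Return the pieces of a function header that has type parameters."""
    match = _HEADER.match(line)
    if match is None:
        raise ValueError("not a def header with type parameters")
    return match.groupdict()


parts = header_parts("def first [T]  (xs: list[T]) -> T: ...")
for field, text in parts.items():
    print(f"{field}: {text!r}")
```

In this example the single space before `[T]` lands in `whitespace_after_name`, and the two spaces after `]` land in `whitespace_after_type_parameters`.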
2023-08-28 22:07:22 +01:00
Zsolt Dollenstein
0fb9021218
Don't swallow trailing whitespace (#976) 2023-07-18 10:03:10 +01:00
Zsolt Dollenstein
2acc293347
Fix whitespace, fstring, walrus related parse errors (#939, #938, #937, #936, #935, #934, #933, #932, #931)
* Allow walrus in slices

See https://github.com/python/cpython/pull/23317

Raised in #930.

* Fix parsing of nested f-string specifiers

For an expression like `f"{one:{two:}{three}}"`, `three` is not in an f-string spec, and should be tokenized accordingly.

This PR fixes the `format_spec_count` bookkeeping in the tokenizer, so it properly decrements it when a closing `}` is encountered but only if the `}` closes a format_spec.

Reported in #930.

* Fix tokenizing `0else`

This is an obscure one.

`_ if 0else _` failed to parse with some very weird errors. It turns out that the tokenizer tries to parse `0else` as a single number, but when it encounters `l` it realizes it can't be a single number and it backtracks.

Unfortunately the backtracking logic was broken, and it failed to correctly backtrack one of the offsets used for whitespace parsing (the byte offset since the start of the line). This caused whitespace nodes to refer to incorrect parts of the input text, eventually resulting in the above behavior.

This PR fixes the bookkeeping when the tokenizer backtracks.

Reported in #930.

* Allow no whitespace between lambda keyword and params in certain cases

Python accepts code where `lambda` follows a `*`, so this PR relaxes validation rules for Lambdas.

Raised in #930.

* Allow any expression in comprehensions' evaluated expression


This PR relaxes the accepted types for the `elt` field in `ListComp`, `SetComp`, and `GenExp`, as well as the `key` and `value` fields in `DictComp`.

Fixes #500.

* Allow no space around an ifexp in certain cases

For example in `_ if _ else""if _ else _`.

Raised in #930. Also fixes #854.

* Allow no spaces after `as` in a contextmanager in certain cases

Like in `with foo()as():pass`

Raised in #930.

* Allow no spaces around walrus in certain cases

Like in `[_:=''for _ in _]`

Raised in #930.

* Allow no whitespace after lambda body in certain cases

Like in `[lambda:()for _ in _]`

Reported in #930.
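
For the nested f-string specifier item above, the expression from the commit message can be run directly in Python to see how the pieces combine. The values assigned to `one`, `two` and `three` are assumptions picked so that the result is a valid format spec; this shows the language behaviour the tokenizer has to model, not the tokenizer fix itself.

```python
one = "hi"
two = ""      # formatted with an empty spec, so it contributes nothing
three = ">8"  # ends up as the whole format spec applied to `one`

# The `}` right after `two:` closes that inner (empty) format spec, so
# `three` is an ordinary expression in its own replacement field.
result = f"{one:{two:}{three}}"

# Equivalent to formatting `one` with the assembled spec ">8".
assert result == format(one, ">8")
print(repr(result))
```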
2023-06-07 12:37:16 +01:00
Steven Troxler
b5c34d39a0
Fix Github issue 855 - fail to parse with statement (#861)
* Fix Github issue 855 - fail to parse with statement

When we added support for parenthesized with statements, the
grammar on the with itself was correct (it's a right and left
parenthesis around a comma-separated list of with-items, with
a possible trailing comma).

But inside of the "as" variation of the with_item rule we have a peek at
the next character, which was allowing for a comma or a colon. That peek
needs to also accept right parentheses - otherwise, if the last item
contains an `as` and has no trailing comma we fail to parse.

The bug is exercised by, for example, this code snippet:
```
with (foo, bar as bar,):
    pass
```

The with_wickedness test fixture has been revised to include both
the plain and async variations of this example snippet with and without
trailing comma, and tests pass after the peek rule fix.

* Add more tests covering the plain expression form of `with_item`
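
The peek fix described in the first bullet can be sketched in Python as a one-token lookahead check. The real rule lives in the Rust PEG grammar; the function name and the plain-string token representation here are hypothetical.

```python
# Tokens allowed to follow `expr as target` inside a with statement.
# Per the commit message, "," and ":" were already accepted and ")" is
# the addition made by the fix.
AS_ITEM_FOLLOW = frozenset({",", ":", ")"})


def as_item_may_end_here(next_token: str) -> bool:
    """Lookahead check: may an `as` with-item end before `next_token`?"""
    return next_token in AS_ITEM_FOLLOW


# In `with (foo, bar as bar): pass` the token after `bar as bar` is ")".
print(as_item_may_end_here(")"))
```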
2023-02-16 10:49:05 -08:00
Zsolt Dollenstein
9925117391
Support whitespace after ParamSlash (#713)
* add whitespace_after field to ParamSlash
* codegen
2022-06-26 09:42:37 +01:00
Zsolt Dollenstein
5592f2e00f
Fix parsing of parenthesized empty tuples (#712)
* Don't drop rpars from empty tuples during inflate
2022-06-26 09:41:49 +01:00
Zsolt Dollenstein
153c6d12c0
Only skip supported escaped characters in f-strings (#700) 2022-06-16 09:47:36 +01:00
Zsolt Dollenstein
ebe1851c2b
Add support for PEP-646 (#696) 2022-06-13 09:52:31 -06:00
Zsolt Dollenstein
c91655fbba
fix copyright headers and add a script to check (#635) 2022-02-01 11:13:17 +00:00
Zsolt Dollenstein
8ed3a9cd5c
[native] Box most enums (#632)
* Box most enums

* add big nested expression as fixture
2022-01-28 10:33:33 +00:00
Zsolt Dollenstein
86431eea89
Make sure dedents are emitted for inputs without trailing newlines (#573) 2022-01-04 20:04:21 +00:00
Zsolt Dollenstein
9932a6d339
Implement PEP-634 - Match statement (#568)
* ParenthesizedNode implementation for Box

* match statement rust CST and grammar

* match statement python CST and docs

* run rust unit tests in release mode for now
2021-12-30 10:00:51 +00:00
Zsolt Dollenstein
67db03915d
implement PEP-654: except* (#571) 2021-12-29 21:23:46 +00:00
Zsolt Dollenstein
c44ff0500b
Fix license headers (#560)
* Facebook -> Meta

* remove year from doc copyright
2021-12-28 11:55:18 +00:00
Zsolt Dollenstein
c02de9b718
Implement a Python PEG parser in Rust (#566)
This massive PR implements an alternative Python parser that will allow LibCST to parse Python 3.10's new grammar features. The parser is implemented in Rust, but it is turned off by default and controlled through the `LIBCST_PARSER_TYPE` environment variable: set it to `native` to enable it. The PR also enables new CI steps that test just the Rust parser, as well as steps that produce binary wheels for a variety of CPython versions and platforms.

Note: this PR aims to be roughly feature-equivalent to the main branch, so it doesn't include new 3.10 syntax features. That will be addressed as a follow-up PR.

The new parser is implemented in the `native/` directory, and is organized into two Rust crates: `libcst_derive` contains some macros to facilitate various features of CST nodes, and `libcst` contains the `parser` itself (including the Python grammar), a `tokenizer` implementation by @bgw, and a very basic representation of CST `nodes`. Parsing is done by
1. **tokenizing** the input UTF-8 string (bytes are not supported at the Rust layer; they are converted to UTF-8 strings by the Python wrapper)
2. running the **PEG parser** on the tokenized input, which also captures certain anchor tokens in the resulting syntax tree
3. using the anchor tokens to **inflate** the syntax tree into a proper CST
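
A hedged sketch of opting in to the Rust parser from Python, based only on the description above. The environment variable name and the value `native` come from this commit message; setting the variable before the import (rather than before each parse) is a cautious assumption, and the wrapper name `parse` is made up. The import is guarded so the snippet also loads without LibCST installed.

```python
import os

# Opt in to the Rust-based parser as described in the commit message.
os.environ["LIBCST_PARSER_TYPE"] = "native"

try:
    import libcst as cst
except ImportError:  # LibCST missing: parse() below refuses to run
    cst = None


def parse(source):
    """Parse `source` (str or bytes) into a LibCST module."""
    if cst is None:
        raise RuntimeError("libcst is required to parse")
    return cst.parse_module(source)
```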

Co-authored-by: Benjamin Woodruff <github@benjam.info>
2021-12-21 18:14:39 +00:00