LibCST/native/libcst/README.md
Zsolt Dollenstein c02de9b718
Implement a Python PEG parser in Rust (#566)
This massive PR implements an alternative Python parser that will allow LibCST to parse Python 3.10's new grammar features. The parser is implemented in Rust, but it's turned off by default through the `LIBCST_PARSER_TYPE` environment variable. Set it to `native` to enable. The PR also enables new CI steps that test just the Rust parser, as well as steps that produce binary wheels for a variety of CPython versions and platforms.

Note: this PR aims to be roughly feature-equivalent to the main branch, so it doesn't include new 3.10 syntax features. That will be addressed as a follow-up PR.

The new parser is implemented in the `native/` directory, and is organized into two Rust crates: `libcst_derive` contains some macros to facilitate various features of CST nodes, and `libcst` contains the `parser` itself (including the Python grammar), a `tokenizer` implementation by @bgw, and a very basic representation of CST `nodes`. Parsing is done by:
1. **tokenizing** the input UTF-8 string (bytes are not supported at the Rust layer; they are converted to UTF-8 strings by the Python wrapper)
2. running the **PEG parser** on the tokenized input, which also captures certain anchor tokens in the resulting syntax tree
3. using the anchor tokens to **inflate** the syntax tree into a proper CST

Co-authored-by: Benjamin Woodruff <github@benjam.info>
2021-12-21 18:14:39 +00:00

libcst_native

A very experimental native extension to speed up LibCST. This does not currently provide much performance benefit and is therefore not recommended for general use.

The extension is written in Rust using PyO3.

This installs as a separate Python package, which LibCST looks for and imports if it is available.
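To confirm from your own code whether the extension can be imported in the current environment, a check along these lines may help. It is only a minimal sketch: it assumes the import name matches the package name `libcst_native`, the helper `native_extension_available` is hypothetical, and it does not reproduce how LibCST itself performs the lookup.

```python
import importlib.util


def native_extension_available(package: str = "libcst_native") -> bool:
    """Return True if the import system can locate the given top-level package.

    The default name is an assumption based on the package name used in this
    README; the module is located but not imported.
    """
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if native_extension_available():
        print("libcst_native can be imported in this environment")
    else:
        print("libcst_native was not found in this environment")
```

Because `find_spec` only locates the module, this check stays cheap and has no side effects even when the extension is installed.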

Using with LibCST

Set up a Rust development environment. Using rustup is recommended, but not required. Rust 1.45.0+ should work.

Follow the instructions for setting up a virtualenv in the top-level README, then:

cd libcst_native
maturin develop  # install libcst_native to the virtualenv
cd ..            # cd back into the main project
python -m unittest

This will run the Python test suite. Nothing special is required to use libcst_native, since libcst will automatically use the native extension when it's installed.

When benchmarking this code, make sure to run maturin develop with the --release flag to enable compiler optimizations.
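For a rough comparison, a small timing helper like the one below can be enough. This is a sketch rather than an established benchmark for this project: `best_of` is a hypothetical helper, and taking the minimum over several runs is just one common way to reduce noise from other processes.

```python
import time
from typing import Callable


def best_of(fn: Callable[[], object], repeat: int = 5) -> float:
    """Return the fastest wall-clock time, in seconds, over `repeat` calls to `fn`."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()  # the workload being measured
        timings.append(time.perf_counter() - start)
    return min(timings)
```

One possible use is to call `best_of(lambda: libcst.parse_module(source))`, where `source` holds the text of a reasonably large Python file, once with libcst_native installed and once after uninstalling it, and then compare the two numbers.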

You can disable the native extension by uninstalling the package from your virtualenv:

pip uninstall libcst_native

Rust Tests

In addition to running the Python test suite, you can run some tests written in Rust with:

cargo test --no-default-features

The --no-default-features flag is needed to work around an incompatibility between tests and pyo3's extension-module feature.

Code Formatting

Use cargo fmt to format your code.

Release

This isn't currently supported, so there are no releases available, but the end goal is to publish this on PyPI.

Because this is a native extension, it must be rebuilt for each platform and architecture. The per-platform builds could be automated using a CI system such as GitHub Actions.