LibCST/native/libcst/tests/fixtures/comments.py
Zsolt Dollenstein c02de9b718
Implement a Python PEG parser in Rust (#566)
This massive PR implements an alternative Python parser that will allow LibCST to parse Python 3.10's new grammar features. The parser is implemented in Rust, but it's turned off by default through the `LIBCST_PARSER_TYPE` environment variable. Set it to `native` to enable. The PR also enables new CI steps that test just the Rust parser, as well as steps that produce binary wheels for a variety of CPython versions and platforms.

Note: this PR aims to be roughly feature-equivalent to the main branch, so it doesn't include new 3.10 syntax features. That will be addressed as a follow-up PR.

The new parser is implemented in the `native/` directory, and is organized into two rust crates: `libcst_derive` contains some macros to facilitate various features of CST nodes, and `libcst` contains the `parser` itself (including the Python grammar), a `tokenizer` implementation by @bgw, and a very basic representation of CST `nodes`. Parsing is done by
1. **tokenizing** the input utf-8 string (bytes are not supported at the Rust layer, they are converted to utf-8 strings by the python wrapper)
2. running the **PEG parser** on the tokenized input, which also captures certain anchor tokens in the resulting syntax tree
3. using the anchor tokens to **inflate** the syntax tree into a proper CST

Co-authored-by: Benjamin Woodruff <github@benjam.info>
2021-12-21 18:14:39 +00:00

101 lines
1.9 KiB
Python

#!/usr/bin/env python3
# fmt: on
# Some license here.
#
# Has many lines. Many, many lines.
# Many, many, many lines.
"""Module docstring.
Possibly also many, many lines.
"""
import os.path
import sys
import a
from b.c.d.e import X # some noqa comment
try:
import fast
except ImportError:
import slow as fast
# Some comment before a function.
y = 1
(
# some strings
y # type: ignore
)
def function(default=None):
"""Docstring comes first.
Possibly many lines.
"""
# FIXME: Some comment about why this function is crap but still in production.
import inner_imports
if inner_imports.are_evil():
# Explains why we have this if.
# In great detail indeed.
x = X()
return x.method1() # type: ignore
# This return is also commented for some reason.
return default
# Explains why we use global state.
GLOBAL_STATE = {"a": a(1), "b": a(2), "c": a(3)}
# Another comment!
# This time two lines.
class Foo:
"""Docstring for class Foo. Example from Sphinx docs."""
#: Doc comment for class attribute Foo.bar.
#: It can have multiple lines.
bar = 1
flox = 1.5 #: Doc comment for Foo.flox. One line only.
baz = 2
"""Docstring for class attribute Foo.baz."""
def __init__(self):
#: Doc comment for instance attribute qux.
self.qux = 3
self.spam = 4
"""Docstring for instance attribute spam."""
#' <h1>This is pweave!</h1>
@fast(really=True)
async def wat():
# This comment, for some reason \
# contains a trailing backslash.
async with X.open_async() as x: # Some more comments
result = await x.method1()
# Comment after ending a block.
if result:
print("A OK", file=sys.stdout)
# Comment between things.
print()
if True: # Hanging comments
# because why not
pass
# Some closing comments.
# Maybe Vim or Emacs directives for formatting.
# Who knows.