
.. image:: docs/source/_static/logo/horizontal.svg
   :width: 600 px
   :alt: LibCST

A Concrete Syntax Tree (CST) parser and serializer library for Python

|support-ukraine| |readthedocs-badge| |ci-badge| |codecov-badge| |pypi-badge| |pypi-download| |notebook-badge|

.. |support-ukraine| image:: https://img.shields.io/badge/Support-Ukraine-FFD500?style=flat&labelColor=005BBB
   :alt: Support Ukraine - Help Provide Humanitarian Aid to Ukraine.
   :target: https://opensource.fb.com/support-ukraine

.. |readthedocs-badge| image:: https://readthedocs.org/projects/libcst/badge/?version=latest&style=flat
   :target: https://libcst.readthedocs.io/en/latest/
   :alt: Documentation

.. |ci-badge| image:: https://github.com/Instagram/LibCST/actions/workflows/build.yml/badge.svg
   :target: https://github.com/Instagram/LibCST/actions/workflows/build.yml?query=branch%3Amain
   :alt: GitHub Actions

.. |codecov-badge| image:: https://codecov.io/gh/Instagram/LibCST/branch/main/graph/badge.svg
   :target: https://codecov.io/gh/Instagram/LibCST/branch/main
   :alt: CodeCov

.. |pypi-badge| image:: https://img.shields.io/pypi/v/libcst.svg
   :target: https://pypi.org/project/libcst
   :alt: PyPI

.. |pypi-download| image:: https://pepy.tech/badge/libcst/month
   :target: https://pepy.tech/project/libcst/month
   :alt: PyPI Downloads


.. |notebook-badge| image:: https://img.shields.io/badge/notebook-run-579ACA.svg?logo=
   :target: https://mybinder.org/v2/gh/Instagram/LibCST/main?filepath=docs%2Fsource%2Ftutorial.ipynb
   :alt: Notebook

.. intro-start

LibCST parses Python 3.0 through 3.11 source code into a CST that keeps
all formatting details (comments, whitespace, parentheses, etc.). It's useful for
building automated refactoring (codemod) applications and linters.

.. intro-end

.. why-libcst-intro-start

LibCST strikes a compromise between an Abstract Syntax Tree (AST) and a traditional
Concrete Syntax Tree (CST). By carefully reorganizing and naming node types and
fields, we've created a lossless CST that looks and feels like an AST.

.. why-libcst-intro-end

You can learn more about `the value that LibCST provides
<https://libcst.readthedocs.io/en/latest/why_libcst.html>`__ and `our
motivations for the project
<https://libcst.readthedocs.io/en/latest/motivation.html>`__
in `our documentation <https://libcst.readthedocs.io/en/latest/index.html>`__.
Try it out with `notebook examples <https://mybinder.org/v2/gh/Instagram/LibCST/main?filepath=docs%2Fsource%2Ftutorial.ipynb>`__.
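
As a rough illustration of why a lossless tree matters, the sketch below uses
only the standard-library ``ast`` module (not LibCST) to parse and regenerate a
line of code. The sample source string is made up for this example, and
``ast.unparse`` requires Python 3.9+. The comment, the spaces inside the
parentheses, and the parentheses themselves are all lost:

.. code-block:: python

    import ast

    # Hypothetical sample input with unusual spacing and a trailing comment.
    source = "x = ( 1+2 )  # keep me\n"

    # An AST keeps only the program structure, so regenerating source code
    # from it normalizes the formatting and drops the comment.
    regenerated = ast.unparse(ast.parse(source))

    print(regenerated)  # x = 1 + 2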

Example expression::

    1 + 2

CST representation::

    BinaryOperation(
        left=Integer(
            value='1',
            lpar=[],
            rpar=[],
        ),
        operator=Add(
            whitespace_before=SimpleWhitespace(
                value=' ',
            ),
            whitespace_after=SimpleWhitespace(
                value=' ',
            ),
        ),
        right=Integer(
            value='2',
            lpar=[],
            rpar=[],
        ),
        lpar=[],
        rpar=[],
    )
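
Because whitespace and parentheses are stored on the nodes themselves, a parsed
module can be turned back into exactly the text it came from. A minimal sketch
of that round trip, assuming ``libcst`` is installed (the sample source string
is made up for this example):

.. code-block:: python

    import libcst as cst

    # Hypothetical sample input with unusual spacing and a trailing comment.
    source = "x = ( 1+2 )  # keep me\n"

    # Parse the text into a CST, then regenerate code from the tree.
    module = cst.parse_module(source)
    round_tripped = module.code

    # Nothing is lost: the regenerated code is identical to the input.
    assert round_tripped == source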

Getting Started
===============

Examining a sample tree
-----------------------

To examine the tree that is parsed from a particular file, do the following::

    python -m libcst.tool print <some_py_file.py>

Alternatively, you can import LibCST into a Python REPL and use the included parser
and pretty printing functions:

>>> import libcst as cst
>>> from libcst.tool import dump
>>> print(dump(cst.parse_expression("(1 + 2)")))
BinaryOperation(
  left=Integer(
    value='1',
  ),
  operator=Add(),
  right=Integer(
    value='2',
  ),
  lpar=[
    LeftParen(),
  ],
  rpar=[
    RightParen(),
  ],
)

For a more detailed usage example, `see our documentation
<https://libcst.readthedocs.io/en/latest/tutorial.html>`__.
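
To give a feel for how a codemod might look, here is a small sketch that
renames an identifier with a ``CSTTransformer``. The class name ``RenameName``,
the identifiers ``foo`` and ``bar``, and the sample source are all hypothetical
and chosen only for illustration; see the tutorial for the recommended
patterns:

.. code-block:: python

    import libcst as cst


    class RenameName(cst.CSTTransformer):
        """Hypothetical transformer: rename every ``old`` Name node to ``new``."""

        def __init__(self, old: str, new: str) -> None:
            super().__init__()
            self.old = old
            self.new = new

        def leave_Name(
            self, original_node: cst.Name, updated_node: cst.Name
        ) -> cst.BaseExpression:
            # Nodes are immutable, so return a modified copy rather than mutating.
            if updated_node.value == self.old:
                return updated_node.with_changes(value=self.new)
            return updated_node


    source = "total = foo( 1 )  # call foo\n"
    module = cst.parse_module(source)
    new_code = module.visit(RenameName("foo", "bar")).code

    print(new_code, end="")  # total = bar( 1 )  # call foo

Only the ``Name`` node is changed. The odd spacing inside the call and the
comment (which still mentions ``foo``, since comments are not ``Name`` nodes)
are left untouched.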

Installation
------------

LibCST requires Python 3.7+ and can be easily installed using most common Python
packaging tools. We recommend installing the latest stable release from
`PyPI <https://pypi.org/project/libcst/>`_ with pip:

.. code-block:: shell

    pip install libcst

For parsing, LibCST ships with a native extension, so releases are distributed as binary
wheels as well as source code. If a binary wheel is not available for your system
(Linux/Windows x86/x64 and Mac x64/arm are covered), you'll need a recent
`Rust toolchain <https://rustup.rs>`_ to build LibCST from source during installation.

Further Reading
---------------
- `Static Analysis at Scale: An Instagram Story. <https://instagram-engineering.com/static-analysis-at-scale-an-instagram-story-8f498ab71a0c>`_
- `Refactoring Python with LibCST. <https://chairnerd.seatgeek.com/refactoring-python-with-libcst/>`_

Development
-----------

You'll need a recent `Rust toolchain <https://rustup.rs>`_ for development.

We recommend using `hatch <https://hatch.pypa.io/>`_ for running tests, linters,
etc.

Then, start by setting up and building the project:

.. code-block:: shell

    git clone git@github.com:Instagram/LibCST.git libcst
    cd libcst
    hatch env create

To run the project's test suite:

.. code-block:: shell

    hatch run test

You can also run individual tests by using unittest and specifying a module like
this:

.. code-block:: shell

    hatch run python -m unittest libcst.tests.test_batched_visitor

See the `unittest documentation <https://docs.python.org/3/library/unittest.html>`_
for more examples of how to run tests.

We have multiple linters, including copyright checks and
`slotscheck <https://slotscheck.rtfd.io>`_ to check the correctness of class
``__slots__``. To run all of the linters:

.. code-block:: shell

    hatch run lint

We use `ufmt <https://ufmt.omnilib.dev/en/stable/>`_ to format code. To bring
your changes into line with the project's formatting, run the following in the root:

.. code-block:: shell

    hatch run format

Building
~~~~~~~~

To build LibCST, which includes a native parser module, you
will need to have the Rust build tool ``cargo`` on your path. You can
usually install ``cargo`` using your system package manager, but the
most popular way to install ``cargo`` is using
`rustup <https://rustup.rs/>`_.

To build just the native parser, do the following from the ``native``
directory:

.. code-block:: shell

    cargo build

To rebuild the ``libcst.native`` module, from the repo root:

.. code-block:: shell

    hatch env prune && hatch env create

Type Checking
~~~~~~~~~~~~~

We use `Pyre <https://github.com/facebook/pyre-check>`_ for type-checking.

To verify types for the library, do the following in the root:

.. code-block:: shell

    hatch run typecheck

Generating Documents
~~~~~~~~~~~~~~~~~~~~

To generate documents, do the following in the root:

.. code-block:: shell

    hatch run docs

Future
======

- Advanced full-repository fact providers, such as fully qualified names and call graphs.

License
=======

LibCST is `MIT licensed <LICENSE>`_, as found in the LICENSE file.

.. fb-docs-start

Privacy Policy and Terms of Use
===============================

- `Privacy Policy <https://opensource.facebook.com/legal/privacy>`_
- `Terms of Use <https://opensource.facebook.com/legal/terms>`_

.. fb-docs-end

Acknowledgements
================

- Guido van Rossum for creating the parser generator pgen2 (originally used in lib2to3 and forked into parso).
- David Halter for parso, which provides the parser and tokenizer that LibCST sits on top of.
- Zac Hatfield-Dodds for hypothesis integration, which continues to help us find bugs.
- Zach Hammer for improved type annotations for Mypy compatibility.