PEP 758 removes the requirement for parentheses to surround exceptions
in except and except* expressions when 'as' is not present.
This PR implements support for parsing these types of statements.
#1343
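For example, the following now parses (a sketch; `connect` and `retry` are placeholder names):
```
try:
    connect()
except TimeoutError, ConnectionError:  # no 'as' clause, so no parentheses required (Python 3.14+)
    retry()
```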
Adds support to parse t-strings
A couple things of note:
TemplatedString* is largely a copy of FormattedString*
Since clients operate on libcst objects, I consider this part of a public API - following the Python grammar (where TStrings are distinct from FStrings) seems like a good way to avoid changes to the API in the future.
Within the tokenizer we reuse the fstring machinery
I consider this an implementation detail: fstrings and tstrings are (for now) identical, and we can change this later without changes to the public API.
Because of the previous point, we have a new FTStringType enum.
We need to discriminate between f- and t-strings to know which token to return. This is a bit clumsy to use in my opinion, so I'm looking for feedback here on how to improve this.
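For reference, a minimal example of the input these nodes represent (PEP 750 template strings, Python 3.14+):
```
name = "world"
greeting = t"Hello {name}"  # parses to TemplatedString* nodes, not FormattedString*
```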
* generate Attribute nodes when applying type annotations
The old version generated an incorrect CST which
happened to work as long as you didn't do further processing.
* add a test
* add failing test
* fix issue
* fixes an issue with PositionProvider not working with case statements
* remove comments
---------
Co-authored-by: steve <steve@patreon.com>
* use dependency-groups in pyproject.toml
* replace `hatch run foo` with `uv run poe foo`
* install uv @ 0.7.12 in CI and disable caching
* use `uv run --group docs` for the `docs` command
* DRY docs between CONTRIBUTING and README
* tell pyre to ignore `.venv`
* set up uv to rebuild on rust, pyproject.toml, git changes
* Keep old exception messages (avoid breaking changes for users relying on exception messages)
* Move ``get_expected_str`` out of _exceptions.py, where it does not belong, to its own file in _parser/_parsing_check.py
From 3.14 onwards, we'll get `foo | bar` instead of `typing.Union[foo, bar]` as the annotation for union types (including optional). This PR prepares the codegen script for this.
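A small illustration of why both spellings can be handled uniformly - `typing.get_args` treats them the same (standard `typing` behavior, not LibCST-specific):
```
import typing

old_style = typing.Union[int, str]
new_style = int | str  # what 3.14 will report for union annotations

assert typing.get_args(old_style) == typing.get_args(new_style) == (int, str)
```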
This PR:
1. marks the `libcst.native` module as free-threading-compatible
2. replaces the use of ProcessPoolExecutor with ThreadPoolExecutor if free-threaded CPython is detected at runtime
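A minimal sketch of the executor selection in (2), assuming the `Py_GIL_DISABLED` build flag is the detection signal (the actual check may differ):
```
import concurrent.futures
import sysconfig

def executor_class() -> type:
    # On free-threaded builds, threads run in parallel, so a thread pool
    # avoids process-spawn and serialization overhead.
    if sysconfig.get_config_var("Py_GIL_DISABLED"):
        return concurrent.futures.ThreadPoolExecutor
    return concurrent.futures.ProcessPoolExecutor
```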
Instead of sharing instances of a Codemod across many files, this PR allows passing in a Codemod class to `parallel_exec_transform_with_prettyprint` which will then instantiate the Codemod for each file. `tool._codemod_impl` now starts using this API.
The old behavior is deprecated, because sharing codemod instances across files is a surprising behavior, and causes hard-to-diagnose bugs when a Codemod keeps track of its state via instance variables.
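A self-contained sketch (hypothetical `CountingCodemod`, not the real API) of why per-file instantiation matters:
```
class CountingCodemod:
    """Tracks state in an instance variable, like many real codemods do."""

    def __init__(self) -> None:
        self.files_seen = 0

    def transform(self, source: str) -> str:
        self.files_seen += 1  # a shared instance silently carries this across files
        return source

# Deprecated pattern: one shared instance processes every file.
shared = CountingCodemod()
for src in ("a = 1", "b = 2"):
    shared.transform(src)
assert shared.files_seen == 2  # state leaked between files

# New pattern: a fresh instance per file, so each run starts clean.
for src in ("a = 1", "b = 2"):
    assert CountingCodemod().transform(src) == src
```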
Instead of relying on `multiprocessing.Pool`, this PR replaces the implementation of `parallel_exec_transform_with_prettyprint` with `concurrent.futures.ProcessPoolExecutor`
* Update Cargo.lock and Cargo.toml for PyO3 0.23 support
* Replace deprecated _bound methods with their new undeprecated names
* Update TryIntoPy trait to use IntoPyObject
* Update ParserError wrapper to use IntoPyObject
* replace unwrap with early return
When renaming `a.b` -> `c.d`, in imports like `import a.b as x` the as_name wasn't correctly removed even though references to `x` were renamed to `c.d`.
This PR makes the codemod remove the `x` asname in these cases.
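A before/after sketch of the intended behavior (`a.b`, `c.d`, and `x` as in the example above):
```
# Before, renaming a.b -> c.d:
import a.b as x

x.something()

# After: references are renamed to c.d and the now-pointless alias is dropped:
import c.d

c.d.something()
```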
This fixes current CI failures by skipping Musl builds for `i686`,
`ppc64le`, `s390x`, and `armv7le` architectures.
The failures are due to the Rust ecosystem having only partial support or no
toolchains for these architectures. For the list of supported
archs and tiers of support, see:
https://doc.rust-lang.org/nightly/rustc/platform-support.html
The architectures skipped here have, from the Rust PoV, either:
- Tier-2 support without host tools.
- Tier-3 support without host tools.
* Add is_property check
Skip properties to prevent exceptions
* Delayed initialization of matchers
To support multiprocessing on Windows/macOS
Issue #1181
* Add a test for matcher decorators with multiprocessing
Per the Cargo Book, `license-file` is only to be used if a package uses
a non-standard license; see https://doc.rust-lang.org/cargo/reference/manifest.html#the-license-and-license-file-fields
Declare the licenses directly, and verify that the LICENSE file
containing the license breakdown is still included
```
…n LibCST/native/libcst_derive on cargo-fixes [!] is 📦 v1.4.0 via 🦀 v1.77.1
⬢ [fedora:40] ❯ cargo package --list --allow-dirty | grep LICENSE
LICENSE
…n LibCST/native/libcst_derive on cargo-fixes [!] is 📦 v1.4.0 via 🦀 v1.77.1
⬢ [fedora:40] ❯ cd ../libcst
michel in LibCST/native/libcst on cargo-fixes [!] is 📦 v1.4.0 via 🦀 v1.77.1
⬢ [fedora:40] ❯ cargo package --list --allow-dirty | grep LICENSE
LICENSE
src/tokenizer/core/LICENSE
```
Signed-off-by: Michel Lind <salimma@fedoraproject.org>
* Clean warnings for each file in codemod CLI
* Fix ZeroDivisionError: float division by zero
When codemodding too fast
* Recreate CodemodContext for each file
Keep only context.metadata_manager
Remove wrapper from context defaults on each file
Fixes #1160.
This PR also
- fixes `whitespace_before_colon` being swallowed during visitation on `MatchCase`s
- adds a new type of roundtrip test that catches issues of this class: the test applies a noop transformer to exercise the visitation API and compares the result with the original source.
- adds a few more cases to the match fixture
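A minimal sketch of that roundtrip test idea:
```
import libcst as cst

class _Noop(cst.CSTTransformer):
    """Changes nothing; merely exercises the visitation API."""

def roundtrip(source: str) -> str:
    return cst.parse_module(source).visit(_Noop()).code

source = "match a:\n    case 1, 2: pass\n"
assert roundtrip(source) == source  # swallowed whitespace would break this
```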
#453 fixed scratch leaking between files by setting it to empty, but that drops all the scratch space that was set up before the codemod runs (e.g. in the transformer's constructor)
This PR improves the fix by preserving the initial scratch.
* Make the node-fields filtering process - from libcst.tool - public, so that other libraries may provide their own custom representation of LibCST graphs.
* Create functions to access & filter CST-node fields (with appropriate docstrings & tests), in libcst.helpers
* Add new CST-node fields functions to helpers documentation.
Summary:
This PR removes the `typing_extensions` and `typing_inspect` dependencies as we can now rely on the built-in `typing` module since Python 3.9.
Test Plan:
existing tests
```
match a:
case 1, 2: pass
```
This is parsed correctly by the grammar, but the default values of `MatchList.lbracket` and `MatchList.rbracket` are inconsistent between Python and Rust, causing the above snippet to round-trip (from Python) to:
```
match a:
case [1, 2]: pass
```
Fixes #1096.
Readthedocs builds are currently failing because the libcst wheel fails
to build, hitting an error when trying to get rust dependencies:
```
running build_rust
Updating crates.io index
error: failed to select a version for the requirement `regex = "=1.9.3"`
candidate versions found which didn't match: 1.8.4, 1.8.3, 1.8.2, ...
location searched: crates.io index
required by package `libcst v1.1.0 (/home/docs/checkouts/readthedocs.org/user_builds/libcst/checkouts/latest/native/libcst)`
error: `cargo metadata --manifest-path native/libcst/Cargo.toml --format-version 1` failed with code 101
```
Presumably this is related to the current configuration requesting Rust v1.55
rather than the v1.70 that is currently offered.
When we switched to the Rust compiler by default, mybinder stopped working, as reported in https://github.com/Instagram/LibCST/issues/1054. This is because the Binder Docker image does not have a Rust compiler or tools; this change installs them via the apt.txt file.
Co-authored-by: Alvaro Leiva Geisse <aleivag@meta.com>
* Update test_fix_pyre_directives.py
refactor to use f-strings for string formatting, making the code more Pythonic
* Update test_fix_pyre_directives.py
refactor with chained constant-value assignment to make the code more Pythonic
* Update pyproject.toml for Python 3.12 support
add 3.12 classifier and update description to correctly reflect supported Python versions
* Update pyproject.toml
make the stated parsable versions range consistent with the README
* AddImportsVisitor will now only add at the top of module
- Also added new tests to cover these cases
* Fixed an issue with from imports
* Added a couple tests for AddImportsVisitor
* Refactoring of GatherImportsVisitor
* Refactors, typos and typing changes
This PR introduces `Scope._next_visible_parent` which deduplicates much of the logic between `_contains_in_self_or_parent`, `_find_assignment_target_parent`, and `_getitem_from_self_or_parent`.
This will be helpful when implementing scope resolution for the future `AnnotationScope`.
There should be no functionality change.
Removing regexes from the whitespace parser allows ditching thread-local storage and its lazy initialization cost.
This shows a modest 2% improvement in overall parse time (inflate is improved by 10%).
This PR adds support for parsing and representing Type Parameters and Type Aliases as specified by PEP 695. What's missing are the scope rules, to be implemented in a future PR.
Notable (user visible) changes:
- new `TypeAlias` CST node, which is a `SmallStatement`
- new CST nodes to represent TypeVarLikes: `TypeVar`, `TypeVarTuple`, `ParamSpec`
- new helper CST nodes: `TypeParameters` to serve as a container for multiple TypeVarLikes, and `TypeParam` which is a single item in a `TypeParameters` (owning the separating comma)
- extended `FunctionDef` and `ClassDef` with an optional `type_parameters` field, as well as `whitespace_after_type_parameters` to own the extra whitespace between type parameters and the following token
- these new fields are added after all others to avoid breaking callers passing in fields as positional arguments
- in `FunctionDef` and `ClassDef`, `whitespace_after_name` now owns the whitespace before the type parameters if they exist
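For reference, the PEP 695 constructs these nodes cover:
```
type Pair[T] = tuple[T, T]  # TypeAlias with TypeParameters

def first[T](items: list[T]) -> T:  # FunctionDef.type_parameters
    return items[0]

class Stack[T, *Ts, **P]:  # TypeVar, TypeVarTuple, and ParamSpec params
    pass
```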
When the input doesn't have a trailing newline, but the last line had
exactly the amount of bytes as the current indentation level, the
tokenizer didn't emit a fake newline, causing parse errors (the grammar
expects newlines to conform with the Python spec).
I don't see any reason for fake newlines to be omitted in these cases,
so this PR removes that condition from the tokenizer.
Reported in #930.
* Allow walrus in slices
See https://github.com/python/cpython/pull/23317
Raised in #930.
* Fix parsing of nested f-string specifiers
For an expression like `f"{one:{two:}{three}}"`, `three` is not in an f-string spec, and should be tokenized accordingly.
This PR fixes the `format_spec_count` bookkeeping in the tokenizer, so it properly decrements it when a closing `}` is encountered but only if the `}` closes a format_spec.
Reported in #930.
* Fix tokenizing `0else`
This is an obscure one.
`_ if 0else _` failed to parse with some very weird errors. It turns out that the tokenizer tries to parse `0else` as a single number, but when it encounters `l` it realizes it can't be a single number and it backtracks.
Unfortunately the backtracking logic was broken, and it failed to correctly backtrack one of the offsets used for whitespace parsing (the byte offset since the start of the line). This caused whitespace nodes to refer to incorrect parts of the input text, eventually resulting in the above behavior.
This PR fixes the bookkeeping when the tokenizer backtracks.
Reported in #930.
* Allow no whitespace between lambda keyword and params in certain cases
Python accepts code where `lambda` follows a `*`, so this PR relaxes validation rules for Lambdas.
Raised in #930.
* Allow any expression in comprehensions' evaluated expression
This PR relaxes the accepted types for the `elt` field in `ListComp`, `SetComp`, and `GenExp`, as well as the `key` and `value` fields in `DictComp`.
Fixes #500.
* Allow no space around an ifexp in certain cases
For example in `_ if _ else""if _ else _`.
Raised in #930. Also fixes #854.
* Allow no spaces after `as` in a contextmanager in certain cases
Like in `with foo()as():pass`
Raised in #930.
* Allow no spaces around walrus in certain cases
Like in `[_:=''for _ in _]`
Raised in #930.
* Allow no whitespace after lambda body in certain cases
Like in `[lambda:()for _ in _]`
Reported in #930.
* Fix type of evaluated_value on string
This can return bytes if the string is a bytestring, e.g.:
```
In [1]: import libcst as cst
In [2]: cst.parse_expression('b"foo"').evaluated_value
Out[2]: b'foo'
```
* Fix type errors from changed signature
* Fixes an issue where ApplyTypeAnnotationsVisitor would crash on code
like `SomeClass.some_attribute = 42` with a "Name is not a valid
identifier" error message.
* Changes the above-mentioned error message to include the bad name in
the message, for easier debugging.
* Adds tests for all valid assignment targets, as described here:
https://libcst.readthedocs.io/en/latest/nodes.html#libcst.BaseAssignTargetExpression.
* Simplify command specifier parsing
* Allow running codemods without configuring in YAML
This enables codemodding things by just plonking a CodemodCommand class
into any old importable module and running
`python -m libcst.tool codemod -x some_module.SomeClass ...`
Moves PEP 621 metadata from `setup.py` and `requirements*.txt` into the
`[project]` table of `pyproject.toml`. This enables using hatch as a
task runner for the project, where previously one would need to remember
a bunch of different commands, or repeatedly consult the readme's
developer guide to find all of the relevant commands.
This creates the following hatch commands:
- docs
- fixtures
- format
- lint
- test
- typecheck
It also updates all of the github actions workflows to use the
appropriate hatch commands, and the readme's developer guide, so that
there is only one source of truth for what constitutes running tests.
The "test" workflows now drop the matrix distinction between "pure" or
"native", and run tests in both modes from a single build.
ghstack-source-id: 8834da7825
Pull Request resolved: https://github.com/Instagram/LibCST/pull/893
Caches file path information on the root `Module` node.
Resolves paths when caching, so they are always absolute paths.
Adds a new `chdir` helper to change working directory and automatically
revert to previous directory, which makes testing file paths with the
`"."` repo root easier.
ghstack-source-id: 3413905fc1
Pull Request resolved: https://github.com/Instagram/LibCST/pull/892
pypi_upload has been broken since #810, because `actions/checkout` defaults to a shallow checkout that only checks out the revision triggering the workflow. This causes setuptools-scm to miss the most recent tag, causing the version to be detected as `0.1`.
With the latest setup-python actions, there is a better caching
mechanism available that also requires less setup, and provides better
fallback behavior that should help avoid the random CI failures that
have been happening on 3.11 for setuptools-rust. This ensures that we
install the necessary dependencies before attempting to build the
package or run tests, while still enabling speedups in best case
scenario when requirements files haven't changed.
See the upstream readme for details:
https://github.com/actions/setup-python#caching-packages-dependencies
This allows FullyQualifiedNameProvider to work with absolute paths,
rather than assuming all paths given will be relative to the current
directory. This enables tools like Fixit to provide a root path, and
have the FullyQualifiedNameProvider correctly scope the final results
relative to that root path.
This does require that the root path and the given file paths
match each other as relative or absolute, due to the
`calculate_module_and_package` helper comparing file paths relative
to the root path, but this seems like a reasonable tradeoff, and
unlikely to cause a problem in normal use cases.
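A sketch of the constraint, assuming `calculate_module_and_package(root, filename)` returns the module and package names derived from the file's location under the root:
```
from libcst.helpers import calculate_module_and_package

# Both paths absolute: the module name is scoped relative to the root.
info = calculate_module_and_package("/repo", "/repo/pkg/mod.py")
print(info.name, info.package)  # expected: "pkg.mod" "pkg"

# Mixing absolute and relative paths would fail, since the comparison
# is purely path-based - hence the "both or neither" requirement above.
```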
Upgrading Pyre requires updating test fixtures with any upstream changes
to Pyre's query results for the `simple_class.py` fixture.
This adds a new `scripts/` directory to the repo, with a script to
regenerate test fixtures. The script regenerates the cache data fixture,
and updates the `TypeInferenceProvider` tests to use `assertDictEqual`
and helpful error messages for better behavior in future mismatches.
This also includes a slight bump to Pyre 0.9.10 to fix install issues on
Apple Silicon M1 Macs, and regenerated fixtures using the script above.
* Fix Github issue 855 - failure to parse a with statement
When we added support for parenthesized with statements, the
grammar for the with itself was correct (it's a left and right
parenthesis around a comma-separated list of with-items, with
a possible trailing comma).
But inside of the "as" variation of the with_item rule we have a peek at
the next character, which was allowing for a comma or a colon. That peek
needs to also accept right parentheses - otherwise, if the last item
contains an `as` and has no trailing comma we fail to parse.
The bug is exercised by, for example, this code snippet:
```
with (foo, bar as bar,):
pass
```
The with_wickedness test fixture has been revised to include both
the plain and async variations of this example snippet with and without
trailing comma, and tests pass after the peek rule fix.
* Add more tests covering the plain expression form of `with_item`
* [ci] narrow python 3.11 version window
Also, quote the versions for consistency.
Signed-off-by: Vincent Fazio <vfazio@gmail.com>
* [ci] bump cibuildwheel to 2.11.2
Newer versions support building 3.11 wheels automatically, so just take
the latest currently available.
Signed-off-by: Vincent Fazio <vfazio@gmail.com>
* Raise black's output file's target version to 3.7, which is the lowest supported Python version that libcst can be run on
* Add to, instead of overriding, the exclusion rules of black
* Fix the bug that files in `stubs/libcst_native/` are inadvertently ignored by black
This is because black's file exclusion mechanism is a file-system-unaware, pure-string-based pattern match. We need to prepend "^/" to specify that we are referring to the root-level "native/" folder. Yeah, I know this looks strange, but blame black for it :) . See https://black.readthedocs.io/en/stable/usage_and_configuration/the_basics.html#configuration-format for further reference.
* It's conventional to use single-quoted literal strings for regular expressions in TOML, because that way no escaping is performed
* When codemodding, have the black formatter use the same target Python version we use
* Fix the `test_codemod_formatter_error_input` unit test
* Remove an unused import in `test_codemod_cli` module
* Enumeration members are singletons. Copying them would be a no-op
* Avoid generating an unnecessary `pass` statement
* Several trivial refactors
* Avoid building unnecessary intermediate lists, which are a slight waste of time and space
* Remove an unused import, an oversight from commit 8e6bf9e9
* `collections.abc.Mapping.get()` defaults to returning `None` when the key doesn't exist
* Just use unittest's `assertRaises` to specify expected exception types, instead of catching every possible `Exception`, which could suppress legitimate errors and hide bugs
* We know for sure that the body of `CSTTypedTransformerFunctions` won't be empty, so don't bother handling that case just for formal completeness
* Raise an informative exception when metadata is unresolved in a metadata-based match, instead of silently hiding potential errors
* Fix unit test of `findall`
* Add a unit test to cover the case where a resolved metadata provider doesn't provide metadata for all nodes
* Document the behavior of metadata-based match when the metadata provider is unresolved
There are no available binary wheels for lxml for Windows & Python 3.11 yet:
https://bugs.launchpad.net/lxml/+bug/1977998
Until that's resolved, let's skip tests in this configuration.
The render error originates from how we violate the syntax rules of the `field list` markup element of reStructuredText. The [specification of field lists](https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#field-lists) states that a multi-line `field body` must be indented relative to the `field marker`.
* Implement lazy loading mechanism for expensive metadata providers
* Add support for lazy values in metadata matchers
* Fix type issues and implement lazy value support in base metadata provider too
* Add unit tests for BaseMetadataProvider
Co-authored-by: Zsolt Dollenstein <zsol.zsol@gmail.com>
* Cache the scope name prefix to prevent scope traversal in a tight loop
* Adding pyre-fixme. This attribute clearly has a type in the base class.
* Clarify why we do join(filter(None,...
Correct the structure of the renamed `from` import when renaming the last imported name from a module.
Given
```
from a.b import qux
print(qux)
```
and providing old_name="a.b.qux" and new_name="a:b.qux", I expect the
following output (as described in the command description):
```
from a import b
print(b.qux)
```
But what I get is:
```
from a import b.qux
print(b.qux)
```
It pulls the old name up into the new one.
The provided test is the important part but I've attempted a fix too. I
suspect there is a better one, and that the special-casing of the "this
is the last name" situation shouldn't be needed. For instance, there is
import-removal code in leave_Module, and renaming the first of many
names (as opposed to the last) happily adds a correct import line.
I didn't manage to grok the code and all the concepts it requires to
provide a better fix though.
This leaves the alias adjustments to the existing code and just does the
module renaming in the special-casing block.
I don't know why scheduling removal of the updated node is required, it makes
the tests pass though.
* Qualify imported symbols when the dequalified form would cause a conflict.
Adds a preliminary pass that scans the stub file for all imported
symbols, and collects the ones that cannot be safely dequalified.
Fixes #673
* review fixes
* handle symbol conflicts between the stub and the main file
* fix type errors
* Always compute a module and package name
* Update name_provider to correctly support __main__ (also updated the tests to use data_provider)
* Update name_provider to correctly handle relative imports and package name
* Update relative module resolution to work on package names
* Use full_package_name in libcst.codemod.visitors.GatherImportsVisitor
* Use full_package_name in libcst.codemod.visitors.RemovedNodeVisitor
* Use full_package_name in libcst.codemod.visitors.AddImportsVisitor
* Fix failing test
* Fix typo in variable name
* PR feedback
* Force rebuild
* Support module and package names in the codemod context
* PR feedback
* Reorganize module name and relative name logic to libcst.helpers.module
* Force rebuild
## Summary
Our mission at Meta Open Source is to empower communities through open source, and we believe that it means building a welcoming and safe environment for all. As a part of this work, we are adding this banner in support for Ukraine during this crisis.
* Port c3b44cb9d3
* Port 138c97cb70
* Test harness for the next commit
* Port 2cdc4ba237
* Test harness for next commit
* Port 71c5da8169
* Remove no-longer-used import
The test wrongly assumed that `first_assignment.references`
is an ordered collection, while it is actually a `set`.
Fixes: https://github.com/Instagram/LibCST/issues/442
Signed-off-by: Stanislav Levin <slev@altlinux.org>
* Remove unneeded block
* Improve function name, add docstring
* Rename _is_set -> _is_non_sentinel
* Add docstring for FunctionKey
* Add class attributes with doc blocks to TypeCollector
* Extract Annotations into a single abstraction, not two
* Nits + fix flake8
* Add ApplyTrailingCommas codemod
This codemod adds trailing commas to parameter and argument
lists when there are sufficient arguments or parameters.
The idea is this:
- both black and yapf will generally split lines when there
are trailing commas at the end of a parameter / arguments list
- It's easier on my eye to have names and types in more predictable
locations within a function header, i.e. left-aligned. And in
function calls, I also find it easier to compare arguments to
function parameters whenever the arguments are one per line, at
least when there are more than two arguments.
By default, we ensure trailing commas for functions with one or more
parameters (but do not include `self` or `cls` method arguments) which
is suitable for `black`, and calls with 3 or more arguments.
Both the parameter count and the argument count can be overridden.
Moreover, by passing `--formatter yapf` someone can use the
yapf-suitable default of 2 parameters which is handy since then the
user doesn't have to memorize black vs yapf settings; this is necessary
because yapf does not split lines after a trailing comma in one-argument
defines.
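A small illustration of the intended effect (the line splitting itself is done by the formatter afterwards):
```
# Before the codemod:
def pay(user: str, amount: int) -> None: ...

# After the codemod adds the trailing comma and black reformats:
def pay(
    user: str,
    amount: int,
) -> None: ...
```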
```
> python -m unittest libcst.codemod.commands.tests.test_add_trailing_commas
......
----------------------------------------------------------------------
Ran 6 tests in 0.134s
OK
```
* Run ufmt, fix type error
* Bump argument counts down to 2
I find it difficult sometimes to read method and function signatures
when there are multiple arguments with type annotations - left-aligning
the arguments makes it much easier for me to skim and see, using mostly
my automatic visual reasoning:
- the argument names
- the argument types
- the return type
Without this, I feel like I'm trying to run a parser in my head, which
is not as fast and distracts me from code-skimming.
This change was generated using the new AddTrailingCommas codemod
(which I'll put in a separate PR) via the command
```
python -m libcst.tool codemod add_trailing_commas.AddTrailingCommas ./libcst/codemod/visitors/_apply_type_annotations.py
```
Wait for CI - this is pure formatting, it should be very safe
* Add support for methods with func type comment excluding self/cls
PEP 484 doesn't really specify carefully how function type
comments should work on methods, but since usually the type of
`self` / `cls` is automatic, most use cases choose to only annotate
the other arguments.
As a result, this commit modifies our codemod so that non-static
methods can specify either all the arguments, or all but one of
them. We'll correctly zip together the inline and func-type-comment
types either way, typically getting no type for `cls` or `self`.
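A minimal illustration of the two accepted forms:
```
class Point:
    # Omitting the type of `self` (the common convention):
    def move(self, dx, dy):
        # type: (int, int) -> None
        pass

    # Spelling out every parameter, including `self`, also works:
    def scale(self, factor):
        # type: (Point, float) -> None
        pass
```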
We accomplish this by using matchers to trigger the visit
method for FunctionDef rather than using visit_FunctionDef, which gives
us enough context to determine when a function def is a regular
function versus a method (plus also matching the decorators against
`@staticmethod`, so that we trigger the normal function logic in
that case).
Co-authored-by: Zsolt Dollenstein <zsol.zsol@gmail.com>
* Handle syntax errors in the ast parse function.
If we encounter a syntax error in either the type comment extraction
or the type comment parsing stages, ignore type information on that
cst node.
* Quote the FunctionType type, which does not exist in Python 3.6
I've tested all of the edge cases I know of: type comments in various
locations, non-type-comments, arity mismatches where we should skip,
etc.
Assuming that all type comments parse, this should work as far as I
know. I'll make a separate PR to deal with SyntaxErrors when parsing
types, because that is cross-cutting and not specific to FunctionDef.
* Add full support for type comment -> PEP 526 conversion
Summary:
In the previous PR, I added basic support for converting an
Assign with a type comment to an AnnAssign, as long as there was
only one target.
This PR handles all fully PEP 484 compliant cases:
- multiple assignments
- multiple elements in the LHS l-value
We cannot handle arity errors because there's no way to do it. And
we don't try to handle the ambiguous case of multiple assignments with
mismatched arities (PEP 484 isn't super clear on which LHS is supposed
to pick up the type, we are conservative here). The ambiguous case is
probably very uncommon in real code anyway, multiple assignment is not
a widely used feature.
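A before/after sketch of the conversions described above:
```
# Before: PEP 484 assignment type comments
x = []  # type: list[int]
a, b = 1, "s"  # type: (int, str)

# After: PEP 526 annotated assignments; a multi-element target gets
# separate type declarations, since a tuple target can't be annotated inline:
x: list[int] = []
a: int
b: str
a, b = 1, "s"
```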
Test Plan:
There are new test cases covering:
- multiple elements in the LHS
- multiple assignment
- both of the above together
- semicolon expansion, which is handled differently in the cases
where we have to add type declarations
- new error cases:
- mismatched arity in both directions on one assignment
- mismatched arity in multiple assignment
```
> python -m unittest libcst.codemod.commands.tests.test_convert_type_comments
.....
----------------------------------------------------------------------
Ran 5 tests in 0.150s
OK
```
* Tracks TypeVars that are used in type annotations in the pyi file, and
adds their Assign statements to the merged file.
* Adds Generic[T] as a base class if needed.
* Codemod for PEP 484 Assign w/ type comments -> PEP 526 AnnAssign
Summary:
This codemod is intended to eventually handle all type comments from
PEP 484. This is a partial implementation specifically handling
assignment type comments, which as of PEP 526 are better dealt
with using AnnAssign nodes.
There is more work to do because there are two other kinds of
comments to support: function heading comments and function parameter
inline comments. But the PEP 526 functionality is complete so I feel
like it's worth having a PR / CI signals / code review at this stage.
Test Plan:
```
python -m unittest libcst.codemod.commands.tests.test_convert_type_comments
```
* Disable on python 3.6, 3.7
The ast module didn't get the `type_comment` information we need
until python 3.8.
It is possible but not a priority right now to enable 3.6 and 3.7
via the typed_ast library; for now I just throw a NotImplementedError
with a nice description. There's a note in the code about where to look
for a typed_ast example in case anyone wants to add support in the
future.
* Fix type errors on the 3.8+ testing fix
* Do a better job of complaining on Python < 3.8
* Updates based on code review
Summary:
Do not strip type comments in the visitor pattern; instead,
reach down from the parent to do it, because this makes it
much less likely that we'll accidentally remove
other comments in a codemod (using visitor state to do this
isn't really feasible once we handle complex statements like
FunctionDef, With, For).
Handle multi-statement lines; this works since the
trailing whitespace can only apply to the final statement on
the line. It's not really a critical edge case to handle, but
the code is no more complicated so we might as well.
* Prevent comment stripping for multi-assign
* Note in the docstring that this is a limited WIP
* Reorder checks so the next step will be cleaner
* Use precise signature matching when inserting function type annotations
* add type annotations
* Add an argument for strict annotation matching.
* don't use Any
* Support relative imports in AddImportsVisitor.
* Adds an Import dataclass to represent a single imported object
* Refactors AddImportsVisitor to pass around Import objects
* Separates out the main logic in get_absolute_module_for_import so that
it can be used to resolve relative module names outside of a cst.Import
node
* Resolves relative module names in AddImportsVisitor if we have a
current module name set.
Fixes #578
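A sketch of the resolution this enables, assuming the helper takes the current module name and an ImportFrom node:
```
import libcst as cst
from libcst.helpers import get_absolute_module_for_import

stmt = cst.parse_statement("from ..sibling import name")
import_from = stmt.body[0]
assert isinstance(import_from, cst.ImportFrom)

# Within module "pkg.sub.mod", "..sibling" resolves to "pkg.sibling".
print(get_absolute_module_for_import("pkg.sub.mod", import_from))
```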
On the python side, we can add parentheses from MaybeSentinel.DEFAULT if the whitespace requires it.
On the rust side, we support the new grammar but codegen will only add explicitly included parentheses for now - it should be possible to match python behavior but it's not urgent so I've left a TODO
It took me some time to track down the root cause of `children`
not matching codegen; having the error message directly hint that
visit and codegen are probably mismatched (my visit was running
out-of-order) will likely help newbies get going faster.
* ParenthesizedNode implementation for Box
* match statement rust CST and grammar
* match statement python CST and docs
* run rust unit tests in release mode for now
This massive PR implements an alternative Python parser that will allow LibCST to parse Python 3.10's new grammar features. The parser is implemented in Rust, but it's turned off by default through the `LIBCST_PARSER_TYPE` environment variable. Set it to `native` to enable. The PR also enables new CI steps that test just the Rust parser, as well as steps that produce binary wheels for a variety of CPython versions and platforms.
Note: this PR aims to be roughly feature-equivalent to the main branch, so it doesn't include new 3.10 syntax features. That will be addressed as a follow-up PR.
The new parser is implemented in the `native/` directory, and is organized into two rust crates: `libcst_derive` contains some macros to facilitate various features of CST nodes, and `libcst` contains the `parser` itself (including the Python grammar), a `tokenizer` implementation by @bgw, and a very basic representation of CST `nodes`. Parsing is done by
1. **tokenizing** the input utf-8 string (bytes are not supported at the Rust layer, they are converted to utf-8 strings by the python wrapper)
2. running the **PEG parser** on the tokenized input, which also captures certain anchor tokens in the resulting syntax tree
3. using the anchor tokens to **inflate** the syntax tree into a proper CST
Co-authored-by: Benjamin Woodruff <github@benjam.info>
* Add ImportAssignment class and record it from Scope
* Add overrides for LocalScope and ClassScope
* Clean scope_provider code and use the ImportAssignment class in the `unused_imports` codemod
* Add missing types
* Fix fixit errors
* Fix ScopeProvider when string type annotation is unparsable
* Handle nested function calls w/in type declarations
* Edit stack in place
* Add unparsed test to test_cast
* Swallow parsing errors in string annotations.
This is the same behavior as CPython.
I've also rewritten the test that was relying on this exception to check where type parsing was happening
* Fix pyre error
Based on diff review of https://github.com/Instagram/LibCST/pull/536,
I investigated relative import handling and realized that with minor
changes we can now handle them correctly.
Relative imports aren't likely in code coming from an automated
tool, but they could happen in hand-written stubs if anyone tries
to use this codemod tool to merge stubs with code.
Added a new test:
```
> python -m unittest libcst.codemod.visitors.tests.test_apply_type_annotations
.............................................
----------------------------------------------------------------------
Ran 45 tests in 2.195s
OK
```
Make sure that the MetadataWrapper resolves the requested providers against the existing metadata results, and prevent a single provider from being invoked multiple times.
Note: I'm pushing this because it works, but I actually want to
add annotation counting first and then modify the code so that we
only add the import if an annotation was actually included.
In ApplyTypeAnnotationsVisitor, there are edge cases where we
might have changed the module imports even though we never wound
up applying any type annotations.
This will become even more common if we support adding
`from __future__ import annotations`, which I would like to do
soon.
To handle this, we can simply return the original tree from
`transform_module_impl` (discarding any changes from either
`self` or `AddImportsVisitor`) whenever there are no changes
in `self.annotation_counts`.
I updated the no-annotations-changed test to reflect this:
```
> python -m unittest libcst.codemod.visitors.tests.test_apply_type_annotations.TestApplyAnnotationsVisitor
...............................................
----------------------------------------------------------------------
Ran 47 tests in 2.312s
OK
```
The existing TypeCollector visitor logic attempted to
fold actual imports from stubs together with the module
we were annotating, and separately do nice things with the
names of types so that we could parse stubs written either
with various sorts of proper imports *or* stubs written
using bare fully-qualified type names (which isn't
actually legal python, but is easy to produce from automated
tools like `pyre infer`).
In this commit I simplify things in principle - meaning the
data flow is simpler, although the code is still similarly
complex - by using `QualifiedNameProvider` plus a fallback
to `get_full_name_for_node` to handle all cases via
fully-qualified names, so that the way a stub chooses to
lay out its imports is no longer relevant to how we will
understand it.
As a result, we can scrap a whole test suite where we
were understanding edge cases in the import handling, and
moreover one of the weird unsupported edge cases is now
well supported.
The tests got simpler because some edge cases no longer
matter (the whole imports test is no longer relevant),
and a couple of weird edge cases were fixed.
I ran tests with
```
python -m unittest libcst.codemod.visitors.tests.test_apply_type_annotations.TestApplyAnnotationsVisitor
```
I tried to make this change minimal in that I preserve the
existing data flow, so that it's easy to review. But it's worth
considering whether to follow up with a diff where we change
the TypeAnnotationCollector into a *transform* rather than a
*visitor*, because that would allow us to scrap quite a bit
of logic - all we would need to know is a couple of bits
of context from higher up in the tree and we could process
Names and Attributes without needing all this recursion.
Refactor ApplyTypeAnnotationsVisitor so that all annotation
information is added via a smart constructor method starting
with `_apply_annotation_to`. This makes it much easier to
skim the code and understand where annotations are actually
added with a simple forward search.
Then, add an AnnotationCounts dataclass and count up all the
annotations we add inside the transform. This should be helpful
for a few reasons:
- First, it just makes counting the annotations easier. Prior
to this change, we would have to run some separate command
to count annotations before and after a codemod, which is
not as convenient as doing it directly, and would also fail
to account for cases where we changed an annotation.
- Second, I want to be able to avoid altering the import
statements in cases where we never actually made any changes.
Having annotation counts will help us do this - we can just
return the original tree (without import changes) in that
situation.
```
> python -m unittest libcst.codemod.visitors.tests.test_apply_type_annotations.TestApplyAnnotationsVisitor
................................................
----------------------------------------------------------------------
Ran 48 tests in 1.773s
OK
```
(I'm not really sure how the method got there, but it was calling
itself recursively... fortunately, it was also overwritten by an
identically named method, so it was actually impossible to access.)
The existing two tests didn't make it clear what exactly we wanted
to verify, which is two things:
- that we can successfully annotate async functions with decorators
- that it doesn't matter whether or not the async and decorator
information is part of the stubs - we need it to be permissible
because a "real" stubs file would have this, but stubs generated
by tools like pyre infer shouldn't need to care; they only
really need to care about types
All of our tests follow one of two patterns: either populate
a context and transform using the default behavior, or test
when setting flags in either the context population and transform
steps (and verify that the behavior is the same in both cases).
So, extract these two patterns into helper functions. This improves
readability of the existing code a bit, and will be even more helpful
if we split apart the monster test `test_annotate_functions` (which
I would like to do soon - the list of test cases is so big that it's
hard to jump to the relevant section when trying to verify behaviors).
This ensures we won't have inconsistent black-vs-isort errors
going forward. We can always format by running `ufmt format .`
at the root, and check with `ufmt check .` in our CI actions.
* Use setuptools-scm to derive the current version from git metadata
* Add Github Action equivalent to the current circleci tasks
* Run pyre integration test in GH action / tox
1. Add a new entry to `CHANGELOG.md` (I normally use the [new release page](https://github.com/Instagram/LibCST/releases/new) to generate a changelog, then manually group)
   - Follow the existing format: `Fixed`, `Added`, `Updated`, `Deprecated`, `Removed`, `New Contributors` sections, and the full changelog link at the bottom.
   - Mention only user-visible changes - improvements to CI, tests, or development workflow aren't noteworthy enough.
   - Version bumps are generally not worth mentioning, with some notable exceptions (like pyo3).
   - Group related PRs into one bullet point if it makes sense.
2. manually bump versions in `Cargo.toml` files in the repo
3. run `cargo update -p libcst`
4. make a new PR with the above changes, get it reviewed and landed
5. make a new release on Github, create a new tag on publish, and copy the contents of the changelog entry in there
6. after publishing, check out the repo at the new tag, and run `cd native; cargo +nightly publish -Z package-workspace -p libcst_derive -p libcst`
"To find all unused imports, we iterate through :attr:`~libcst.metadata.Scope.assignments` and an assignment is unused when its :attr:`~libcst.metadata.BaseAssignment.references` is empty. To find all undefined references, we iterate through :attr:`~libcst.metadata.Scope.accesses` (we focus on :class:`~libcst.Import`/:class:`~libcst.ImportFrom` assignments) and an access is undefined reference when its :attr:`~libcst.metadata.Access.referents` is empty. When reporting the warning to developer, we'll want to report the line number and column offset along with the suggestion to make it more clear. We can get position information from :class:`~libcst.metadata.PositionProvider` and print the warnings as follows.\n"
"To find all unused imports, we iterate through :attr:`~libcst.metadata.Scope.assignments` and an assignment is unused when its :attr:`~libcst.metadata.BaseAssignment.references` is empty. To find all undefined references, we iterate through :attr:`~libcst.metadata.Scope.accesses` (we focus on :class:`~libcst.Import`/:class:`~libcst.ImportFrom` assignments) and an access is undefined reference when its :attr:`~libcst.metadata.Access.referents` is empty. When reporting the warning to the developer, we'll want to report the line number and column offset along with the suggestion to make it more clear. We can get position information from :class:`~libcst.metadata.PositionProvider` and print the warnings as follows.\n"
]
},
{
@ -136,13 +136,13 @@
"Automatically Remove Unused Import\n",
"==================================\n",
"Unused import is a commmon code suggestion provided by lint tool like `flake8 F401 <https://lintlyci.github.io/Flake8Rules/rules/F401.html>`_ ``imported but unused``.\n",
"Even though reporting unused import is already useful, with LibCST we can provide automatic fix to remove unused import. That can make the suggestion more actionable and save developer's time.\n",
"Even though reporting unused imports is already useful, with LibCST we can provide an automatic fix to remove unused imports. That can make the suggestion more actionable and save developer's time.\n",
"\n",
"An import statement may import multiple names, we want to remove those unused names from the import statement. If all the names in the import statement are not used, we remove the entire import.\n",
"To remove the unused name, we implement ``RemoveUnusedImportTransformer`` by subclassing :class:`~libcst.CSTTransformer`. We overwrite ``leave_Import`` and ``leave_ImportFrom`` to modify the import statements.\n",
"When we find the import node in lookup table, we iterate through all ``names`` and keep used names in ``names_to_keep``.\n",
"When we find the import node in the lookup table, we iterate through all ``names`` and keep used names in ``names_to_keep``.\n",
"If ``names_to_keep`` is empty, all names are unused and we remove the entire import node.\n",
"Otherwise, we update the import node and just removing partial names."
"Otherwise, we update the import node and just remove partial names."
]
},
{
@ -195,7 +195,7 @@
"raw_mimetype": "text/restructuredtext"
},
"source": [
"After the transform, we use ``.code`` to generate fixed code and all unused names are fixed as expected! The difflib is used to show only changed part and only import lines are updated as expected."
"After the transform, we use ``.code`` to generate the fixed code and all unused names are fixed as expected! The difflib is used to show only the changed part and only imported lines are updated as expected."
"LibCST provides helpers to parse source code string as concrete syntax tree. In order to perform static analysis to identify patterns in the tree or modify the tree programmatically, we can use visitor pattern to traverse the tree. In this tutorial, we demonstrate a common three-step-workflow to build an automated refactoring (codemod) application:\n",
"LibCST provides helpers to parse source code string as a concrete syntax tree. In order to perform static analysis to identify patterns in the tree or modify the tree programmatically, we can use the visitor pattern to traverse the tree. In this tutorial, we demonstrate a common four-step-workflow to build an automated refactoring (codemod) application:\n",
"\n",
"1. `Parse Source Code <#Parse-Source-Code>`_\n",
"2. `Build Visitor or Transformer <#Build-Visitor-or-Transformer>`_\n",
"LibCST provides various helpers to parse source code as concrete syntax tree: :func:`~libcst.parse_module`, :func:`~libcst.parse_expression` and :func:`~libcst.parse_statement` (see :doc:`Parsing <parser>` for more detail). The default :class:`~libcst.CSTNode` repr provides pretty print formatting for reading the tree easily."
"LibCST provides various helpers to parse source code as a concrete syntax tree: :func:`~libcst.parse_module`, :func:`~libcst.parse_expression` and :func:`~libcst.parse_statement` (see :doc:`Parsing <parser>` for more detail)."
]
},
{
@ -41,7 +42,42 @@
"source": [
"import libcst as cst\n",
"\n",
"cst.parse_expression(\"1 + 2\")"
"source_tree = cst.parse_expression(\"1 + 2\")"
]
},
{
"metadata": {
"raw_mimetype": "text/restructuredtext"
},
"cell_type": "raw",
"source": [
"|\n",
"Display Source Code CST\n",
"=======================\n",
"The default :class:`~libcst.CSTNode` repr provides pretty print formatting for displaying the entire CST tree."
]
},
{
"metadata": {},
"cell_type": "code",
"outputs": [],
"execution_count": null,
"source": "print(source_tree)"
},
{
"metadata": {},
"cell_type": "raw",
"source": "The entire CST tree may be overwhelming at times. To only focus on essential elements of the CST tree, LibCST provides the ``dump`` helper."
},
{
"metadata": {},
"cell_type": "code",
"outputs": [],
"execution_count": null,
"source": [
"from libcst.display import dump\n",
"\n",
"print(dump(source_tree))"
]
},
{
@ -50,9 +86,11 @@
"raw_mimetype": "text/restructuredtext"
},
"source": [
" \n",
"|\n",
"Example: add typing annotation from pyi stub file to Python source\n",
"Python `typing annotation <https://mypy.readthedocs.io/en/latest/cheat_sheet_py3.html>`_ was added in Python 3.5. Some Python applications add typing annotations in separate ``pyi`` stub files in order to support old Python versions. When applications decide to stop supporting old Python versions, they'll want to automatically copy the type annotation from a pyi file to a source file. Here we demonstrate how to do that easliy using LibCST. The first step is to parse the pyi stub and source files as trees."
"Python `typing annotation <https://mypy.readthedocs.io/en/latest/cheat_sheet_py3.html>`_ was added in Python 3.5. Some Python applications add typing annotations in separate ``pyi`` stub files in order to support old Python versions. When applications decide to stop supporting old Python versions, they'll want to automatically copy the type annotation from a pyi file to a source file. Here we demonstrate how to do that easily using LibCST. The first step is to parse the pyi stub and source files as trees."
"For traversing and modifying the tree, LibCST provides Visitor and Transformer classes similar to the `ast module <https://docs.python.org/3/library/ast.html#ast.NodeVisitor>`_. To implement a visitor (read only) or transformer (read/write), simply implement a subclass of :class:`~libcst.CSTVisitor` or :class:`~libcst.CSTTransformer` (see :doc:`Visitors <visitors>` for more detail).\n",
"In the typing example, we need to implement a visitor to collect typing annotation from the stub tree and a transformer to copy the annotation to the function signature. In the visitor, we implement ``visit_FunctionDef`` to collect annotations. Later in the transformer, we implement ``leave_FunctionDef`` to add the collected annotations."
"In the typing example, we need to implement a visitor to collect typing annotations from the stub tree and a transformer to copy the annotation to the function signature. In the visitor, we implement ``visit_FunctionDef`` to collect annotations. Later in the transformer, we implement ``leave_FunctionDef`` to add the collected annotations."
]
},
{
@ -113,7 +152,7 @@
" self.stack: List[Tuple[str, ...]] = []\n",
" # store the annotations\n",
" self.annotations: Dict[\n",
" Tuple[str, ...], # key: tuple of cononical class/function name\n",
" Tuple[str, ...], # key: tuple of canonical class/function name\n",
"Generating the source code from a cst tree is as easy as accessing the :attr:`~libcst.Module.code` attribute on :class:`~libcst.Module`. After the code generation, we often use `Black <https://black.readthedocs.io/en/stable/>`_ and `isort <https://isort.readthedocs.io/en/stable/>`_ to reformate the code to keep a consistent coding style."
"Generating the source code from a cst tree is as easy as accessing the :attr:`~libcst.Module.code` attribute on :class:`~libcst.Module`. After the code generation, we often use `ufmt <https://ufmt.omnilib.dev/en/stable/>`_ to reformat the code to keep a consistent coding style."