* Codemod for PEP 484 Assign w / type comments -> PEP 526 AnnAssign
Summary:
This codemod is intended to eventually handle all type comments from
PEP 484. This is a partial implementation specifically handling
assignment type comments, which as of PEP 526 are better dealt
with using AnnAssign nodes.
There is more work to do because there are two other kinds of
comments to support: function heading comments and function parameter
inline comments. But the PEP 526 functionality is complete so I feel
like it's worth havign a PR / CI signals / code review at this stage.
Test Plan:
```
python -m unittest libcst.codemod.commands.tests.test_convert_type_comments
```
* Disable on python 3.6, 3.7
The ast module didn't get the `type_comment` information we need
until python 3.8.
It is possible but not a priority right now to enable 3.6 and 3.7
via the typed_ast library, for now I just throw a NotImplementedError
with a nice description. There's a note in the code about where to look
for a typed_ast example in case anyone wants to add support in the
future.
* Fix type errors on the 3.8+ testing fix
* Do a better job of complaining on Python < 3.8
* Updates based on code review
Summary:
Do not strip type comments in the visitor pattern; instead,
reach down from the parent to do it because this makes it
much more reliable that we won't accidentally remove
other comments in a codemod (using visitor state to do this
isn't really feasible once we handle complex statements like
FunctionDef, With, For).
Handle multi-statement statement lines; this works since the
trailing whitespace can only apply to the final statement on
the line. It's not really a critical edge case to handle, but
the code is no more complicated so we might as well.
* Prevent comment stripping for multi-assign
* Note in the docstring that this is a limited WIP
* Reorder checks so the next step will be cleaner
* Use precise signature matching when inserting function type annotations
* add type annotations
* Add an argument for strict annotation matching.
* don't use Any
* Support relative imports in AddImportsVisitor.
* Adds an Import dataclass to represent a single imported object
* Refactors AddImportsVisitor to pass around Import objects
* Separates out the main logic in get_absolute_module_for_import so that
it can be used to resolve relative module names outside of a cst.Import
node
* Resolves relative module names in AddImportsVisitor if we have a
current module name set.
Fixes#578
On the python side, we can add parentheses from MaybeSentinel.DEFAULT if the whitespace requires it.
On the rust side, we support the new grammar but codegen will only add explicitly included parentheses for now - it should be possible to match python behavior but it's not urgent so I've left a TODO
It took me some time to track down the root cause of `children`
not matching codegen, having the error message directly hint that
visit and codegen are probably mimatched (my visit was running
out-of-order) will likely help newbies get going faster.
* ParenthesizedNode implementation for Box
* match statement rust CST and grammar
* match statement python CST and docs
* run rust unit tests in release mode for now
This massive PR implements an alternative Python parser that will allow LibCST to parse Python 3.10's new grammar features. The parser is implemented in Rust, but it's turned off by default through the `LIBCST_PARSER_TYPE` environment variable. Set it to `native` to enable. The PR also enables new CI steps that test just the Rust parser, as well as steps that produce binary wheels for a variety of CPython versions and platforms.
Note: this PR aims to be roughly feature-equivalent to the main branch, so it doesn't include new 3.10 syntax features. That will be addressed as a follow-up PR.
The new parser is implemented in the `native/` directory, and is organized into two rust crates: `libcst_derive` contains some macros to facilitate various features of CST nodes, and `libcst` contains the `parser` itself (including the Python grammar), a `tokenizer` implementation by @bgw, and a very basic representation of CST `nodes`. Parsing is done by
1. **tokenizing** the input utf-8 string (bytes are not supported at the Rust layer, they are converted to utf-8 strings by the python wrapper)
2. running the **PEG parser** on the tokenized input, which also captures certain anchor tokens in the resulting syntax tree
3. using the anchor tokens to **inflate** the syntax tree into a proper CST
Co-authored-by: Benjamin Woodruff <github@benjam.info>
* Add ImportAssignment class and record it from Scope
* Add overrides for LocalScope and ClassScope
* Clean scope_provider code and use ImportAssignment class in `unusued_imports` codemod
* Add missing types
* Fix fixit errors
* Fix ScopeProvider when string type annotation is unparsable
* Handle nested function calls w/in type declarations
* Edit stack in place
* Add unparsed test to test_cast
* Swallow parsing errors in string annotations.
This is the same behavior as cPython.
I've also rewritten the test that was relying on this exception to check where type parsing was happening
* Fix pyre error
Based on diff review of https://github.com/Instagram/LibCST/pull/536,
I investigated relatvie import handling and realized that with minor
changes we can now handle them correctly.
Relative imports aren't likely in code coming from an automated
tool, but they could happen in hand-written stubs if anyone tries
to use this codemod tool to merge stubs with code.
Added a new test:
```
> python -m unittest libcst.codemod.visitors.tests.test_apply_type_annotations
.............................................
----------------------------------------------------------------------
Ran 45 tests in 2.195s
OK
```
Make sure that the MetadataWrapper to resolves the requested providers vs the existing metadata results and prevent a single provider from being invoked multiple times.
Note: I'm pushing this because it works, but I actually want to
add annotation counting first and then modify the code so that we
only add the import if an annotation was actually included.
In ApplyTypeAnnotationsVisitor, there are edge cases where we
might have changed the module imports even though we never wound
up applying any type annotations.
This will become even more common if we support adding
`from __future__ import annotations`, which I would like to do
soon.
To handle this, we can simply return the original tree from
`transform_module_impl` (discarding any changes from either
`self` or `AddImportsVisitor`) whenever there are no changes
in `self.annotation_counts`.
I updated the no-annotations-changed test to reflect this:
```
> python -m unittest libcst.codemod.visitors.tests.test_apply_type_annotations.TestApplyAnnotationsVisitor
...............................................
----------------------------------------------------------------------
Ran 47 tests in 2.312s
OK
```
The existing TypeCollector visitor logic attempted to
fold actual imports from stubs together with the module
we were annotating, and separately do nice things with the
names of types so that we could parse stubs written either
with various sorts of proper imports *or* stubs written
using bare fully-qualified type names (which isn't
actually legal python, but is easy to produce from automated
tools like `pyre infer`).
In this commit I simplify things in principle - meaning the
data flow is simpler, although the code is still similarly
complex - by using `QualifiedNameProvider` plus a fallback
to `get_full_name_for_node` to handle all cases via
fully-qualified names, so that the way a stub chooses to
lay out its imports is no longer relevant to how we will
understand it.
As a result, we can scrap a whole test suite where we
were understanding edge cases in the import handling, and
moreover one of the weird unsupported edge cases is now
well supported.
The tests got simpler because some edge cases no longer
matter (the whole imports test is no longer relevant),
and a couple of weird edge cases were fixed.
I ran tests with
```
python -m unittest libcst.codemod.visitors.tests.test_apply_type_annotations.TestApplyAnnotationsVisitor
```
I tried to make this change minimal in that I preserve the
existing data flow, so that it's easy to review. But it's worth
considering whether to follow up with a diff where we change
the TypeAnnotationCollector into a *transform* rather than a
*visitor*, because that would allow us to scrap quite a bit
of logic - all we would need to know is a couple of bits
of context from higher up in the tree and we could process
Names and Attributes without needing all this recursion.
Refactor ApplyTypeAnnotationsVisitor so that all annotation
information is added via a smart constructor method starting
with `_apply_annotation_to`. This makes it much easier to
skim the code and understand where annotations are actually
added with a simple forward search.
Then, add an AnnotationCounts dataclass and count up all the
annotations we add inside the transform. This should be helpful
for a few reasons:
- First, it just makes counting the annotations easier. Prior
to this change, we would have to run some separate command
to count annotations before and after a codemod, which is
not as convenient as doing it directly, and would also fail
to account for cases where we changed an annotation.
- Second, I want to be able to avoid altering the import
statements in cases where we never actually made any changes.
Having annotation counts will help us do this - we can just
return the original tree (without import changes) in that
situtation.
```
> python -m unittest libcst.codemod.visitors.tests.test_apply_type_annotations.TestApplyAnnotationsVisitor
................................................
----------------------------------------------------------------------
Ran 48 tests in 1.773s
OK
```
(
I'm not really sure how the method got there, but it was calling
itself recursively... fortunately, it was also overwritten by
an identically named method so it was actually impossible to access.