From the CircleCI Pyre tests:
> libcst/_typed_visitor_base.py:15:0 Invalid type [31]: Expression `Variable[F (bound to typing.Callable[[libcst._typed_visitor.CSTTypedBaseFunctions, Variable[libcst._typed_visitor_base.T]], None])]` is not a valid type. Type variables cannot contain other type variables in their constraints.
This is especially helpful for checking qualified names of nodes against one item
or a list of items that you wish to match against. I chose to create a new matcher
instead of widening the type of `MatchMetadata` to take in either a value or
a callable. I was originally going to do the former, but having a MatchIfTrue
and a MatchMetadataIfTrue felt more orthogonal and became easier to explain than
a single MatchIfTrue that could take two types of values.
Findall does what you expect given a matcher and a tree: It returns all nodes
that exist in that tree which match the given matcher. For convenience, findall
works with regular LibCST trees as well as MetadataWrappers. It is also provided
as a helper method on the matcher transforms.
This does a few things, since it was easier to combine than separate:
- Uses type aliases instead of fully-unrolled types, making it far easier for a human to read.
- Makes codegen approximately 3x faster, which has the side effect of halving our test run times.
- Consolidates the way we use type aliases by dropping the MetadataPredicate type for now, increasing consistency.
- Widens the types for AllOf/OneOf matchers to allow MatchIfTrue, since this is already supported under the hood.
This results in a generated matchers file that is 1/3 the size it was, more readable by a human, and most importantly, faster to codegen, parse, format and typecheck.
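To illustrate the alias consolidation (the alias names below are hypothetical; the generated file's real aliases differ):

```python
from typing import Callable, Sequence, Union

# Fully-unrolled form: the whole union is repeated at every use site.
Unrolled = Union[
    str,
    Callable[[str], bool],
    Sequence[Union[str, Callable[[str], bool]]],
]

# Aliased form: name the inner union once and reuse it.
ValueMatch = Union[str, Callable[[str], bool]]
Aliased = Union[ValueMatch, Sequence[ValueMatch]]

# Both spell the same type; the aliased one is just shorter to emit and read.
print(Unrolled == Aliased)  # True
```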
In certain cases (e.g. inside Instagram's lint framework) we know that
our tree originates from the parser, so we know that there shouldn't be
any duplicate nodes in our tree.
MetadataWrapper exists to copy the tree, ensuring that there are no
duplicate nodes.
This diff provides an escape hatch on MetadataWrapper that allows us to
save a little time and avoid a copy when we know that it's safe to skip
the copy.
As part of this, I ran into some issues with `InitVar` and pyre, so I
removed `@dataclass` from the class. This means that this is technically
a breaking change if someone depended on the MetadataWrapper being an
actual dataclass, but I think this is unlikely. I implemented `__repr__`
and added tests for hashing/equality behavior.
We don't need to worry about formatting changes since we already paid the cost
in b3253de9b8. So, now that there's a new 3.8-compatible
Black, let's use it!
**Context:** This is an experimental performance optimization that we're
hoping to use for our internal linter at Instagram. I added some
documentation, but it's unsupported, and isn't very user-friendly.
This adds `ExperimentalReentrantCodegenProvider`, which tracks the
codegen's internal state (indentation level, character offsets,
encoding, etc.) and for each statement, it stores a `CodegenPartial`
object.
The `CodegenPartial` object has enough information about the previous
codegen pass to run the codegen on part of a tree and patch the result
back into the original module's string.
In cases where we need to generate a bunch of small independent patches
for the same file (and we can't just generate a new tree with each patch
applied), this *should* be a faster alternative.
I don't have any performance numbers because I still need to test this
end-to-end with our internal codebase, but I'd be shocked if it was
slower than what we're doing.
This could theoretically live outside of LibCST, but it depends on a
whole bunch of LibCST internals, so there's some value in making sure
that this is in sync with the rest of LibCST.
I need to do some additional work on visit/leave to make codegen
re-entrant, so this makes it more generic.
This should also have a small positive side effect: we create fewer
throwaway objects when doing codegen without position calculation.
There are going to be many more where these came from, but these are the three cases that have come up most frequently, so I started with them. Hopefully this helps give additional direction to people using LibCST.
It turns out that when we had a sequence type, like Module's body, we were incorrectly
leaning on copy.deepcopy for the rest of the tree. This works, but it's likely to blow
the stack when doing a deep clone of a very large file. Fix that oversight here.
Let's split from fuzzing only on the default Python version to having a separate fuzzer for each version of Python we support. This should help us catch any issues arising from grammar differences across releases. Right now we're fuzz-clean as well, so I also bumped up the thresholds to stress-test LibCST more. I anticipate that this will be useful when we begin work on 3.8 support.