Previously, `libcst.Module.code_for_node` accepted a `provider`
parameter, and would construct the appropriate CodegenState subclass
based on some if/else logic.
This had a few knock-on effects:
- A tighter circular dependency between node definitions and metadata,
which was previously mitigated with an inner import.
- Adding a new `CodegenState` subclass required the non-obvious task of
modifying `Module`. I'll need to add a new `CodegenState` subclass to
support incremental codegen.
- What was intended to be a private implementation detail (how positions
are computed by hooking into codegen) was exposed as a parameter on a
public method.
This diff aims to clean up those knock on effects. The position-related
subclasses have been moved from `libcst.nodes._internal` into
`libcst.metadata.position_provider`, which keeps more of the position
computation logic together.
Technically this is a breaking change. If somebody was passing the
second parameter into `code_for_node`, their code will break. However:
- It will break in a clear and obvious way.
- This second parameter was never documented (aside from my recent
addition of some remarks telling people not to use it). There's plenty
of documentation that shows how to fetch positions properly.
So it's my opinion that we shouldn't require a major version bump for
this change.
This allows a slightly more convenient way of calling 'matches' with metadata inside a transform/visitor which also supports matcher decorators. It also hooks in automatic metadata connection and lookup for matcher decorators themselves.
This adds the ability to match on metadata for a node as if it was an attribute on the node. It also opens up the ability to match on node attributes via metadata only. It allows for metadata match specifications with similar flexibility to standard matchers.
If you have such a program like "pass\\\n", this is technically a program without a trailing newline, since line continuations are defined as being a `\` followed by a newline. We were misdetecting this as having a trailing newline, thus making it impossible to parse the continuation. Add some tests to verify this behavior and then fix the problem.
Note that this was found via hypothesis.
I missed this in #114 when renaming `SyntacticPositionProvider` to
`PositionProvider` because it's a different file extension and I was
only grepping for rst and py files.
While these classes are used by the codegen implementation, conceptually
they're part of `libcst.metadata`, so we should export them from
`libcst.metadata` instead of the top-level `libcst` package.
This makes sure we always wrap elements in a SubscriptElement, even when there
is only one element. This makes things more regular while still being backwards
compatible with existing creation. The meat of this is in two halves, which can't
be split due to not wanting to break the build between commits. The first half
is just the changes to the parser and updates to tests. This includes a test to
be sure we can still render code that uses old construction types. The second half
is changes to codegen which made assumptions about `Subscript` and demonstrates
the need to make this change in the first place. This includes a fix to
`CSTNode.with_deep_changes` type to make it more correct and also more usable in
transforms without additional type assertions.
I discussed the high-level idea here with @DragonMinded a few months
ago, but this isn't set in stone. If people have better ideas for names,
I'd love to hear it.
Publicly-Visible Changes
------------------------
- SyntacticPositionProvider is deprecated. The new name is
PositionProvider.
- BasicPositionProvider is deprecated. The new name is
WhitespaceInclusivePositionProvider.
- Documentation is updated to better explain these renamed providers and
how to use them.
The prefixes "Syntactic" and "Basic" were pretty bad because they're
just concepts that we made up for LibCST.
The idea for the new names is that most users will want the
SyntacticPositionProvider, and so we should name things so that the user
will naturally gravitate towards the correct choice.
There's some argument that we shouldn't even bother exposing
WhitespaceInclusivePositionProvider, but we already need to implement it
as a fallback for PositionProvider, and it might be useful for some
niche use-cases.
Once we have another major version bump, we can remove the old class
names. The old class names have already be removed from the
documentation so that new users aren't tempted to use them.
Internal-Only Changes
---------------------
- `PositionProvider` is now `_PositionProviderUnion`. This type alias
was never a public API (and probably never will be).
- `BasicCodegenState` is now
`WhitespaceInclusivePositionProvidingCodegenState`.
- `SyntacticCodegenState` is now `PositionProvidingCodegenState`.
This is somewhat complicated by the fact that we need to not just allow
construction of nodes/matchers using `ExtSlice` still for backwards compatibility,
but we also need to be able to call `visit_ExtSlice` and `leave_ExtSlice` on
old visitors even though the new node is named `SubscriptElement`. The
construction/instance check/matching side of things will work since internally we
refer to everything as `SubscriptElement` and alias `ExtSlice` to this everywhere,
but for string-based function lookup, we need to get a little more clever and make
the default `visit_SubscriptElement` delegate onward to `visit_ExtSlice` so that
either form works.
This can all be removed again once we're past the deprecation period for ExtSlice.
Explaining the implementation details of scopes to someone unfamiliar
with compilers can be tricky. Hopefully this helps.
- Rephrased the definition of a scope to be more applicable to Python
(remove references to "blocks"), and made it use an example for
(hopefully) better clarity.
- New scopes are also created for comprehensions.
- Set a fixed width (400px) for the scope diagram, since it was too
large before.
- Tweaked some tenses.
- Add a final call to action: "LibCST allows you to inspect these
scopes"
I noticed that the typed visitors codegen was creating messy types such as Union[SingleType]. We have a clean-up for Union[SingleType] snuck into gen_matcher_classes. So, lets remove that snuck-in clean-up from gen_matcher_classes and apply it globally to all codegen. This makes its purpose a lot more obvious, while helping decouple necessary codegen from prettifying/simplifying transforms. It also cleans up typed visitors, so that's a bonus.
This makes matcher equality and hash equivalent to the way LibCST nodes behave. Not only does this make us more consistent, but it also fixes a bug where matcher decorators could not be used with a matcher that initialized a sequence type as a list.
Because we'd consider `BaseMetadataProvider[int]` to be a subtype of
`BaseMetadataProvider[object]`, it should be covariant over its
typevar, rather than invariant.
This isn't entirely correct because we have a mutable data structure
(`_computed`) that depends on the typevar, and pyre points this out
(though with a really confusing error message). However, it's not
correct to say that `BaseMetadataProvider` is invariant either, so I
think this is the lesser evil.
I don't think it's practical to redesign this API to avoid the variance
issue, so I'm ignoring the new type error that results from this change.
I think this may resolve some of the issues we've seen internally with
D17820032.