I need to do some additional work on visit/leave to make codegen
re-entrant, so this makes it more generic.
This should have the small additional benefit of creating fewer
throwaway objects when we're doing codegen without position calculation.
It turns out that when we had a sequence type, like Module's body, we were incorrectly
leaning on copy.deepcopy for the rest of the tree. This works, but it's likely to blow
the stack when doing a deep clone of a very large file. Fix that oversight here.
Previously, `libcst.Module.code_for_node` accepted a `provider`
parameter, and would construct the appropriate CodegenState subclass
based on some if/else logic.
This had a few knock-on effects:
- A tighter circular dependency between node definitions and metadata,
which was previously mitigated with an inner import.
- Adding a new `CodegenState` subclass required the non-obvious task of
modifying `Module`. I'll need to add a new `CodegenState` subclass to
support incremental codegen.
- What was intended to be a private implementation detail (how positions
are computed by hooking into codegen) was exposed as a parameter on a
public method.
This diff aims to clean up those knock-on effects. The position-related
subclasses have been moved from `libcst.nodes._internal` into
`libcst.metadata.position_provider`, which keeps more of the position
computation logic together.
Technically this is a breaking change. If somebody was passing the
second parameter into `code_for_node`, their code will break. However:
- It will break in a clear and obvious way.
- This second parameter was never documented (aside from my recent
addition of some remarks telling people not to use it). There's plenty
of documentation that shows how to fetch positions properly.
So it's my opinion that we shouldn't require a major version bump for
this change.
While these classes are used by the codegen implementation, conceptually
they're part of `libcst.metadata`, so we should export them from
`libcst.metadata` instead of the top-level `libcst` package.
This makes sure we always wrap elements in a SubscriptElement, even when there
is only one element. This makes things more regular while remaining backwards
compatible with existing construction code. The change has two halves, which can't
be split into separate commits without breaking the build in between. The first half
is the changes to the parser and the matching test updates, including a test to
be sure we can still render code built with the old construction style. The second half
is the changes to codegen, which made assumptions about `Subscript`; these demonstrate
the need for this change in the first place. This also includes a fix to the
`CSTNode.with_deep_changes` type to make it more correct and more usable in
transforms without additional type assertions.
Add a RemoveFromParent() function as a convenience for returning RemovalSentinel.REMOVE.
Introduce a `deep_remove()` on CSTNode, analogous to `deep_replace()` but for removing a node.
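
To illustrate the sentinel pattern described above, here is a minimal sketch on
nested Python lists. The names `RemovalSentinel` and `RemoveFromParent` mirror
the ones mentioned in this message, but `visit_items` and the list-based
`deep_remove` are hypothetical stand-ins and not libcst's implementation.

```python
from enum import Enum, auto


class RemovalSentinel(Enum):
    # Returned by a visitor callback to ask the parent to drop the child.
    REMOVE = auto()


def RemoveFromParent():
    # Convenience wrapper so callers don't have to spell out the enum member.
    return RemovalSentinel.REMOVE


def visit_items(items, leave):
    # Hypothetical helper: rebuild a list, dropping children whose
    # callback result is the removal sentinel.
    result = []
    for item in items:
        updated = leave(item)
        if updated is RemovalSentinel.REMOVE:
            continue
        result.append(updated)
    return result


def deep_remove(tree, target):
    # Toy analogue of a deep removal: drop every occurrence of `target`
    # (compared by identity) anywhere in a nested list.
    def leave(item):
        if item is target:
            return RemoveFromParent()
        if isinstance(item, list):
            return visit_items(item, leave)
        return item

    return visit_items(tree, leave)
```

The parent, not the child, performs the removal: the callback only signals
intent by returning the sentinel, which keeps the traversal code in one place.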
We used _visit_and_replace_children to implement deep_clone, which was already
bad since it relied on the implicit behavior of _visit_and_replace_children
making a copy on visit. If we changed that behavior in the future, deep_clone would
break. However, it's even more broken than that, because nodes that subclass
BaseLeaf define their _visit_and_replace_children as returning self, so leaves
were never copied. So, not only is this a bad coupling, but it's also incorrect
today. Implement deep_clone properly here.
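
A minimal sketch of what a dedicated deep clone can look like, assuming frozen
dataclass nodes. The `Node`, `Name` and `Module` classes and the `_clone_value`
helper are hypothetical stand-ins, not libcst's actual classes. The point is
that every node, including leaves, is rebuilt explicitly rather than relying on
a visitor's copying side effect.

```python
from dataclasses import dataclass, fields, replace


def _clone_value(value):
    # Clone child nodes and sequences of nodes; plain values are shared as-is.
    if isinstance(value, Node):
        return value.deep_clone()
    if isinstance(value, (list, tuple)):
        return type(value)(_clone_value(v) for v in value)
    return value


@dataclass(frozen=True)
class Node:
    def deep_clone(self):
        # Rebuild this node from cloned field values. replace() always
        # constructs a new instance, so leaves are copied too.
        changes = {f.name: _clone_value(getattr(self, f.name)) for f in fields(self)}
        return replace(self, **changes)


@dataclass(frozen=True)
class Name(Node):
    # Leaf node: no child nodes, only a plain value.
    value: str = ""


@dataclass(frozen=True)
class Module(Node):
    # Node holding a sequence of children.
    body: tuple = ()
```

For example, `Module(body=(Name("a"),)).deep_clone()` compares equal to the
original but shares no node objects with it.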
When parsing, we don't always fill in defaults unless we have a good reason to. That, coupled with the fact that we use dataclasses, where a plain default value is evaluated once at class definition rather than each time an instance is constructed, means that we accidentally aliased a whole bunch of SimpleWhitespace nodes. Fix that by switching to the dataclasses field() function with a default factory, which is evaluated at construction time. We do this through a simple (untyped, unfortunately) helper on CSTNode which makes these fields easier to declare.
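
The aliasing can be reproduced with the standard library alone. In this sketch
`Whitespace`, `AliasedNode`, `FreshNode` and the `fresh` helper are all
hypothetical names chosen for illustration; the real helper on CSTNode may look
different.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Whitespace:
    # Toy stand-in for a whitespace node.
    value: str = ""


@dataclass(frozen=True)
class AliasedNode:
    # The default is built once, when the class body is evaluated, so every
    # AliasedNode created without an argument shares the same object.
    whitespace: Whitespace = Whitespace(" ")


def fresh(node_type, *args, **kwargs):
    # Hypothetical untyped helper: defer construction of the default
    # until each instance is created.
    return field(default_factory=lambda: node_type(*args, **kwargs))


@dataclass(frozen=True)
class FreshNode:
    # The factory runs per construction, so each node gets its own object.
    whitespace: Whitespace = fresh(Whitespace, " ")
```

Because `field()` returns a `Field` object rather than the annotated type, a
helper like this is hard to type precisely, which matches the "untyped" caveat
above.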
Standardize on the convention that private modules (those we don't expect people to import directly) are prefixed with an underscore. Everything under a directory/module that has an underscore is considered private, unless it is re-exported from a non-underscored module. Most things are exported from libcst directly, but there are a few things in libcst.tool, libcst.codegen and libcst.metadata that are namespaced as such.