which ensures we won't have inconsistent black-vs-isort errors
going forward. We can always format by running `ufmt format .`
at the root, and check with `ufmt check .` in our CI actions.
**Context:** This is an experimental performance optimization that we're
hoping to use for our internal linter at Instagram. I added some
documentation, but the feature is unsupported and isn't very user-friendly.
This adds `ExperimentalReentrantCodegenProvider`, which tracks the
codegen's internal state (indentation level, character offsets,
encoding, etc.) and stores a `CodegenPartial` object for each
statement.
The `CodegenPartial` object has enough information about the previous
codegen pass to run the codegen on part of a tree and patch the result
back into the original module's string.
In cases where we need to generate a bunch of small independent patches
for the same file (and we can't just generate a new tree with each patch
applied), this *should* be a faster alternative.
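
To illustrate the patching idea, here is a minimal sketch of splicing regenerated code for one statement back into the original string. The `Partial` class and `apply_patch` function are hypothetical names invented for this example; the real `CodegenPartial` also carries indentation and encoding state, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """Hypothetical stand-in for a per-statement record (not LibCST's CodegenPartial)."""

    start: int  # character offset where the statement's code begins
    end: int  # character offset just past the statement's code


def apply_patch(source: str, partial: Partial, new_code: str) -> str:
    """Splice regenerated code for one statement into the original source."""
    return source[: partial.start] + new_code + source[partial.end :]


source = "x = 1\ny = 2\nz = 3\n"
# Offsets that a first, full codegen pass would have recorded for "y = 2\n".
partial = Partial(start=6, end=12)
patched = apply_patch(source, partial, "y = 20\n")
```

Each patch here is computed against the same original string, so many small independent patches can be produced without regenerating the whole module each time.
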
I don't have any performance numbers because I still need to test this
end-to-end with our internal codebase, but I'd be shocked if it was
slower than what we're doing.
This could theoretically live outside of LibCST, but it depends on a
whole bunch of LibCST internals, so there's some value in making sure
that this is in sync with the rest of LibCST.
Previously, `libcst.Module.code_for_node` accepted a `provider`
parameter, and would construct the appropriate CodegenState subclass
based on some if/else logic.
This had a few knock-on effects:
- A tighter circular dependency between node definitions and metadata,
which was previously mitigated with an inner import.
- Adding a new `CodegenState` subclass required the non-obvious task of
modifying `Module`. I'll need to add a new `CodegenState` subclass to
support incremental codegen.
- What was intended to be a private implementation detail (how positions
are computed by hooking into codegen) was exposed as a parameter on a
public method.
This diff aims to clean up those knock-on effects. The position-related
subclasses have been moved from `libcst.nodes._internal` into
`libcst.metadata.position_provider`, which keeps more of the position
computation logic together.
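
The shape of this refactor can be sketched roughly as follows. All names below are simplified, hypothetical stand-ins (a "node" is just a list of tokens), not the actual LibCST classes; the point is only that the public entry point no longer chooses a state subclass, and the position code constructs its own.

```python
from typing import List


class CodegenState:
    """Hypothetical minimal codegen state: collects emitted tokens."""

    def __init__(self) -> None:
        self.tokens: List[str] = []

    def add_token(self, token: str) -> None:
        self.tokens.append(token)


class PositionTrackingState(CodegenState):
    """Hypothetical subclass that also records each token's starting offset."""

    def __init__(self) -> None:
        super().__init__()
        self.offsets: List[int] = []
        self._length = 0

    def add_token(self, token: str) -> None:
        self.offsets.append(self._length)
        self._length += len(token)
        super().add_token(token)


def codegen(tokens: List[str], state: CodegenState) -> str:
    """Stand-in for node-level codegen: it works with any state object."""
    for token in tokens:
        state.add_token(token)
    return "".join(state.tokens)


def code_for_node(tokens: List[str]) -> str:
    """Public entry point after the change: no provider parameter, no if/else."""
    return codegen(tokens, CodegenState())


def compute_positions(tokens: List[str]) -> List[int]:
    """Lives alongside the metadata code and picks its own state subclass."""
    state = PositionTrackingState()
    codegen(tokens, state)
    return state.offsets
```

With this layout, adding another state subclass (for example, one for incremental codegen) would not require touching `code_for_node`.
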
Technically this is a breaking change. If somebody was passing the
second parameter into `code_for_node`, their code will break. However:
- It will break in a clear and obvious way.
- This second parameter was never documented (aside from my recent
addition of some remarks telling people not to use it). There's plenty
of documentation that shows how to fetch positions properly.
So it's my opinion that we shouldn't require a major version bump for
this change.
I discussed the high-level idea here with @DragonMinded a few months
ago, but this isn't set in stone. If people have better ideas for names,
I'd love to hear them.
Publicly-Visible Changes
------------------------
- SyntacticPositionProvider is deprecated. The new name is
PositionProvider.
- BasicPositionProvider is deprecated. The new name is
WhitespaceInclusivePositionProvider.
- Documentation is updated to better explain these renamed providers and
how to use them.
The prefixes "Syntactic" and "Basic" were pretty bad because they're
just concepts that we made up for LibCST.
The idea for the new names is that most users will want the provider
formerly called SyntacticPositionProvider, so we should name things so
that the user will naturally gravitate towards the correct choice.
There's some argument that we shouldn't even bother exposing
WhitespaceInclusivePositionProvider, but we already need to implement it
as a fallback for PositionProvider, and it might be useful for some
niche use-cases.
Once we have another major version bump, we can remove the old class
names. The old class names have already been removed from the
documentation so that new users aren't tempted to use them.
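
One common way to keep a deprecated class name working until the next major version is a thin subclass that warns on construction. The sketch below is an assumption about how such an alias could look, using a hypothetical stand-in for `PositionProvider`; it is not necessarily how LibCST implements the deprecation.

```python
import warnings


class PositionProvider:
    """Hypothetical stand-in for the renamed provider."""


class SyntacticPositionProvider(PositionProvider):
    """Deprecated name, kept so that existing imports continue to work."""

    def __init__(self) -> None:
        # Point the warning at the caller's line rather than this one.
        warnings.warn(
            "SyntacticPositionProvider is deprecated; use PositionProvider",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__()
```

Because the old name subclasses the new one, `isinstance` checks against `PositionProvider` keep passing for code that still constructs the deprecated class.
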
Internal-Only Changes
---------------------
- `PositionProvider` is now `_PositionProviderUnion`. This type alias
was never a public API (and probably never will be).
- `BasicCodegenState` is now
`WhitespaceInclusivePositionProvidingCodegenState`.
- `SyntacticCodegenState` is now `PositionProvidingCodegenState`.
Standardize on the convention that private modules (those we don't expect people to directly import) are prefixed with an underscore. Everything under a directory/module whose name starts with an underscore is considered private, unless it is re-exported from a non-underscored module. Most things are exported from libcst directly, but there are a few things in libcst.tool, libcst.codegen and libcst.metadata that are namespaced as such.
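
As a rough illustration of the convention, the check below treats a dotted module path as private when any component starts with an underscore. The helper `is_private` and the example paths are hypothetical; the check deliberately ignores the re-export exception, which cannot be determined from the path alone.

```python
def is_private(dotted_name: str) -> bool:
    """Return True if any component of a dotted module path starts with an underscore."""
    return any(part.startswith("_") for part in dotted_name.split("."))


# Illustrative paths only; they are not a statement about the actual package layout.
examples = {
    "libcst": is_private("libcst"),
    "libcst.metadata": is_private("libcst.metadata"),
    "libcst._example.module": is_private("libcst._example.module"),
}
```
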