mirror of
https://github.com/Instagram/LibCST.git
synced 2025-12-23 10:35:53 +00:00
192 lines
8.5 KiB
ReStructuredText
192 lines
8.5 KiB
ReStructuredText
==============
|
|
Best Practices
|
|
==============
|
|
|
|
While there are plenty of ways to interact with LibCST, we recommend some patterns
|
|
over others. Various best practices are laid out here along with their justifications.
|
|
|
|
Avoid ``isinstance`` when traversing
|
|
------------------------------------
|
|
|
|
Excessive use of ``isinstance`` implies that you should rewrite your check as a
|
|
matcher or unroll it into a set of visitor methods. Often, you should make use of
|
|
:func:`~libcst.ensure_type` to make your type checker aware of a node's type.
|
|
|
|
Often it is far easier to use :ref:`libcst-matchers` over explicit instance checks
|
|
in a transform. Matching against some pattern and then extracting a value from a
|
|
node's child is often easier and far more readable. Unfortunately this clashes with
|
|
various type-checkers which do not understand that :func:`~libcst.matchers.matches`
|
|
guarantees a particular set of children. Instead of instance checks, you should use
|
|
:func:`~libcst.ensure_type` which can be inlined and nested.
|
|
|
|
For example, if you have written the following::
|
|
|
|
def get_identifier_name(node: cst.CSTNode) -> Optional[str]:
|
|
if m.matches(node, m.Name()):
|
|
assert isinstance(node, cst.Name)
|
|
return node.value
|
|
return None
|
|
|
|
You could instead write something like::
|
|
|
|
def get_identifier_name(node: cst.CSTNode) -> Optional[str]:
|
|
return (
|
|
cst.ensure_type(node, cst.Name).value
|
|
if m.matches(node, m.Name())
|
|
else None
|
|
)
|
|
|
|
If you find yourself attempting to manually traverse a tree using ``isinstance``,
|
|
you can often rewrite your code using visitor methods instead. Nested instance checks
|
|
can often be unrolled into visitors methods along with matcher decorators. This may
|
|
entail adding additional state to your visitor, but the resulting code is far more
|
|
likely to work after changes to LibCST itself. For example, if you have written the
|
|
following::
|
|
|
|
class CountBazFoobarArgs(cst.CSTVisitor):
|
|
"""
|
|
Given a set of function names, count how many arguments to those function
|
|
calls are the identifiers "baz" or "foobar".
|
|
"""
|
|
|
|
def __init__(self, functions: Set[str]) -> None:
|
|
super().__init__()
|
|
self.functions: Set[str] = functions
|
|
self.arg_count: int = 0
|
|
|
|
def visit_Call(self, node: cst.Call) -> None:
|
|
# See if the call itself is one of our functions we care about
|
|
if isinstance(node.func, cst.Name) and node.func.value in self.functions:
|
|
# Loop through each argument
|
|
for arg in node.args:
|
|
# See if the argument is an identifier matching what we want to count
|
|
if isinstance(arg.value, cst.Name) and arg.value.value in {"baz", "foobar"}:
|
|
self.arg_count += 1
|
|
|
|
You could instead write something like::
|
|
|
|
class CountBazFoobarArgs(m.MatcherDecoratableVisitor):
|
|
"""
|
|
Given a set of function names, count how many arguments to those function
|
|
calls are the identifiers "baz" or "foobar".
|
|
"""
|
|
|
|
def __init__(self, functions: Set[str]) -> None:
|
|
super().__init__()
|
|
self.functions: Set[str] = functions
|
|
self.arg_count: int = 0
|
|
self.call_stack: List[str] = []
|
|
|
|
def visit_Call(self, node: cst.Call) -> None:
|
|
# Store all calls in a stack
|
|
if m.matches(node.func, m.Name()):
|
|
self.call_stack.append(cst.ensure_type(node.func, cst.Name).value)
|
|
|
|
def leave_Call(self, original_node: cst.Call) -> None:
|
|
# Pop the latest call off the stack
|
|
if m.matches(node.func, m.Name()):
|
|
self.call_stack.pop()
|
|
|
|
@m.visit(m.Arg(m.Name("baz") | m.Name("foobar")))
|
|
def _count_args(self, node: cst.Arg) -> None:
|
|
# See if the most shallow call is one we're interested in, so we can
|
|
# count the args we care about only in calls we care about.
|
|
if self.call_stack[-1] in self.functions:
|
|
self.arg_count += 1
|
|
|
|
While there is more code than the previous example, it is arguably easier to understand
|
|
and maintain each part of the code. It is also immune to any future changes to LibCST
|
|
which change's the tree shape. Note that LibCST is traversing the tree completely
|
|
in both cases, so while the first appears to be faster, it is actually doing the same
|
|
amount of work as the second.
|
|
|
|
Prefer ``updated_node`` when modifying trees
|
|
--------------------------------------------
|
|
|
|
When you are using :class:`~libcst.CSTTransformer` to modify a LibCST tree, only return
|
|
modifications to ``updated_node``. The ``original_node`` parameter on any
|
|
``leave_<Node>`` method is provided for book-keeping and is guaranteed to be
|
|
equal via ``==`` and ``is`` checks to the ``node`` parameter in the corresponding
|
|
``visit_<Node>`` method. Remember that LibCST trees are immutable, so the only
|
|
way to make a modification is to return a new tree. Hence, by the time we get to
|
|
calling ``leave_<Node>`` methods, we have an updated tree whose children have been
|
|
modified. Therefore, you should only return ``original_node`` when you want to
|
|
explicitly discard changes performed on the node's children.
|
|
|
|
Say you wanted to rename all function calls which were calling global functions.
|
|
So, you might write the following::
|
|
|
|
class FunctionRenamer(cst.CSTTransformer):
|
|
def leave_Call(self, original_node: cst.Call, updated_node: cst.Call) -> cst.Call:
|
|
if m.matches(original_node.func, m.Name()):
|
|
return original_node.with_changes(
|
|
func=cst.Name(
|
|
"renamed_" + cst.ensure_type(original_node.func, cst.Name).value
|
|
)
|
|
)
|
|
return original_node
|
|
|
|
Consider writing instead::
|
|
|
|
class FunctionRenamer(cst.CSTTransformer):
|
|
def leave_Call(self, original_node: cst.Call, updated_node: cst.Call) -> cst.Call:
|
|
if m.matches(updated_node.func, m.Name()):
|
|
return updated_node.with_changes(
|
|
func=cst.Name(
|
|
"renamed_" + cst.ensure_type(updated_node.func, cst.Name).value
|
|
)
|
|
)
|
|
return updated_node
|
|
|
|
The version that returns modifications to ``original_node`` has a subtle bug. Consider
|
|
the following code snippet::
|
|
|
|
some_func(1, 2, other_func(3))
|
|
|
|
Running the recommended transform will return us a new code snippet that looks like this::
|
|
|
|
renamed_some_func(1, 2, renamed_other_func(3))
|
|
|
|
However, running the version which modifies ``original_node`` will instead return::
|
|
|
|
renamed_some_func(1, 2, other_func(3))
|
|
|
|
That's because the ``updated_node`` tree contains the modification to ``other_func``.
|
|
By returning modifications to ``original_node`` instead of ``updated_node``, we accidentally
|
|
discarded all the work done deeper in the tree.
|
|
|
|
.. _libcst-config_best_practice:
|
|
|
|
Provide a ``config`` when generating code from templates
|
|
--------------------------------------------------------
|
|
|
|
When generating complex trees it is often far easier to pass a string to
|
|
:func:`~libcst.parse_statement` or :func:`~libcst.parse_expression` than it is to
|
|
manually construct the tree. When using these functions to generate code, you should
|
|
always use the ``config`` parameter in order to generate code that matches the
|
|
defaults of the module you are modifying. The :class:`~libcst.Module` class even has
|
|
a helper attribute :attr:`~libcst.Module.config_for_parsing` to make it easy to use. This
|
|
ensures that line endings and indentation are consistent with the defaults in the
|
|
module you are adding the code to.
|
|
|
|
For example, to add a print statement to the end of a module::
|
|
|
|
module = cst.parse_module(some_code_string)
|
|
new_module = module.with_changes(
|
|
body=(
|
|
*module.body,
|
|
cst.parse_statement(
|
|
"print('Hello, world!')",
|
|
config=module.config_for_parsing,
|
|
),
|
|
),
|
|
)
|
|
new_code_string = new_module.code
|
|
|
|
Leaving out the ``config`` parameter means that LibCST will assume some defaults
|
|
and could result in added code which is formatted differently than the rest of the
|
|
module it was added to. In the above example, because we used the config from the
|
|
already-parsed example, the print statement will be added with line endings matching
|
|
the rest of the module. If we neglect the ``config`` parameter, we might accidentally
|
|
insert a windows line ending into a unix file or vice versa, depending on what system
|
|
we ran the code under.
|