mirror of
https://github.com/Instagram/LibCST.git
synced 2025-12-23 10:35:53 +00:00
428 lines
13 KiB
Text
428 lines
13 KiB
Text
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"raw_mimetype": "text/restructuredtext"
|
|
},
|
|
"source": [
|
|
"=====================\n",
|
|
"Working with Matchers\n",
|
|
"=====================\n",
|
|
"Matchers provide a flexible way of comparing LibCST nodes in order to build ",
|
|
"more complex transforms. See :doc:`Matchers <matchers>` for the complete ",
|
|
"documentation.\n",
|
|
"\n",
|
|
"Basic Matcher Usage\n",
|
|
"===================\n",
|
|
"Let's say you are visiting a LibCST :class:`~libcst.Call` node and you want ",
|
|
"to know if all arguments provided are the literal ``True`` or ``False``. ",
|
|
"You look at the documentation and see that ``Call.args`` is a sequence of ",
|
|
":class:`~libcst.Arg`, and each ``Arg.value`` is a :class:`~libcst.BaseExpression`. ",
|
|
"In order to verify that each argument is either ``True`` or ``False`` you ",
|
|
"would have to first loop over ``node.args``, and then check ",
|
|
"``isinstance(arg.value, cst.Name)`` for each ``arg`` in the loop before ",
|
|
"finally checking ``arg.value.value in (\"True\", \"False\")``. \n",
|
|
"\n",
|
|
"Here's a short example of that in action:\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"nbsphinx": "hidden"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"import sys\n",
|
|
"sys.path.append(\"../../\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"import libcst as cst\n",
|
|
"\n",
|
|
"def is_call_with_booleans(node: cst.Call) -> bool:\n",
|
|
" for arg in node.args:\n",
|
|
" if not isinstance(arg.value, cst.Name):\n",
|
|
" # This can't be the literal True/False, so bail early.\n",
|
|
" return False\n",
|
|
" if cst.ensure_type(arg.value, cst.Name).value not in (\"True\", \"False\"):\n",
|
|
" # This is a Name node, but not the literal True/False, so bail.\n",
|
|
" return False\n",
|
|
" # We got here, so all arguments are literal boolean values.\n",
|
|
" return True\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"raw_mimetype": "text/restructuredtext"
|
|
},
|
|
"source": [
|
|
"We can see from a few examples that this does work as intended. ",
|
|
"However, it is an awful lot of boilerplate that was fairly cumbersome to write.\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"call_1 = cst.Call(\n",
|
|
" func=cst.Name(\"foo\"),\n",
|
|
" args=(\n",
|
|
" cst.Arg(cst.Name(\"True\")),\n",
|
|
" ),\n",
|
|
")\n",
|
|
"is_call_with_booleans(call_1)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"call_2 = cst.Call(\n",
|
|
" func=cst.Name(\"foo\"),\n",
|
|
" args=(\n",
|
|
" cst.Arg(cst.Name(\"None\")),\n",
|
|
" ),\n",
|
|
")\n",
|
|
"is_call_with_booleans(call_2)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"raw_mimetype": "text/restructuredtext"
|
|
},
|
|
"source": [
|
|
"Let's try to do a bit better with matchers. We can make a better function ",
|
|
"that takes advantage of matchers to get rid of both the instance check and ",
|
|
"the ``ensure_type`` call, like so:\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"import libcst.matchers as m\n",
|
|
"\n",
|
|
"def better_is_call_with_booleans(node: cst.Call) -> bool:\n",
|
|
" for arg in node.args:\n",
|
|
" if not m.matches(arg.value, m.Name(\"True\") | m.Name(\"False\")):\n",
|
|
" # Oops, this isn't a True/False literal!\n",
|
|
" return False\n",
|
|
" # We got here, so all arguments are literal boolean values.\n",
|
|
" return True\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"raw_mimetype": "text/restructuredtext"
|
|
},
|
|
"source": [
|
|
"This is a lot shorter and is easier to read as well! We made use of the ",
|
|
"fact that matchers handles instance checking for us in a safe way. We also ",
|
|
"made use of the fact that matchers allows us to concisely express multiple ",
|
|
"match options with the use of Python's or operator. We can also see that ",
|
|
"it still works on our previous examples:\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"better_is_call_with_booleans(call_1)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"better_is_call_with_booleans(call_2)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"raw_mimetype": "text/restructuredtext"
|
|
},
|
|
"source": [
|
|
"We still have one more trick up our sleeve though. Matchers don't just ",
|
|
"allow us to specify which attributes we want to match on exactly. It ",
|
|
"also allows us to specify rules for matching sequences of nodes, like ",
|
|
"the list of :class:`~libcst.Arg` nodes that appears in :class:`~libcst.Call`. ",
|
|
"Let's make use of that, turning our original ``is_call_with_booleans`` ",
|
|
"function into a call to :func:`~libcst.matchers.matches`:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"def best_is_call_with_booleans(node: cst.Call) -> bool:\n",
|
|
" return m.matches(\n",
|
|
" node,\n",
|
|
" m.Call(\n",
|
|
" args=(\n",
|
|
" m.ZeroOrMore(m.Arg(m.Name(\"True\") | m.Name(\"False\"))),\n",
|
|
" ),\n",
|
|
" ),\n",
|
|
" )\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"raw_mimetype": "text/restructuredtext"
|
|
},
|
|
"source": [
|
|
"We've turned our original function into a single call to ",
|
|
":func:`~libcst.matchers.matches`. As an added benefit, the match node can ",
|
|
"be read from left to right in a way that makes sense in english: \"match ",
|
|
"any call with zero or more arguments that are the literal ``True`` or ",
|
|
"``False``\". As we can see, it works as intended: "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"best_is_call_with_booleans(call_1)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"best_is_call_with_booleans(call_2)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"raw_mimetype": "text/restructuredtext"
|
|
},
|
|
"source": [
|
|
"Matcher Decorators\n",
|
|
"==================\n",
|
|
"You can already do a lot with just :func:`~libcst.matchers.matches`. It ",
|
|
"lets you define the shape of nodes you want to match and LibCST takes ",
|
|
"care of the rest. However, you still need to include a lot of boilerplate ",
|
|
"into your :ref:`libcst-visitors` in order to identify which nodes you care ",
|
|
"about. Matcher :ref:`libcst-matcher-decorators` help reduce that boilerplate.\n",
|
|
"\n",
|
|
"Say you wanted to invert the boolean literals in functions which ",
|
|
"match the above ``best_is_call_with_booleans``. You could build something ",
|
|
"that looks like the following:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"class BoolInverter(cst.CSTTransformer):\n",
|
|
" def __init__(self) -> None:\n",
|
|
" self.in_call: int = 0\n",
|
|
"\n",
|
|
" def visit_Call(self, node: cst.Call) -> None:\n",
|
|
" if m.matches(node, m.Call(args=(\n",
|
|
" m.ZeroOrMore(m.Arg(m.Name(\"True\") | m.Name(\"False\"))),\n",
|
|
" ))):\n",
|
|
" self.in_call += 1\n",
|
|
"\n",
|
|
" def leave_Call(self, original_node: cst.Call, updated_node: cst.Call) -> cst.Call:\n",
|
|
" if m.matches(original_node, m.Call(args=(\n",
|
|
" m.ZeroOrMore(m.Arg(m.Name(\"True\") | m.Name(\"False\"))),\n",
|
|
" ))):\n",
|
|
" self.in_call -= 1\n",
|
|
" return updated_node\n",
|
|
"\n",
|
|
" def leave_Name(self, original_node: cst.Name, updated_node: cst.Name) -> cst.Name:\n",
|
|
" if self.in_call > 0:\n",
|
|
" if updated_node.value == \"True\":\n",
|
|
" return updated_node.with_changes(value=\"False\")\n",
|
|
" if updated_node.value == \"False\":\n",
|
|
" return updated_node.with_changes(value=\"True\")\n",
|
|
" return updated_node\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"raw_mimetype": "text/restructuredtext"
|
|
},
|
|
"source": [
|
|
"We can try it out on a source file to see that it works:\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"source = \"def some_func(*params: object) -> None:\\n pass\\n\\nsome_func(True, False)\\nsome_func(1, 2, 3)\\nsome_func()\\n\"\n",
|
|
"module = cst.parse_module(source)\n",
|
|
"print(source)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"new_module = module.visit(BoolInverter())\n",
|
|
"print(new_module.code)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"raw_mimetype": "text/restructuredtext"
|
|
},
|
|
"source": [
|
|
"While this works its not super elegant. We have to track where we are in ",
|
|
"the tree so we know when its safe to invert boolean literals which means ",
|
|
"we have to create a constructor and we have to duplicate matching logic. ",
|
|
"We could refactor that into a helper like the ``best_is_call_with_booleans`` ",
|
|
"above, but it only makes things so much better.\n",
|
|
"\n",
|
|
"So, let's try rewriting it with matcher decorators instead. Note that this ",
|
|
"includes changing the class we inherit from to ",
|
|
":class:`~libcst.matchers.MatcherDecoratableTransformer` in order to enable ",
|
|
"the matcher decorator feature:\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"class BetterBoolInverter(m.MatcherDecoratableTransformer):\n",
|
|
" @m.call_if_inside(m.Call(args=(\n",
|
|
" m.ZeroOrMore(m.Arg(m.Name(\"True\") | m.Name(\"False\"))),\n",
|
|
" )))\n",
|
|
" def leave_Name(self, original_node: cst.Name, updated_node: cst.Name) -> cst.Name:\n",
|
|
" if updated_node.value == \"True\":\n",
|
|
" return updated_node.with_changes(value=\"False\")\n",
|
|
" if updated_node.value == \"False\":\n",
|
|
" return updated_node.with_changes(value=\"True\")\n",
|
|
" return updated_node\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"new_module = module.visit(BetterBoolInverter())\n",
|
|
"print(new_module.code)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"raw_mimetype": "text/restructuredtext"
|
|
},
|
|
"source": [
|
|
"Using matcher decorators we successfully removed all of the boilerplate ",
|
|
"around state tracking! The only thing that ``leave_Name`` needs to concern ",
|
|
"itself with is the actual business logic of the transform. However, it ",
|
|
"still needs to check to see if the value of the node should be inverted. ",
|
|
"This is because the ``Call.func`` is a :class:`~libcst.Name` in this case. ",
|
|
"Let's use another matcher decorator to make that problem go away:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"class BestBoolInverter(m.MatcherDecoratableTransformer):\n",
|
|
" @m.call_if_inside(m.Call(args=(\n",
|
|
" m.ZeroOrMore(m.Arg(m.Name(\"True\") | m.Name(\"False\"))),\n",
|
|
" )))\n",
|
|
" @m.leave(m.Name(\"True\") | m.Name(\"False\"))\n",
|
|
" def invert_bool_literal(self, original_node: cst.Name, updated_node: cst.Name) -> cst.Name:\n",
|
|
" return updated_node.with_changes(value=\"False\" if updated_node.value == \"True\" else \"True\")\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"new_module = module.visit(BestBoolInverter())\n",
|
|
"print(new_module.code)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "raw",
|
|
"metadata": {
|
|
"raw_mimetype": "text/restructuredtext"
|
|
},
|
|
"source": [
|
|
"That's it! Instead of using a ``leave_Name`` which modifies all ",
|
|
":class:`~libcst.Name` nodes we instead created a matcher visitor that ",
|
|
"only modifies :class:`~libcst.Name` nodes with the value of ``True`` or ",
|
|
"``False``. We decorate *that* with :func:`~libcst.matchers.call_if_inside` ",
|
|
"to ensure we run this on :class:`~libcst.Name` nodes found inside of ",
|
|
"function calls that only take boolean literals. Using two matcher ",
|
|
"decorators we got rid of all of the state management as well as all of the ",
|
|
"cases where we needed to handle nodes we weren't interested in."
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"celltoolbar": "Edit Metadata",
|
|
"kernelspec": {
|
|
"display_name": "Python 3",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.7.3"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 2
|
|
}
|