Auto generate visit_source_order (#17180)

## Summary

Part of #15655

I tried generating the `visit_source_order` functions with code
generation. A simple approach covers most of them, but it is not enough
to generate all of them.

The good news is that most of the implementations can be generated this
way; only a few cannot. One benefit of this PR is that it eliminates a
lot of handwritten code, so changing the AST structure will leave only
a few places to fix by hand.

The `source_order` field determines whether a node requires a source
order implementation. If it’s empty, the source order visitor does not
visit anything for that node.

Initially I didn’t want to repeat the field names, but I found two
things:
- `ExprIf`, unlike other nodes, does not have its fields defined in
source order. In addition, some fields do not need to be included in the
visit. So we need a way to determine both order and presence.
- Relying on the field definitions alone sounds more complicated to me.
Maybe another solution is to add a new `order` attribute to each field?
I'm open to suggestions.

In any case, except for `ExprIf`, we don't need to write the field
names in order. Just knowing which fields must be visited is enough.

Some nodes had a more complex visitor:

`ExprCompare` required zipping two fields.

`ExprBoolOp` required a match over the fields.

`FStringValue` required a match. I created a new `walk_` function that
does the match and used it in code generation. I don’t think this
provides real value, because I mostly moved the code from one file to
another. I tried it as an option, but I would prefer to leave the code
as it was before.
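
As an illustration of why `ExprCompare` does not fit a field-by-field generator, the sketch below models the zipped traversal in Python. The `Compare` class is a stand-in that mirrors the `left`, `ops`, and `comparators` fields of Python's `ast.Compare`; it is not the Rust visitor from this PR, and the function name is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Compare:
    """Stand-in for a comparison node such as `a < b <= c`."""

    left: str
    ops: list[str]
    comparators: list[str]


def compare_source_order(node: Compare) -> list[str]:
    """List the parts of a comparison in the order they appear in source.

    Visiting `ops` and then `comparators` as two separate loops would give
    the wrong order, so the two fields are walked together with `zip`.
    """
    visited = [node.left]
    for op, comparator in zip(node.ops, node.comparators):
        visited.append(op)
        visited.append(comparator)
    return visited


order = compare_source_order(Compare("a", ["<", "<="], ["b", "c"]))
print(order)
```

Because each operator sits between two operands in the source text, the two fields have to be interleaved, which a simple "visit each field in turn" template cannot express.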

Some visitors visit a slice of items; others visit a single element. I
added a check in code generation to decide whether a field requires a
`for` loop. I think a better approach is to have a consistent style, so
that by default we loop over any field that is a sequence.

The field types `StringLiteralValue` and `BytesLiteralValue` are not
sequences in the TOML definition, but they implement `iter`, so they
are iterated over. Code generation cannot identify this from the type
alone, so the code checks for these types explicitly.
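
The loop check and the special case for the two value types could look roughly like the Python sketch below. It relies only on the type notation documented in the TOML header (`*` for a sequence, `?` for an optional); the function names and the returned labels are hypothetical. Note that in the diff shown in this PR, `ExprStringLiteral` and `ExprBytesLiteral` are marked `custom_source_order = true`, so this sketch follows the description above rather than that final marking.

```python
# Types that are iterable even though their TOML type has no `*` suffix.
ITERABLE_VALUE_TYPES = {"StringLiteralValue", "BytesLiteralValue"}


def needs_loop(type_name: str) -> bool:
    """Return True if the generated visitor should loop over the field."""
    return type_name.endswith("*") or type_name in ITERABLE_VALUE_TYPES


def visit_shape(field: dict) -> str:
    """Classify how a field would be visited: 'loop', 'optional' or 'single'."""
    type_name = field["type"]
    if needs_loop(type_name):
        return "loop"
    if type_name.endswith("?"):
        return "optional"
    return "single"


shapes = {
    type_name: visit_shape({"name": "example", "type": type_name})
    for type_name in ["Expr*", "&Expr*", "Expr?", "Expr", "StringLiteralValue"]
}
print(shapes)
```

Keeping the exceptions in one explicit set makes it easy to see which types bypass the suffix rule, at the cost of having to update the set when a new iterable value type is added.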

## Test Plan

All tests should pass without any changes.
I checked the generated code to make sure it is the same as the old
code. I'm not sure whether there is a test for the source order visitor.
Shaygan Hooshyari 2025-04-17 14:59:57 +02:00 committed by GitHub
parent bd89838212
commit 3ada36b766
5 changed files with 1048 additions and 886 deletions

@ -37,6 +37,10 @@
# derives:
# List of derives to add to the syntax node struct. Clone, Debug, PartialEq are added by default.
#
# custom_source_order:
# A boolean that specifies if this node has a custom source order visitor implementation.
# generation of visit_source_order will be skipped for this node.
#
# fields:
# List of fields in the syntax node struct. Each field is a table with the
# following keys:
@ -48,6 +52,10 @@
# * `Expr*` - A vector of Expr.
# * `&Expr*` - A boxed slice of Expr.
# These properties cannot be nested, for example we cannot create a vector of option types.
# * is_annotation - If this field is a type annotation.
#
# source_order:
# Defines in what order the fields appear in source
#
# variant:
# The name of the enum variant for this syntax node. Defaults to the node
@ -57,9 +65,13 @@
anynode_is_label = "module"
doc = "See also [mod](https://docs.python.org/3/library/ast.html#ast.mod)"
[Mod.nodes]
ModModule = {}
ModExpression = {}
[Mod.nodes.ModModule]
doc = "See also [Module](https://docs.python.org/3/library/ast.html#ast.Module)"
fields = [{ name = "body", type = "Stmt*" }]
[Mod.nodes.ModExpression]
doc = "See also [Module](https://docs.python.org/3/library/ast.html#ast.Module)"
fields = [{ name = "body", type = "Box<Expr>" }]
[Stmt]
add_suffix_to_is_methods = true
@ -77,8 +89,7 @@ fields = [
{ name = "name", type = "Identifier" },
{ name = "type_params", type = "Box<crate::TypeParams>?" },
{ name = "parameters", type = "Box<crate::Parameters>" },
-{ name = "returns", type = "Expr?" },
+{ name = "returns", type = "Expr?", is_annotation = true },
{ name = "body", type = "Stmt*" },
]
@ -127,7 +138,7 @@ fields = [
doc = "See also [AnnAssign](https://docs.python.org/3/library/ast.html#ast.AnnAssign)"
fields = [
{ name = "target", type = "Expr" },
-{ name = "annotation", type = "Expr" },
+{ name = "annotation", type = "Expr", is_annotation = true },
{ name = "value", type = "Expr?" },
{ name = "simple", type = "bool" },
]
@ -305,6 +316,7 @@ doc = "See also [expr](https://docs.python.org/3/library/ast.html#ast.expr)"
[Expr.nodes.ExprBoolOp]
doc = "See also [BoolOp](https://docs.python.org/3/library/ast.html#ast.BoolOp)"
fields = [{ name = "op", type = "BoolOp" }, { name = "values", type = "Expr*" }]
custom_source_order = true
[Expr.nodes.ExprNamed]
doc = "See also [NamedExpr](https://docs.python.org/3/library/ast.html#ast.NamedExpr)"
@ -339,10 +351,12 @@ fields = [
{ name = "body", type = "Expr" },
{ name = "orelse", type = "Expr" },
]
source_order = ["body", "test", "orelse"]
[Expr.nodes.ExprDict]
doc = "See also [Dict](https://docs.python.org/3/library/ast.html#ast.Dict)"
fields = [{ name = "items", type = "DictItem*" }]
custom_source_order = true
[Expr.nodes.ExprSet]
doc = "See also [Set](https://docs.python.org/3/library/ast.html#ast.Set)"
@ -397,6 +411,8 @@ fields = [
{ name = "ops", type = "&CmpOp*" },
{ name = "comparators", type = "&Expr*" },
]
# The fields must be visited simultaneously
custom_source_order = true
[Expr.nodes.ExprCall]
doc = "See also [Call](https://docs.python.org/3/library/ast.html#ast.Call)"
@ -415,16 +431,21 @@ it keeps them separate and provide various methods to access the parts.
See also [JoinedStr](https://docs.python.org/3/library/ast.html#ast.JoinedStr)"""
fields = [{ name = "value", type = "FStringValue" }]
custom_source_order = true
[Expr.nodes.ExprStringLiteral]
doc = """An AST node that represents either a single-part string literal
or an implicitly concatenated string literal."""
fields = [{ name = "value", type = "StringLiteralValue" }]
# Because StringLiteralValue type is an iterator and it's not clear from the type
custom_source_order = true
[Expr.nodes.ExprBytesLiteral]
doc = """An AST node that represents either a single-part bytestring literal
or an implicitly concatenated bytestring literal."""
fields = [{ name = "value", type = "BytesLiteralValue" }]
# Because BytesLiteralValue type is an iterator and it's not clear from the type
custom_source_order = true
[Expr.nodes.ExprNumberLiteral]
fields = [{ name = "value", type = "Number" }]