ruff/crates/ruff_python_formatter/src/expression
Dhruv Manilawala cdac90ef68
New AST nodes for f-string elements (#8835)
Rebase of #6365 authored by @davidszotten.

## Summary

This PR updates the AST structure for f-string elements.

The main **motivation** behind this change is to have a dedicated node
for the string part of an f-string. Previously, the existing
`ExprStringLiteral` node was used for this purpose, which isn't quite
correct: the range of an `ExprStringLiteral` node should include the
quotes, but the literal element of an f-string doesn't include them
because it's only a specific part within the f-string. For example,

```python
f"foo {x}"
# ^^^^
# This is the literal part of an f-string
```

The new `FStringElement` enum represents either the literal part or the
expression part of an f-string.
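The shape described above can be sketched roughly as follows. This is a simplified, self-contained illustration and not ruff's actual definitions: the struct names `LiteralElement` and `ExpressionElement`, their fields, the `range` helper, and `example_elements` are all assumptions made for the example; only `FStringElement` and `is_literal` are names taken from this PR. Ranges are plain byte offsets into the source.

```rust
use std::ops::Range;

/// Literal text inside an f-string. Unlike a standalone string literal,
/// its range does NOT include the surrounding quotes. (Hypothetical struct.)
#[derive(Debug, Clone, PartialEq)]
pub struct LiteralElement {
    pub range: Range<usize>,
    pub value: String,
}

/// A replacement field such as `{x}`. (Hypothetical struct; the expression
/// is kept as source text here to keep the sketch small.)
#[derive(Debug, Clone, PartialEq)]
pub struct ExpressionElement {
    pub range: Range<usize>,
    pub expression: String,
}

/// Either the literal part or the expression part of an f-string.
#[derive(Debug, Clone, PartialEq)]
pub enum FStringElement {
    Literal(LiteralElement),
    Expression(ExpressionElement),
}

impl FStringElement {
    /// Returns `true` for the literal part of an f-string.
    pub fn is_literal(&self) -> bool {
        matches!(self, FStringElement::Literal(_))
    }

    /// Byte range of the element in the source text.
    pub fn range(&self) -> Range<usize> {
        match self {
            FStringElement::Literal(element) => element.range.clone(),
            FStringElement::Expression(element) => element.range.clone(),
        }
    }
}

/// Hand-built elements for the source text `f"foo {x}"`.
pub fn example_elements() -> Vec<FStringElement> {
    vec![
        FStringElement::Literal(LiteralElement {
            range: 2..6, // `foo ` with its trailing space, quotes excluded
            value: "foo ".to_string(),
        }),
        FStringElement::Expression(ExpressionElement {
            range: 6..9, // `{x}`
            expression: "x".to_string(),
        }),
    ]
}
```

Slicing the source `f"foo {x}"` with the first element's range yields only `foo `, which is the distinction from a standalone string literal whose range covers the quotes as well.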

### Rule Updates

This means that there are now two nodes representing a string, depending
on the context: one for a normal string literal and another for a string
literal within an f-string. The AST checker is updated to accommodate
this change. The rules which work on string literals are updated to
check the literal part of an f-string as well.
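As a rough illustration of what checking both kinds of node means for a rule, the sketch below collects literal text from either a plain string literal or the literal elements of an f-string. Everything here is a simplified stand-in: the `Expr` and `FStringElement` types are reduced to string payloads, and `literal_fragments` and `contains_todo` are hypothetical helpers, not functions from ruff.

```rust
/// Simplified stand-in for the f-string element node (assumption).
pub enum FStringElement {
    Literal(String),
    Expression(String),
}

/// Simplified stand-in for the expression enum (assumption).
pub enum Expr {
    StringLiteral(String),
    FString(Vec<FStringElement>),
    Name(String),
}

/// Collects the literal text a string-based rule should look at:
/// the whole value of a plain string literal, or only the literal
/// elements of an f-string. Replacement fields are skipped.
pub fn literal_fragments(expr: &Expr) -> Vec<&str> {
    match expr {
        Expr::StringLiteral(value) => vec![value.as_str()],
        Expr::FString(elements) => elements
            .iter()
            .filter_map(|element| match element {
                FStringElement::Literal(text) => Some(text.as_str()),
                FStringElement::Expression(_) => None,
            })
            .collect(),
        Expr::Name(_) => Vec::new(),
    }
}

/// Hypothetical rule: flag literal text that contains "TODO".
pub fn contains_todo(expr: &Expr) -> bool {
    literal_fragments(expr)
        .iter()
        .any(|text| text.contains("TODO"))
}
```

With this shape, a rule written against the fragments handles `"TODO: fix"` and `f"TODO {x}"` the same way, while text inside a replacement field is never treated as literal content.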

#### Notes

1. The `Expr::is_literal_expr` method checks for `ExprStringLiteral` and
returns true if so. Now that the literal part of an f-string is no
longer represented by that node, the method's behavior improves: it is
confined to actual expressions. For the f-string case, we do have the
`FStringElement::is_literal` method.
2. We avoid checking whether we're in an f-string context before adding
to `string_type_definitions` because the f-string literal is now a
dedicated node and not part of `Expr`.
3. Annotations cannot use f-strings, so we avoid changing any rules which
work on annotations and check for `ExprStringLiteral`.

## Test Plan

- All references of `Expr::StringLiteral` were checked to see if any of
the rules require updating to account for the f-string literal element
node.
- New test cases are added for rules which check against the literal
part of an f-string.
- Check the ecosystem results and ensure they remain unchanged.

## Performance

There's a performance penalty in the parser. The reason for this remains
unknown, though the generated assembly code for the `__reduce154`
function now appears to be different. The reduce function body just pops
the `ParenthesizedExpr` on top of the stack and pushes it back with the
new location.

- The size of `FStringElement` enum is the same as `Expr` which is what
it replaces in `FString::format_spec`
- The size of `FStringExpressionElement` is the same as
`ExprFormattedValue` which is what it replaces

I tried reducing the `Expr` enum from 80 bytes to 72 bytes but it hardly
resulted in any performance gain. The difference can be seen here:
- Original profile: https://share.firefox.dev/3Taa7ES
- Profile after boxing some node fields:
https://share.firefox.dev/3GsNXpD
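For context on the boxing experiment, here is a minimal sketch of how boxing a large variant payload shrinks an enum and how the sizes can be compared with `std::mem::size_of`. The types `BigPayload`, `Inline`, and `Boxed` and the helper `sizes` are hypothetical and unrelated to ruff's real nodes; the exact byte counts depend on the target platform, so the sketch only compares them.

```rust
use std::mem::size_of;

/// A deliberately large payload (hypothetical).
pub struct BigPayload {
    pub values: [u64; 8],
}

/// Stores the large payload inline, so every value of the enum is at
/// least as large as `BigPayload`, even when it holds `Small`.
pub enum Inline {
    Small(u32),
    Big(BigPayload),
}

/// Stores the large payload behind a pointer, so the enum only needs
/// room for a `Box` plus the discriminant.
pub enum Boxed {
    Small(u32),
    Big(Box<BigPayload>),
}

/// Returns `(size of Inline, size of Boxed)` in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<Inline>(), size_of::<Boxed>())
}
```

The trade-off is an extra heap allocation and pointer indirection for the boxed variant, which may be one reason a smaller enum does not automatically translate into a faster parser.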

### Backtracking

I tried backtracking the changes to see if any isolated change
produced this regression. The problem here is that the overall change is
so small that there's only a single checkpoint where I can backtrack, and
that checkpoint results in the same regression. This checkpoint is to
revert to using `Expr` for the `FString::format_spec` field. After this
point, the change would revert back to the original implementation.

## Review process

The review process is similar to #7927. The first set of commits updates
the node structure, parser, and related AST files. Further commits then
update the linter and formatter to account for the AST change.

---------

Co-authored-by: David Szotten <davidszotten@gmail.com>
2023-12-07 10:28:05 -06:00
string New AST nodes for f-string elements (#8835) 2023-12-07 10:28:05 -06:00
binary_like.rs Create dedicated is_*_enabled functions for each preview style (#8988) 2023-12-04 05:38:54 +00:00
expr_attribute.rs Delete redundant branch in NeedsParentheses (#8377) 2023-10-31 12:06:17 +00:00
expr_await.rs Fix instability with await fluent style (#8676) 2023-11-17 12:24:19 -05:00
expr_bin_op.rs Update string nodes for implicit concatenation (#7927) 2023-11-24 17:55:41 -06:00
expr_bool_op.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_boolean_literal.rs Split Constant to individual literal nodes (#8064) 2023-10-30 12:13:23 +05:30
expr_bytes_literal.rs Update string nodes for implicit concatenation (#7927) 2023-11-24 17:55:41 -06:00
expr_call.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_compare.rs Update string nodes for implicit concatenation (#7927) 2023-11-24 17:55:41 -06:00
expr_dict.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_dict_comp.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_ellipsis_literal.rs Split Constant to individual literal nodes (#8064) 2023-10-30 12:13:23 +05:30
expr_f_string.rs Update string nodes for implicit concatenation (#7927) 2023-11-24 17:55:41 -06:00
expr_formatted_value.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_generator_exp.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_if_exp.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_ipy_escape_command.rs Formatter parentheses support for IpyEscapeCommand (#8207) 2023-10-25 14:01:50 +00:00
expr_lambda.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_list.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_list_comp.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_name.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_named_expr.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_none_literal.rs Split Constant to individual literal nodes (#8064) 2023-10-30 12:13:23 +05:30
expr_number_literal.rs Inline ExprNumberLiteral formatting logic (#8340) 2023-10-30 14:09:38 +05:30
expr_set.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_set_comp.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_slice.rs Split Constant to individual literal nodes (#8064) 2023-10-30 12:13:23 +05:30
expr_starred.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_string_literal.rs Update string nodes for implicit concatenation (#7927) 2023-11-24 17:55:41 -06:00
expr_subscript.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_tuple.rs Split tuples in return positions by comma first (#8280) 2023-10-30 00:25:44 +00:00
expr_unary_op.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
expr_yield.rs Avoid parenthesizing unsplittable because of comments (#8431) 2023-11-03 05:12:59 +00:00
expr_yield_from.rs Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030) 2023-10-18 00:01:18 +00:00
mod.rs New AST nodes for f-string elements (#8835) 2023-12-07 10:28:05 -06:00
operator.rs Split implicit concatenated strings before binary expressions (#7145) 2023-09-08 06:51:26 +00:00
parentheses.rs Implement the fix_power_op_line_length preview style (#8947) 2023-12-02 09:35:34 +09:00