Preview minimal f-string formatting (#9642)

## Summary

_This is preview only feature and is available using the `--preview`
command-line flag._

With the implementation of [PEP 701] in Python 3.12, f-strings can now
be broken into multiple lines, can contain comments, and can re-use the
same quote character. Currently, no other Python formatter formats the
f-strings so there's some discussion which needs to happen in defining
the style used for f-string formatting. Relevant discussion:
https://github.com/astral-sh/ruff/discussions/9785

The goal for this PR is to add minimal support for f-string formatting.
This would be to format expression within the replacement field without
introducing any major style changes.

### Newlines

The heuristics for adding newline is similar to that of
[Prettier](https://prettier.io/docs/en/next/rationale.html#template-literals)
where the formatter would only split an expression in the replacement
field across multiple lines if there was already a line break within the
replacement field.

In other words, the formatter would not add any newlines unless they
were already present i.e., they were added by the user. This makes
breaking any expression inside an f-string optional and in control of
the user. For example,

```python
# We wouldn't break this
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"

# But, we would break the following as there's already a newline
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
	aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
```


If there are comments in any of the replacement field of the f-string,
then it will always be a multi-line f-string in which case the formatter
would prefer to break expressions i.e., introduce newlines. For example,

```python
x = f"{ # comment
    a }"
```

### Quotes

The logic for formatting quotes remains unchanged. The existing logic is
used to determine the necessary quote char and is used accordingly.

Now, if the expression inside an f-string is itself a string like, then
we need to make sure to preserve the existing quote and not change it to
the preferred quote unless it's 3.12. For example,

```python
f"outer {'inner'} outer"

# For pre 3.12, preserve the single quote
f"outer {'inner'} outer"

# While for 3.12 and later, the quotes can be changed
f"outer {"inner"} outer"
```

But, for triple-quoted strings, we can re-use the same quote char unless
the inner string is itself a triple-quoted string.

```python
f"""outer {"inner"} outer"""  # valid
f"""outer {'''inner'''} outer"""  # preserve the single quote char for the inner string
```

### Debug expressions

If debug expressions are present in the replacement field of a f-string,
then the whitespace needs to be preserved as they will be rendered as it
is (for example, `f"{ x = }"`. If there are any nested f-strings, then
the whitespace in them needs to be preserved as well which means that
we'll stop formatting the f-string as soon as we encounter a debug
expression.

```python
f"outer {   x =  !s  :.3f}"
#                  ^^
#                  We can remove these whitespaces
```

Now, the whitespace doesn't need to be preserved around conversion spec
and format specifiers, so we'll format them as usual but we won't be
formatting any nested f-string within the format specifier.

### Miscellaneous

- The
[`hug_parens_with_braces_and_square_brackets`](https://github.com/astral-sh/ruff/issues/8279)
preview style isn't implemented w.r.t. the f-string curly braces.
- The
[indentation](https://github.com/astral-sh/ruff/discussions/9785#discussioncomment-8470590)
is always relative to the f-string containing statement

## Test Plan

* Add new test cases
* Review existing snapshot changes
* Review the ecosystem changes

[PEP 701]: https://peps.python.org/pep-0701/
This commit is contained in:
Dhruv Manilawala 2024-02-16 20:28:11 +05:30 committed by GitHub
parent c47ff658e4
commit 72bf1c2880
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
20 changed files with 1972 additions and 55 deletions

View file

@ -0,0 +1,8 @@
[
{
"preview": "enabled"
},
{
"preview": "disabled"
}
]

View file

@ -30,22 +30,22 @@ result_f = (
# an expression inside a formatted value
(
f'{1}'
# comment
# comment 1
''
)
(
f'{1}' # comment
f'{1}' # comment 2
f'{2}'
)
(
f'{1}'
f'{2}' # comment
f'{2}' # comment 3
)
(
1, ( # comment
1, ( # comment 4
f'{2}'
)
)
@ -53,7 +53,7 @@ result_f = (
(
(
f'{1}'
# comment
# comment 5
),
2
)
@ -62,3 +62,221 @@ result_f = (
x = f'''a{""}b'''
y = f'''c{1}d"""e'''
z = f'''a{""}b''' f'''c{1}d"""e'''
# F-String formatting test cases (Preview)
# Simple expression with a mix of debug expression and comments.
x = f"{a}"
x = f"{
a = }"
x = f"{ # comment 6
a }"
x = f"{ # comment 7
a = }"
# Remove the parentheses as adding them doesn't make then fit within the line length limit.
# This is similar to how we format it before f-string formatting.
aaaaaaaaaaa = (
f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc + dddddddd } cccccccccc"
)
# Here, we would use the best fit layout to put the f-string indented on the next line
# similar to the next example.
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
aaaaaaaaaaa = (
f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
)
# This should never add the optional parentheses because even after adding them, the
# f-string exceeds the line length limit.
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { "bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" } ccccccccccccccc"
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { "bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" = } ccccccccccccccc"
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { # comment 8
"bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" } ccccccccccccccc"
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { # comment 9
"bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" = } ccccccccccccccc"
# Multiple larger expressions which exceeds the line length limit. Here, we need to decide
# whether to split at the first or second expression. This should work similarly to the
# assignment statement formatting where we split from right to left in preview mode.
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb } cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"
# The above example won't split but when we start introducing line breaks:
x = f"aaaaaaaaaaaa {
bbbbbbbbbbbbbb } cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb
} cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb } cccccccccccccccccccc {
ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb } cccccccccccccccccccc { ddddddddddddddd
} eeeeeeeeeeeeee"
# But, in case comments are present, we would split at the expression containing the
# comments:
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb # comment 10
} cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb
} cccccccccccccccccccc { # comment 11
ddddddddddddddd } eeeeeeeeeeeeee"
# Here, the expression part itself starts with a curly brace so we need to add an extra
# space between the opening curly brace and the expression.
x = f"{ {'x': 1, 'y': 2} }"
# Although the extra space isn't required before the ending curly brace, we add it for
# consistency.
x = f"{ {'x': 1, 'y': 2}}"
x = f"{ {'x': 1, 'y': 2} = }"
x = f"{ # comment 12
{'x': 1, 'y': 2} }"
x = f"{ # comment 13
{'x': 1, 'y': 2} = }"
# But, in this case, we would split the expression itself because it exceeds the line
# length limit so we need not add the extra space.
xxxxxxx = f"{
{'aaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbb', 'ccccccccccccccccccccc'}
}"
# And, split the expression itself because it exceeds the line length.
xxxxxxx = f"{
{'aaaaaaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbbb', 'cccccccccccccccccccccccccc'}
}"
# Quotes
f"foo 'bar' {x}"
f"foo \"bar\" {x}"
f'foo "bar" {x}'
f'foo \'bar\' {x}'
f"foo {"bar"}"
f"foo {'\'bar\''}"
# Here, the formatter will remove the escapes which is correct because they aren't allowed
# pre 3.12. This means we can assume that the f-string is used in the context of 3.12.
f"foo {'\"bar\"'}"
# Triple-quoted strings
# It's ok to use the same quote char for the inner string if it's single-quoted.
f"""test {'inner'}"""
f"""test {"inner"}"""
# But if the inner string is also triple-quoted then we should preserve the existing quotes.
f"""test {'''inner'''}"""
# Magic trailing comma
#
# The expression formatting will result in breaking it across multiple lines with a
# trailing comma but as the expression isn't already broken, we will remove all the line
# breaks which results in the trailing comma being present. This test case makes sure
# that the trailing comma is removed as well.
f"aaaaaaa {['aaaaaaaaaaaaaaa', 'bbbbbbbbbbbbb', 'ccccccccccccccccc', 'ddddddddddddddd', 'eeeeeeeeeeeeee']} aaaaaaa"
# And, if the trailing comma is already present, we still need to remove it.
f"aaaaaaa {['aaaaaaaaaaaaaaa', 'bbbbbbbbbbbbb', 'ccccccccccccccccc', 'ddddddddddddddd', 'eeeeeeeeeeeeee',]} aaaaaaa"
# Keep this Multiline by breaking it at the square brackets.
f"""aaaaaa {[
xxxxxxxx,
yyyyyyyy,
]} ccc"""
# Add the magic trailing comma because the elements don't fit within the line length limit
# when collapsed.
f"aaaaaa {[
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
yyyyyyyyyyyy
]} ccccccc"
# Remove the parenthese because they aren't required
xxxxxxxxxxxxxxx = (
f"aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbb {
xxxxxxxxxxx # comment 14
+ yyyyyyyyyy
} dddddddddd"
)
# Comments
# No comments should be dropped!
f"{ # comment 15
# comment 16
foo # comment 17
# comment 18
}" # comment 19
# comment 20
# Conversion flags
#
# This is not a valid Python code because of the additional whitespace between the `!`
# and conversion type. But, our parser isn't strict about this. This should probably be
# removed once we have a strict parser.
x = f"aaaaaaaaa { x ! r }"
# Even in the case of debug expresions, we only need to preserve the whitespace within
# the expression part of the replacement field.
x = f"aaaaaaaaa { x = ! r }"
# Combine conversion flags with format specifiers
x = f"{x = ! s
:>0
}"
# This is interesting. There can be a comment after the format specifier but only if it's
# on it's own line. Refer to https://github.com/astral-sh/ruff/pull/7787 for more details.
# We'll format is as trailing comments.
x = f"{x !s
:>0
# comment 21
}"
x = f"""
{ # comment 22
x = :.0{y # comment 23
}f}"""
# Here, the debug expression is in a nested f-string so we should start preserving
# whitespaces from that point onwards. This means we should format the outer f-string.
x = f"""{"foo " + # comment 24
f"{ x =
}" # comment 25
}
"""
# Mix of various features.
f"{ # comment 26
foo # after foo
:>{
x # after x
}
# comment 27
# comment 28
} woah {x}"
# Indentation
# What should be the indentation?
# https://github.com/astral-sh/ruff/discussions/9785#discussioncomment-8470590
if indent0:
if indent1:
if indent2:
foo = f"""hello world
hello {
f"aaaaaaa {
[
'aaaaaaaaaaaaaaaaaaaaa',
'bbbbbbbbbbbbbbbbbbbbb',
'ccccccccccccccccccccc',
'ddddddddddddddddddddd'
]
} bbbbbbbb" +
[
'aaaaaaaaaaaaaaaaaaaaa',
'bbbbbbbbbbbbbbbbbbbbb',
'ccccccccccccccccccccc',
'ddddddddddddddddddddd'
]
} --------
"""

View file

@ -0,0 +1,5 @@
[
{
"target_version": "py312"
}
]

View file

@ -0,0 +1,6 @@
# This file contains test cases only for cases where the logic tests for whether
# the target version is 3.12 or later. A user can have 3.12 syntax even if the target
# version isn't set.
# Quotes re-use
f"{'a'}"

View file

@ -1,7 +1,7 @@
use ruff_formatter::{write, Argument, Arguments};
use ruff_text_size::{Ranged, TextRange, TextSize};
use crate::context::{NodeLevel, WithNodeLevel};
use crate::context::{FStringState, NodeLevel, WithNodeLevel};
use crate::other::commas::has_magic_trailing_comma;
use crate::prelude::*;
@ -206,6 +206,16 @@ impl<'fmt, 'ast, 'buf> JoinCommaSeparatedBuilder<'fmt, 'ast, 'buf> {
pub(crate) fn finish(&mut self) -> FormatResult<()> {
self.result.and_then(|()| {
// If the formatter is inside an f-string expression element, and the layout
// is flat, then we don't need to add a trailing comma.
if let FStringState::InsideExpressionElement(context) =
self.fmt.context().f_string_state()
{
if context.layout().is_flat() {
return Ok(());
}
}
if let Some(last_end) = self.entries.position() {
let magic_trailing_comma = has_magic_trailing_comma(
TextRange::new(last_end, self.sequence_end),

View file

@ -289,6 +289,28 @@ fn handle_enclosed_comment<'a>(
}
}
AnyNodeRef::FString(fstring) => CommentPlacement::dangling(fstring, comment),
AnyNodeRef::FStringExpressionElement(_) => {
// Handle comments after the format specifier (should be rare):
//
// ```python
// f"literal {
// expr:.3f
// # comment
// }"
// ```
//
// This is a valid comment placement.
if matches!(
comment.preceding_node(),
Some(
AnyNodeRef::FStringExpressionElement(_) | AnyNodeRef::FStringLiteralElement(_)
)
) {
CommentPlacement::trailing(comment.enclosing_node(), comment)
} else {
handle_bracketed_end_of_line_comment(comment, locator)
}
}
AnyNodeRef::ExprList(_)
| AnyNodeRef::ExprSet(_)
| AnyNodeRef::ExprListComp(_)

View file

@ -1,4 +1,5 @@
use crate::comments::Comments;
use crate::other::f_string::FStringContext;
use crate::string::QuoteChar;
use crate::PyFormatOptions;
use ruff_formatter::{Buffer, FormatContext, GroupId, IndentWidth, SourceCode};
@ -22,6 +23,8 @@ pub struct PyFormatContext<'a> {
/// quote style that is inverted from the one here in order to ensure that
/// the formatted Python code will be valid.
docstring: Option<QuoteChar>,
/// The state of the formatter with respect to f-strings.
f_string_state: FStringState,
}
impl<'a> PyFormatContext<'a> {
@ -33,6 +36,7 @@ impl<'a> PyFormatContext<'a> {
node_level: NodeLevel::TopLevel(TopLevelStatementPosition::Other),
indent_level: IndentLevel::new(0),
docstring: None,
f_string_state: FStringState::Outside,
}
}
@ -86,6 +90,14 @@ impl<'a> PyFormatContext<'a> {
}
}
pub(crate) fn f_string_state(&self) -> FStringState {
self.f_string_state
}
pub(crate) fn set_f_string_state(&mut self, f_string_state: FStringState) {
self.f_string_state = f_string_state;
}
/// Returns `true` if preview mode is enabled.
pub(crate) const fn is_preview(&self) -> bool {
self.options.preview().is_enabled()
@ -115,6 +127,18 @@ impl Debug for PyFormatContext<'_> {
}
}
#[derive(Copy, Clone, Debug, Default)]
pub(crate) enum FStringState {
/// The formatter is inside an f-string expression element i.e., between the
/// curly brace in `f"foo {x}"`.
///
/// The containing `FStringContext` is the surrounding f-string context.
InsideExpressionElement(FStringContext),
/// The formatter is outside an f-string.
#[default]
Outside,
}
/// The position of a top-level statement in the module.
#[derive(Copy, Clone, Debug, Eq, PartialEq, Default)]
pub(crate) enum TopLevelStatementPosition {
@ -332,3 +356,65 @@ where
.set_indent_level(self.saved_level);
}
}
pub(crate) struct WithFStringState<'a, B, D>
where
D: DerefMut<Target = B>,
B: Buffer<Context = PyFormatContext<'a>>,
{
buffer: D,
saved_location: FStringState,
}
impl<'a, B, D> WithFStringState<'a, B, D>
where
D: DerefMut<Target = B>,
B: Buffer<Context = PyFormatContext<'a>>,
{
pub(crate) fn new(expr_location: FStringState, mut buffer: D) -> Self {
let context = buffer.state_mut().context_mut();
let saved_location = context.f_string_state();
context.set_f_string_state(expr_location);
Self {
buffer,
saved_location,
}
}
}
impl<'a, B, D> Deref for WithFStringState<'a, B, D>
where
D: DerefMut<Target = B>,
B: Buffer<Context = PyFormatContext<'a>>,
{
type Target = B;
fn deref(&self) -> &Self::Target {
&self.buffer
}
}
impl<'a, B, D> DerefMut for WithFStringState<'a, B, D>
where
D: DerefMut<Target = B>,
B: Buffer<Context = PyFormatContext<'a>>,
{
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.buffer
}
}
impl<'a, B, D> Drop for WithFStringState<'a, B, D>
where
D: DerefMut<Target = B>,
B: Buffer<Context = PyFormatContext<'a>>,
{
fn drop(&mut self) {
self.buffer
.state_mut()
.context_mut()
.set_f_string_state(self.saved_location);
}
}

View file

@ -48,6 +48,24 @@ impl NeedsParentheses for ExprFString {
) -> OptionalParentheses {
if self.value.is_implicit_concatenated() {
OptionalParentheses::Multiline
// TODO(dhruvmanila): Ideally what we want here is a new variant which
// is something like:
// - If the expression fits by just adding the parentheses, then add them and
// avoid breaking the f-string expression. So,
// ```
// xxxxxxxxx = (
// f"aaaaaaaaaaaa { xxxxxxx + yyyyyyyy } bbbbbbbbbbbbb"
// )
// ```
// - But, if the expression is too long to fit even with parentheses, then
// don't add the parentheses and instead break the expression at `soft_line_break`.
// ```
// xxxxxxxxx = f"aaaaaaaaaaaa {
// xxxxxxxxx + yyyyyyyyyy
// } bbbbbbbbbbbbb"
// ```
// This isn't decided yet, refer to the relevant discussion:
// https://github.com/astral-sh/ruff/discussions/9785
} else if AnyString::FString(self).is_multiline(context.source()) {
OptionalParentheses::Never
} else {

View file

@ -466,3 +466,12 @@ pub enum PythonVersion {
Py311,
Py312,
}
impl PythonVersion {
/// Return `true` if the current version supports [PEP 701].
///
/// [PEP 701]: https://peps.python.org/pep-0701/
pub fn supports_pep_701(self) -> bool {
self >= Self::Py312
}
}

View file

@ -1,8 +1,13 @@
use ruff_formatter::write;
use ruff_python_ast::FString;
use ruff_source_file::Locator;
use ruff_text_size::Ranged;
use crate::prelude::*;
use crate::string::{Quoting, StringNormalizer, StringPart};
use crate::preview::is_f_string_formatting_enabled;
use crate::string::{Quoting, StringNormalizer, StringPart, StringPrefix, StringQuotes};
use super::f_string_element::FormatFStringElement;
/// Formats an f-string which is part of a larger f-string expression.
///
@ -25,25 +30,126 @@ impl Format<PyFormatContext<'_>> for FormatFString<'_> {
fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
let locator = f.context().locator();
let result = StringNormalizer::from_context(f.context())
let string = StringPart::from_source(self.value.range(), &locator);
let normalizer = StringNormalizer::from_context(f.context())
.with_quoting(self.quoting)
.with_preferred_quote_style(f.options().quote_style())
.normalize(
&StringPart::from_source(self.value.range(), &locator),
&locator,
.with_preferred_quote_style(f.options().quote_style());
// If f-string formatting is disabled (not in preview), then we will
// fall back to the previous behavior of normalizing the f-string.
if !is_f_string_formatting_enabled(f.context()) {
let result = normalizer.normalize(&string, &locator).fmt(f);
let comments = f.context().comments();
self.value.elements.iter().for_each(|value| {
comments.mark_verbatim_node_comments_formatted(value.into());
// Above method doesn't mark the trailing comments of the f-string elements
// as formatted, so we need to do it manually. For example,
//
// ```python
// f"""foo {
// x:.3f
// # comment
// }"""
// ```
for trailing_comment in comments.trailing(value) {
trailing_comment.mark_formatted();
}
});
return result;
}
let quotes = normalizer.choose_quotes(&string, &locator);
let context = FStringContext::new(
string.prefix(),
quotes,
FStringLayout::from_f_string(self.value, &locator),
);
// Starting prefix and quote
write!(f, [string.prefix(), quotes])?;
f.join()
.entries(
self.value
.elements
.iter()
.map(|element| FormatFStringElement::new(element, context)),
)
.fmt(f);
.finish()?;
// TODO(dhruvmanila): With PEP 701, comments can be inside f-strings.
// This is to mark all of those comments as formatted but we need to
// figure out how to handle them. Note that this needs to be done only
// after the f-string is formatted, so only for all the non-formatted
// comments.
let comments = f.context().comments();
self.value.elements.iter().for_each(|value| {
comments.mark_verbatim_node_comments_formatted(value.into());
});
result
// Ending quote
quotes.fmt(f)
}
}
#[derive(Clone, Copy, Debug)]
pub(crate) struct FStringContext {
prefix: StringPrefix,
quotes: StringQuotes,
layout: FStringLayout,
}
impl FStringContext {
const fn new(prefix: StringPrefix, quotes: StringQuotes, layout: FStringLayout) -> Self {
Self {
prefix,
quotes,
layout,
}
}
pub(crate) const fn quotes(self) -> StringQuotes {
self.quotes
}
pub(crate) const fn prefix(self) -> StringPrefix {
self.prefix
}
pub(crate) const fn layout(self) -> FStringLayout {
self.layout
}
}
#[derive(Copy, Clone, Debug)]
pub(crate) enum FStringLayout {
/// Original f-string is flat.
/// Don't break expressions to keep the string flat.
Flat,
/// Original f-string has multiline expressions in the replacement fields.
/// Allow breaking expressions across multiple lines.
Multiline,
}
impl FStringLayout {
fn from_f_string(f_string: &FString, locator: &Locator) -> Self {
// Heuristic: Allow breaking the f-string expressions across multiple lines
// only if there already is at least one multiline expression. This puts the
// control in the hands of the user to decide if they want to break the
// f-string expressions across multiple lines or not. This is similar to
// how Prettier does it for template literals in JavaScript.
//
// If it's single quoted f-string and it contains a multiline expression, then we
// assume that the target version of Python supports it (3.12+). If there are comments
// used in any of the expression of the f-string, then it's always going to be multiline
// and we assume that the target version of Python supports it (3.12+).
//
// Reference: https://prettier.io/docs/en/next/rationale.html#template-literals
if f_string
.elements
.iter()
.filter_map(|element| element.as_expression())
.any(|expr| memchr::memchr2(b'\n', b'\r', locator.slice(expr).as_bytes()).is_some())
{
Self::Multiline
} else {
Self::Flat
}
}
pub(crate) const fn is_flat(self) -> bool {
matches!(self, Self::Flat)
}
}

View file

@ -0,0 +1,244 @@
use std::borrow::Cow;
use ruff_formatter::{format_args, write, Buffer, RemoveSoftLinesBuffer};
use ruff_python_ast::{
ConversionFlag, Expr, FStringElement, FStringExpressionElement, FStringLiteralElement,
};
use ruff_text_size::Ranged;
use crate::comments::{dangling_open_parenthesis_comments, trailing_comments};
use crate::context::{FStringState, NodeLevel, WithFStringState, WithNodeLevel};
use crate::prelude::*;
use crate::preview::is_hex_codes_in_unicode_sequences_enabled;
use crate::string::normalize_string;
use crate::verbatim::verbatim_text;
use super::f_string::FStringContext;
/// Formats an f-string element which is either a literal or a formatted expression.
///
/// This delegates the actual formatting to the appropriate formatter.
pub(crate) struct FormatFStringElement<'a> {
element: &'a FStringElement,
context: FStringContext,
}
impl<'a> FormatFStringElement<'a> {
pub(crate) fn new(element: &'a FStringElement, context: FStringContext) -> Self {
Self { element, context }
}
}
impl Format<PyFormatContext<'_>> for FormatFStringElement<'_> {
fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
match self.element {
FStringElement::Literal(string_literal) => {
FormatFStringLiteralElement::new(string_literal, self.context).fmt(f)
}
FStringElement::Expression(expression) => {
FormatFStringExpressionElement::new(expression, self.context).fmt(f)
}
}
}
}
/// Formats an f-string literal element.
pub(crate) struct FormatFStringLiteralElement<'a> {
element: &'a FStringLiteralElement,
context: FStringContext,
}
impl<'a> FormatFStringLiteralElement<'a> {
pub(crate) fn new(element: &'a FStringLiteralElement, context: FStringContext) -> Self {
Self { element, context }
}
}
impl Format<PyFormatContext<'_>> for FormatFStringLiteralElement<'_> {
fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
let literal_content = f.context().locator().slice(self.element.range());
let normalized = normalize_string(
literal_content,
self.context.quotes(),
self.context.prefix(),
is_hex_codes_in_unicode_sequences_enabled(f.context()),
);
match &normalized {
Cow::Borrowed(_) => source_text_slice(self.element.range()).fmt(f),
Cow::Owned(normalized) => text(normalized).fmt(f),
}
}
}
/// Formats an f-string expression element.
pub(crate) struct FormatFStringExpressionElement<'a> {
element: &'a FStringExpressionElement,
context: FStringContext,
}
impl<'a> FormatFStringExpressionElement<'a> {
pub(crate) fn new(element: &'a FStringExpressionElement, context: FStringContext) -> Self {
Self { element, context }
}
}
impl Format<PyFormatContext<'_>> for FormatFStringExpressionElement<'_> {
fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
let FStringExpressionElement {
expression,
debug_text,
conversion,
format_spec,
..
} = self.element;
if let Some(debug_text) = debug_text {
token("{").fmt(f)?;
let comments = f.context().comments();
// If the element has a debug text, preserve the same formatting as
// in the source code (`verbatim`). This requires us to mark all of
// the surrounding comments as formatted.
comments.mark_verbatim_node_comments_formatted(self.element.into());
// Above method doesn't mark the leading and trailing comments of the element.
// There can't be any leading comments for an expression element, but there
// can be trailing comments. For example,
//
// ```python
// f"""foo {
// x:.3f
// # trailing comment
// }"""
// ```
for trailing_comment in comments.trailing(self.element) {
trailing_comment.mark_formatted();
}
write!(
f,
[
text(&debug_text.leading),
verbatim_text(&**expression),
text(&debug_text.trailing),
]
)?;
// Even if debug text is present, any whitespace between the
// conversion flag and the format spec doesn't need to be preserved.
match conversion {
ConversionFlag::Str => text("!s").fmt(f)?,
ConversionFlag::Ascii => text("!a").fmt(f)?,
ConversionFlag::Repr => text("!r").fmt(f)?,
ConversionFlag::None => (),
}
if let Some(format_spec) = format_spec.as_deref() {
write!(f, [token(":"), verbatim_text(format_spec)])?;
}
token("}").fmt(f)
} else {
let comments = f.context().comments().clone();
let dangling_item_comments = comments.dangling(self.element);
let item = format_with(|f| {
let bracket_spacing = match expression.as_ref() {
// If an expression starts with a `{`, we need to add a space before the
// curly brace to avoid turning it into a literal curly with `{{`.
//
// For example,
// ```python
// f"{ {'x': 1, 'y': 2} }"
// # ^ ^
// ```
//
// We need to preserve the space highlighted by `^`. The whitespace
// before the closing curly brace is not strictly necessary, but it's
// added to maintain consistency.
Expr::Dict(_) | Expr::DictComp(_) | Expr::Set(_) | Expr::SetComp(_) => {
Some(format_with(|f| {
if self.context.layout().is_flat() {
space().fmt(f)
} else {
soft_line_break_or_space().fmt(f)
}
}))
}
_ => None,
};
// Update the context to be inside the f-string expression element.
let f = &mut WithFStringState::new(
FStringState::InsideExpressionElement(self.context),
f,
);
write!(f, [bracket_spacing, expression.format()])?;
// Conversion comes first, then the format spec.
match conversion {
ConversionFlag::Str => text("!s").fmt(f)?,
ConversionFlag::Ascii => text("!a").fmt(f)?,
ConversionFlag::Repr => text("!r").fmt(f)?,
ConversionFlag::None => (),
}
if let Some(format_spec) = format_spec.as_deref() {
token(":").fmt(f)?;
f.join()
.entries(
format_spec
.elements
.iter()
.map(|element| FormatFStringElement::new(element, self.context)),
)
.finish()?;
// These trailing comments can only occur if the format specifier is
// present. For example,
//
// ```python
// f"{
// x:.3f
// # comment
// }"
// ```
//
// Any other trailing comments are attached to the expression itself.
trailing_comments(comments.trailing(self.element)).fmt(f)?;
}
bracket_spacing.fmt(f)
});
let open_parenthesis_comments = if dangling_item_comments.is_empty() {
None
} else {
Some(dangling_open_parenthesis_comments(dangling_item_comments))
};
token("{").fmt(f)?;
{
let mut f = WithNodeLevel::new(NodeLevel::ParenthesizedExpression, f);
if self.context.layout().is_flat() {
let mut buffer = RemoveSoftLinesBuffer::new(&mut *f);
write!(buffer, [open_parenthesis_comments, item])?;
} else {
group(&format_args![
open_parenthesis_comments,
soft_block_indent(&item)
])
.fmt(&mut f)?;
}
}
token("}").fmt(f)
}
}
}

View file

@ -7,6 +7,7 @@ pub(crate) mod decorator;
pub(crate) mod elif_else_clause;
pub(crate) mod except_handler_except_handler;
pub(crate) mod f_string;
pub(crate) mod f_string_element;
pub(crate) mod f_string_part;
pub(crate) mod identifier;
pub(crate) mod keyword;

View file

@ -81,3 +81,8 @@ pub(crate) const fn is_multiline_string_handling_enabled(context: &PyFormatConte
pub(crate) const fn is_format_module_docstring_enabled(context: &PyFormatContext) -> bool {
context.is_preview()
}
/// Returns `true` if the [`f-string formatting`](https://github.com/astral-sh/ruff/issues/7594) preview style is enabled.
pub(crate) fn is_f_string_formatting_enabled(context: &PyFormatContext) -> bool {
context.is_preview()
}

View file

@ -1,7 +1,7 @@
use bitflags::bitflags;
pub(crate) use any::AnyString;
pub(crate) use normalize::{NormalizedString, StringNormalizer};
pub(crate) use normalize::{normalize_string, NormalizedString, StringNormalizer};
use ruff_formatter::format_args;
use ruff_source_file::Locator;
use ruff_text_size::{TextLen, TextRange, TextSize};

View file

@ -1,8 +1,11 @@
use std::borrow::Cow;
use ruff_formatter::FormatContext;
use ruff_source_file::Locator;
use ruff_text_size::{Ranged, TextRange};
use crate::context::FStringState;
use crate::options::PythonVersion;
use crate::prelude::*;
use crate::preview::is_hex_codes_in_unicode_sequences_enabled;
use crate::string::{QuoteChar, Quoting, StringPart, StringPrefix, StringQuotes};
@ -12,6 +15,8 @@ pub(crate) struct StringNormalizer {
quoting: Quoting,
preferred_quote_style: QuoteStyle,
parent_docstring_quote_char: Option<QuoteChar>,
f_string_state: FStringState,
target_version: PythonVersion,
normalize_hex: bool,
}
@ -21,6 +26,8 @@ impl StringNormalizer {
quoting: Quoting::default(),
preferred_quote_style: QuoteStyle::default(),
parent_docstring_quote_char: context.docstring(),
f_string_state: context.f_string_state(),
target_version: context.options().target_version(),
normalize_hex: is_hex_codes_in_unicode_sequences_enabled(context),
}
}
@ -96,7 +103,33 @@ impl StringNormalizer {
self.preferred_quote_style
};
match self.quoting {
let quoting = if let FStringState::InsideExpressionElement(context) = self.f_string_state {
// If we're inside an f-string, we need to make sure to preserve the
// existing quotes unless we're inside a triple-quoted f-string and
// the inner string itself isn't triple-quoted. For example:
//
// ```python
// f"""outer {"inner"}""" # Valid
// f"""outer {"""inner"""}""" # Invalid
// ```
//
// Or, if the target version supports PEP 701.
//
// The reason to preserve the quotes is based on the assumption that
// the original f-string is valid in terms of quoting, and we don't
// want to change that to make it invalid.
if (context.quotes().is_triple() && !string.quotes().is_triple())
|| self.target_version.supports_pep_701()
{
self.quoting
} else {
Quoting::Preserve
}
} else {
self.quoting
};
match quoting {
Quoting::Preserve => string.quotes(),
Quoting::CanChange => {
if let Some(preferred_quote) = QuoteChar::from_style(preferred_style) {

View file

@ -873,11 +873,11 @@ impl Ranged for LogicalLine {
}
}
struct VerbatimText {
pub(crate) struct VerbatimText {
verbatim_range: TextRange,
}
fn verbatim_text<T>(item: T) -> VerbatimText
pub(crate) fn verbatim_text<T>(item: T) -> VerbatimText
where
T: Ranged,
{

View file

@ -902,7 +902,7 @@ log.info(f"""Skipping: {'a' == 'b'} {desc['ms_name']} {money=} {dte=} {pos_share
)
dict_with_lambda_values = {
@@ -524,61 +383,54 @@
@@ -524,65 +383,58 @@
# Complex string concatenations with a method call in the middle.
code = (
@ -941,7 +941,7 @@ log.info(f"""Skipping: {'a' == 'b'} {desc['ms_name']} {money=} {dte=} {pos_share
log.info(
- "Skipping:"
- f" {desc['db_id']} {foo('bar',x=123)} {'foo' != 'bar'} {(x := 'abc=')} {pos_share=} {desc['status']} {desc['exposure_max']}"
+ f'Skipping: {desc["db_id"]} {foo("bar",x=123)} {"foo" != "bar"} {(x := "abc=")} {pos_share=} {desc["status"]} {desc["exposure_max"]}'
+ f'Skipping: {desc["db_id"]} {foo("bar", x=123)} {"foo" != "bar"} {(x := "abc=")} {pos_share=} {desc["status"]} {desc["exposure_max"]}'
)
log.info(
@ -981,6 +981,18 @@ log.info(f"""Skipping: {'a' == 'b'} {desc['ms_name']} {money=} {dte=} {pos_share
)
log.info(
- f"""Skipping: {"a" == 'b'} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
+ f"""Skipping: {"a" == "b"} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
)
log.info(
@@ -590,5 +442,5 @@
)
log.info(
- f"""Skipping: {'a' == 'b'} {desc['ms_name']} {money=} {dte=} {pos_share=} {desc['status']} {desc['exposure_max']}"""
+ f"""Skipping: {"a" == "b"} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
)
```
## Ruff Output
@ -1394,7 +1406,7 @@ log.info(
)
log.info(
f'Skipping: {desc["db_id"]} {foo("bar",x=123)} {"foo" != "bar"} {(x := "abc=")} {pos_share=} {desc["status"]} {desc["exposure_max"]}'
f'Skipping: {desc["db_id"]} {foo("bar", x=123)} {"foo" != "bar"} {(x := "abc=")} {pos_share=} {desc["status"]} {desc["exposure_max"]}'
)
log.info(
@ -1422,7 +1434,7 @@ log.info(
)
log.info(
f"""Skipping: {"a" == 'b'} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
f"""Skipping: {"a" == "b"} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
)
log.info(
@ -1430,7 +1442,7 @@ log.info(
)
log.info(
f"""Skipping: {'a' == 'b'} {desc['ms_name']} {money=} {dte=} {pos_share=} {desc['status']} {desc['exposure_max']}"""
f"""Skipping: {"a" == "b"} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
)
```

View file

@ -832,7 +832,7 @@ s = f'Lorem Ipsum is simply dummy text of the printing and typesetting industry:
some_commented_string = ( # This comment stays at the top.
"This string is long but not so long that it needs hahahah toooooo be so greatttt"
@@ -279,36 +280,25 @@
@@ -279,37 +280,26 @@
)
lpar_and_rpar_have_comments = func_call( # LPAR Comment
@ -852,31 +852,32 @@ s = f'Lorem Ipsum is simply dummy text of the printing and typesetting industry:
- f" {'' if ID is None else ID} | perl -nE 'print if /^{field}:/'"
-)
+cmd_fstring = f"sudo -E deluge-console info --detailed --sort-reverse=time_added {'' if ID is None else ID} | perl -nE 'print if /^{field}:/'"
+
+cmd_fstring = f"sudo -E deluge-console info --detailed --sort-reverse=time_added {'{{}}' if ID is None else ID} | perl -nE 'print if /^{field}:/'"
-cmd_fstring = (
- "sudo -E deluge-console info --detailed --sort-reverse=time_added"
- f" {'{{}}' if ID is None else ID} | perl -nE 'print if /^{field}:/'"
-)
+cmd_fstring = f"sudo -E deluge-console info --detailed --sort-reverse=time_added {'{{}}' if ID is None else ID} | perl -nE 'print if /^{field}:/'"
+cmd_fstring = f"sudo -E deluge-console info --detailed --sort-reverse=time_added {{'' if ID is None else ID}} | perl -nE 'print if /^{field}:/'"
-cmd_fstring = (
- "sudo -E deluge-console info --detailed --sort-reverse=time_added {'' if ID is"
- f" None else ID}} | perl -nE 'print if /^{field}:/'"
-)
+cmd_fstring = f"sudo -E deluge-console info --detailed --sort-reverse=time_added {{'' if ID is None else ID}} | perl -nE 'print if /^{field}:/'"
+fstring = f"This string really doesn't need to be an {{{{fstring}}}}, but this one most certainly, absolutely {does}."
+
fstring = (
- "This string really doesn't need to be an {{fstring}}, but this one most"
- f" certainly, absolutely {does}."
+ f"We have to remember to escape {braces}." " Like {these}." f" But not {this}."
)
-
-fstring = f"We have to remember to escape {braces}. Like {{these}}. But not {this}."
-fstring = f"We have to remember to escape {braces}. Like {{these}}. But not {this}."
-
class A:
class B:
@@ -364,10 +354,7 @@
def foo():
if not hasattr(module, name):
@ -979,7 +980,13 @@ s = f'Lorem Ipsum is simply dummy text of the printing and typesetting industry:
)
# The parens should NOT be removed in this case.
@@ -518,88 +494,78 @@
@@ -513,93 +489,83 @@
temp_msg = (
- f"{f'{humanize_number(pos)}.': <{pound_len+2}} "
+ f"{f'{humanize_number(pos)}.': <{pound_len + 2}} "
f"{balance: <{bal_len + 5}} "
f"<<{author.display_name}>>\n"
)
@ -1103,7 +1110,13 @@ s = f'Lorem Ipsum is simply dummy text of the printing and typesetting industry:
"6. Click on Create Credential at the top."
'7. At the top click the link for "API key".'
"8. No application restrictions are needed. Click Create at the bottom."
@@ -613,55 +579,40 @@
@@ -608,60 +574,45 @@
# It shouldn't matter if the string prefixes are capitalized.
temp_msg = (
- f"{F'{humanize_number(pos)}.': <{pound_len+2}} "
+ f"{f'{humanize_number(pos)}.': <{pound_len + 2}} "
f"{balance: <{bal_len + 5}} "
f"<<{author.display_name}>>\n"
)
@ -1688,7 +1701,7 @@ class X:
temp_msg = (
f"{f'{humanize_number(pos)}.': <{pound_len+2}} "
f"{f'{humanize_number(pos)}.': <{pound_len + 2}} "
f"{balance: <{bal_len + 5}} "
f"<<{author.display_name}>>\n"
)
@ -1773,7 +1786,7 @@ message = (
# It shouldn't matter if the string prefixes are capitalized.
temp_msg = (
f"{F'{humanize_number(pos)}.': <{pound_len+2}} "
f"{f'{humanize_number(pos)}.': <{pound_len + 2}} "
f"{balance: <{bal_len + 5}} "
f"<<{author.display_name}>>\n"
)

View file

@ -0,0 +1,54 @@
---
source: crates/ruff_python_formatter/tests/fixtures.rs
input_file: crates/ruff_python_formatter/resources/test/fixtures/ruff/expression/fstring_py312.py
---
## Input
```python
# This file contains test cases only for cases where the logic tests for whether
# the target version is 3.12 or later. A user can have 3.12 syntax even if the target
# version isn't set.
# Quotes re-use
f"{'a'}"
```
## Outputs
### Output 1
```
indent-style = space
line-width = 88
indent-width = 4
quote-style = Double
line-ending = LineFeed
magic-trailing-comma = Respect
docstring-code = Disabled
docstring-code-line-width = "dynamic"
preview = Disabled
target_version = Py312
source_type = Python
```
```python
# This file contains test cases only for cases where the logic tests for whether
# the target version is 3.12 or later. A user can have 3.12 syntax even if the target
# version isn't set.
# Quotes re-use
f"{'a'}"
```
#### Preview changes
```diff
--- Stable
+++ Preview
@@ -3,4 +3,4 @@
# version isn't set.
# Quotes re-use
-f"{'a'}"
+f"{"a"}"
```