Basic string formatting

## Summary

This PR implements formatting for non-f-string string literals that do not use implicit concatenation.

Docstring formatting is out of scope for this PR.

## Test Plan

I added a few tests for simple string literals. 

## Performance

Ouch. This is hitting performance somewhat hard, probably because we now iterate over each string several times:

1. To detect if it is an implicit string continuation
2. To detect if the string contains any newlines
3. To detect the preferred quote
4. To normalize the string
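
As a rough illustration of step 4, here is a minimal sketch of quote normalization. It is not the code from this PR: `normalize_body` is a hypothetical helper, it works on the text between the quotes only, and it assumes a non-raw string (raw strings and prefixes are ignored).

```rust
/// Hypothetical helper, not part of the PR: rewrites a string body so that it
/// can be wrapped in `preferred` quotes. Assumes a non-raw string body.
pub fn normalize_body(body: &str, preferred: char) -> String {
    // The quote character that will no longer delimit the string.
    let opposite = if preferred == '"' { '\'' } else { '"' };
    let mut out = String::with_capacity(body.len());
    let mut chars = body.chars();

    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next() {
                // An escaped opposite quote no longer needs its backslash.
                Some(next) if next == opposite => out.push(next),
                // Any other escape sequence is kept as-is.
                Some(next) => {
                    out.push('\\');
                    out.push(next);
                }
                // A trailing backslash is copied through unchanged.
                None => out.push('\\'),
            }
        } else if c == preferred {
            // An unescaped preferred quote must now be escaped.
            out.push('\\');
            out.push(c);
        } else {
            out.push(c);
        }
    }

    out
}
```

Under these assumptions, the body `it\'s` becomes `it's` when the preferred quote is `"`, while an unescaped `"` in the body gains a backslash.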

Edit: I integrated the newline detection into the preferred quote detection, so we now iterate over each string only three times.
We can probably do better by merging the implicit string continuation check with the quote and newline detection: iterate to the end of the string part and return the offset, then use our simple tokenizer to skip over any comments or whitespace until we find the first non-trivia token. From there, we repeat this in a loop until we reach the end of the string. I'll leave this improvement for later.
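
The merged newline and quote detection mentioned in the edit above could look roughly like the following. This is a sketch under stated assumptions, not the PR's implementation: `StringScan` and `scan_string_body` are hypothetical names, `QuoteStyle` is redeclared only to keep the example self-contained, and quote characters are counted without regard to existing escapes.

```rust
/// Mirrors the `QuoteStyle` enum from this PR so the sketch compiles on its own.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum QuoteStyle {
    Single,
    Double,
}

/// Hypothetical result type, not part of the PR.
#[derive(Debug, Eq, PartialEq)]
pub struct StringScan {
    pub preferred_quote: QuoteStyle,
    pub contains_newline: bool,
}

/// Hypothetical helper: walks the string body once, counting both quote
/// characters and recording whether any line break was seen.
pub fn scan_string_body(body: &str, configured: QuoteStyle) -> StringScan {
    let mut single_quotes = 0u32;
    let mut double_quotes = 0u32;
    let mut contains_newline = false;

    for c in body.chars() {
        match c {
            '\'' => single_quotes += 1,
            '"' => double_quotes += 1,
            '\n' | '\r' => contains_newline = true,
            _ => {}
        }
    }

    // Keep the configured style unless the other style needs strictly fewer escapes.
    let preferred_quote = match configured {
        QuoteStyle::Single if single_quotes > double_quotes => QuoteStyle::Double,
        QuoteStyle::Double if double_quotes > single_quotes => QuoteStyle::Single,
        other => other,
    };

    StringScan {
        preferred_quote,
        contains_newline,
    }
}
```

For example, scanning `it's` with a configured style of `QuoteStyle::Single` yields `QuoteStyle::Double`, since double quotes avoid the escape. Ties keep the configured style.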
Micha Reiser 2023-06-23 09:46:05 +02:00 committed by GitHub
parent 3e12bdff45
commit c52aa8f065
46 changed files with 1278 additions and 1086 deletions

@@ -226,6 +226,41 @@ impl Format<PyFormatContext<'_>> for VerbatimText {
    }
}

#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum QuoteStyle {
    Single,
    Double,
}

impl QuoteStyle {
    pub const fn as_char(self) -> char {
        match self {
            QuoteStyle::Single => '\'',
            QuoteStyle::Double => '"',
        }
    }

    #[must_use]
    pub const fn opposite(self) -> QuoteStyle {
        match self {
            QuoteStyle::Single => QuoteStyle::Double,
            QuoteStyle::Double => QuoteStyle::Single,
        }
    }
}

impl TryFrom<char> for QuoteStyle {
    type Error = ();

    fn try_from(value: char) -> std::result::Result<Self, Self::Error> {
        match value {
            '\'' => Ok(QuoteStyle::Single),
            '"' => Ok(QuoteStyle::Double),
            _ => Err(()),
        }
    }
}

#[cfg(test)]
mod tests {
    use anyhow::Result;
@@ -342,29 +377,8 @@ if True:
        let printed = format_module(&content)?;
        let formatted_code = printed.as_code();
        let reformatted =
            format_module(formatted_code).unwrap_or_else(|err| panic!("Expected formatted code to be valid syntax but it contains syntax errors: {err}\n{formatted_code}"));
        ensure_stability_when_formatting_twice(formatted_code);
        if reformatted.as_code() != formatted_code {
            let diff = TextDiff::from_lines(formatted_code, reformatted.as_code())
                .unified_diff()
                .header("Formatted once", "Formatted twice")
                .to_string();
            panic!(
                r#"Reformatting the formatted code a second time resulted in formatting changes.
{diff}
Formatted once:
{formatted_code}
Formatted twice:
{}"#,
                reformatted.as_code()
            );
        }
        let snapshot = format!(
            r#"## Input
{}