mirror of https://github.com/astral-sh/ruff.git
synced 2025-08-02 01:42:25 +00:00
Add initial formatter implementation (#2883)
# Summary

This PR contains the code for the autoformatter proof-of-concept.

## Crate structure

The primary formatting hook is the `fmt` function in `crates/ruff_python_formatter/src/lib.rs`.

The current formatter approach is outlined in `crates/ruff_python_formatter/src/lib.rs`, and is structured as follows:

- Tokenize the code using the RustPython lexer.
- In `crates/ruff_python_formatter/src/trivia.rs`, extract a variety of trivia tokens from the token stream. These include comments, trailing commas, and empty lines.
- Generate the AST via the RustPython parser.
- In `crates/ruff_python_formatter/src/cst.rs`, convert the AST to a CST structure. As of now, the CST is nearly identical to the AST, except that every node gets a `trivia` vector. But we might want to modify it further.
- In `crates/ruff_python_formatter/src/attachment.rs`, attach each trivia token to the corresponding CST node. The logic for this is mostly in `decorate_trivia` and is ported almost directly from Prettier (given each token, find its preceding, following, and enclosing nodes, then attach the token to the appropriate node in a second pass).
- In `crates/ruff_python_formatter/src/newlines.rs`, normalize newlines to match Black's preferences. This involves traversing the CST and inserting or removing `TriviaToken` values as we go.
- Call `format!` on the CST, which delegates to type-specific formatter implementations (e.g., `crates/ruff_python_formatter/src/format/stmt.rs` for `Stmt` nodes, and similar for `Expr` nodes; the others are trivial). Those type-specific implementations delegate to kind-specific functions (e.g., `format_func_def`).

## Testing and iteration

The formatter is being developed against the Black test suite, which was copied over in full to `crates/ruff_python_formatter/resources/test/fixtures/black`.

The Black fixtures had to be modified to create [`insta`](https://github.com/mitsuhiko/insta)-compatible snapshots, which now exist in the repo.

My approach thus far has been to try to improve coverage by tackling fixtures one by one.

## What works, and what doesn't

- *Most* nodes are supported at a basic level (though there are a few stragglers at time of writing, like `StmtKind::Try`).
- Newlines are properly preserved in most cases.
- Magic trailing commas are properly preserved in some (but not all) cases.
- Trivial leading and trailing standalone comments mostly work (although maybe not at the end of a file).
- Inline comments, and comments within expressions, often don't work: they work in a few cases, but it's one-off right now. (We're probably associating them with the "right" nodes more often than we are actually rendering them in the right place.)
- We don't properly normalize string quotes. (At present, we just repeat any constants verbatim.)
- We're mishandling a bunch of wrapping cases (if we treat Black as the reference implementation). Here are a few examples (demonstrating Black's stable behavior):

```py
# In some cases, if the end expression is "self-closing" (functions,
# lists, dictionaries, sets, subscript accesses, and any length-two
# boolean operations that end in these elements), Black
# will wrap like this...
if some_expression and f(
    b,
    c,
    d,
):
    pass

# ...whereas we do this:
if (
    some_expression
    and f(
        b,
        c,
        d,
    )
):
    pass

# If function arguments can fit on a single line, then Black will
# format them like this, rather than exploding them vertically.
if f(
    a, b, c, d, e, f, g, ...
):
    pass
```

- We don't properly preserve parentheses in all cases. Black preserves parentheses in some but not all cases.
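To make the attachment step from "Crate structure" more concrete, here is a minimal sketch of the first pass: for one trivia token, find its enclosing, preceding, and following nodes. It is not the code in `attachment.rs`; the `Span`, `Node`, `Decoration`, and `decorate` names are hypothetical stand-ins, and nodes are identified by a plain string label rather than by real CST types.

```rust
/// A half-open byte range `[start, end)` in the source text (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Hypothetical stand-in for a CST node: a label, a span, and children in source order.
#[derive(Debug)]
pub struct Node {
    pub name: &'static str,
    pub span: Span,
    pub children: Vec<Node>,
}

/// The neighbours of one trivia token, as found by the first pass.
#[derive(Debug, PartialEq, Eq)]
pub struct Decoration {
    pub enclosing: Option<&'static str>,
    pub preceding: Option<&'static str>,
    pub following: Option<&'static str>,
}

/// Find the innermost node that contains `token`, plus the closest siblings
/// before and after it inside that node. `nodes` must be sorted by position.
pub fn decorate(nodes: &[Node], token: Span, enclosing: Option<&'static str>) -> Decoration {
    let mut preceding = None;
    let mut following = None;
    for node in nodes {
        if node.span.start <= token.start && token.end <= node.span.end {
            // The token lies inside this node, so search its children instead.
            return decorate(&node.children, token, Some(node.name));
        }
        if node.span.end <= token.start {
            // Entirely before the token: remember the latest such node.
            preceding = Some(node.name);
        } else if token.end <= node.span.start {
            // Entirely after the token: the first such node is the closest one.
            following = Some(node.name);
            break;
        }
    }
    Decoration {
        enclosing,
        preceding,
        following,
    }
}
```

A second pass would then choose one of the three candidates for each token (for example, treating a comment on its own line as leading trivia of the following node). That policy is left out here because the commit message does not spell it out.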
parent f661c90bd7
commit ca49b00e55
134 changed files with 12044 additions and 18 deletions
143
crates/ruff_python_formatter/src/lib.rs
Normal file
@@ -0,0 +1,143 @@
use anyhow::Result;
use ruff_formatter::{format, Formatted, IndentStyle, SimpleFormatOptions};
use rustpython_parser::lexer::LexResult;

use crate::attachment::attach;
use crate::context::ASTFormatContext;
use crate::core::locator::Locator;
use crate::core::rustpython_helpers;
use crate::cst::Stmt;
use crate::newlines::normalize_newlines;
use crate::parentheses::normalize_parentheses;

mod attachment;
pub mod builders;
pub mod cli;
pub mod context;
mod core;
mod cst;
mod format;
mod newlines;
mod parentheses;
pub mod shared_traits;
#[cfg(test)]
mod test;
pub mod trivia;

pub fn fmt(contents: &str) -> Result<Formatted<ASTFormatContext>> {
    // Tokenize once.
    let tokens: Vec<LexResult> = rustpython_helpers::tokenize(contents);

    // Extract trivia.
    let trivia = trivia::extract_trivia_tokens(&tokens);

    // Parse the AST.
    let python_ast = rustpython_helpers::parse_program_tokens(tokens, "<filename>")?;

    // Convert to a CST.
    let mut python_cst: Vec<Stmt> = python_ast.into_iter().map(Into::into).collect();

    // Attach trivia.
    attach(&mut python_cst, trivia);
    normalize_newlines(&mut python_cst);
    normalize_parentheses(&mut python_cst);

    format!(
        ASTFormatContext::new(
            SimpleFormatOptions {
                indent_style: IndentStyle::Space(4),
                line_width: 88.try_into().unwrap(),
            },
            Locator::new(contents)
        ),
        [format::builders::block(&python_cst)]
    )
    .map_err(Into::into)
}

#[cfg(test)]
mod tests {
    use std::path::Path;

    use anyhow::Result;
    use test_case::test_case;

    use crate::fmt;
    use crate::test::test_resource_path;

    #[test_case(Path::new("simple_cases/class_blank_parentheses.py"); "class_blank_parentheses")]
    #[test_case(Path::new("simple_cases/class_methods_new_line.py"); "class_methods_new_line")]
    #[test_case(Path::new("simple_cases/beginning_backslash.py"); "beginning_backslash")]
    #[test_case(Path::new("simple_cases/import_spacing.py"); "import_spacing")]
    fn passing(path: &Path) -> Result<()> {
        let snapshot = format!("{}", path.display());
        let content = std::fs::read_to_string(test_resource_path(
            Path::new("fixtures/black").join(path).as_path(),
        ))?;
        let formatted = fmt(&content)?;
        insta::assert_display_snapshot!(snapshot, formatted.print()?.as_code());
        Ok(())
    }

    #[test_case(Path::new("simple_cases/collections.py"); "collections")]
    #[test_case(Path::new("simple_cases/bracketmatch.py"); "bracketmatch")]
    fn passing_modulo_string_normalization(path: &Path) -> Result<()> {
        fn adjust_quotes(contents: &str) -> String {
            // Replace all single quotes with double quotes.
            contents.replace('\'', "\"")
        }

        let snapshot = format!("{}", path.display());
        let content = std::fs::read_to_string(test_resource_path(
            Path::new("fixtures/black").join(path).as_path(),
        ))?;
        let formatted = fmt(&content)?;
        insta::assert_display_snapshot!(snapshot, adjust_quotes(formatted.print()?.as_code()));
        Ok(())
    }

    #[ignore]
    // Passing apart from one deviation in RHS tuple assignment.
    #[test_case(Path::new("simple_cases/tupleassign.py"); "tupleassign")]
    // Lots of deviations, _mostly_ related to string normalization and wrapping.
    #[test_case(Path::new("simple_cases/expression.py"); "expression")]
    #[test_case(Path::new("simple_cases/function.py"); "function")]
    #[test_case(Path::new("simple_cases/function2.py"); "function2")]
    #[test_case(Path::new("simple_cases/power_op_spacing.py"); "power_op_spacing")]
    fn failing(path: &Path) -> Result<()> {
        let snapshot = format!("{}", path.display());
        let content = std::fs::read_to_string(test_resource_path(
            Path::new("fixtures/black").join(path).as_path(),
        ))?;
        let formatted = fmt(&content)?;
        insta::assert_display_snapshot!(snapshot, formatted.print()?.as_code());
        Ok(())
    }

    /// Use this test to debug the formatting of some snippet
    #[ignore]
    #[test]
    fn quick_test() {
        let src = r#"
{
    k: v for k, v in a_very_long_variable_name_that_exceeds_the_line_length_by_far_keep_going
}
"#;
        let formatted = fmt(src).unwrap();

        // Uncomment the `dbg` to print the IR.
        // Use `dbg_write!(f, [])` instead of `write!(f, [])` in your formatting code to print some IR
        // inside of a `Format` implementation
        // dbg!(formatted.document());

        let printed = formatted.print().unwrap();

        assert_eq!(
            printed.as_code(),
            r#"{
    k: v
    for k, v in a_very_long_variable_name_that_exceeds_the_line_length_by_far_keep_going
}"#
        );
    }
}
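The `trivia::extract_trivia_tokens(&tokens)` call in `fmt` is described in the commit message as pulling comments, trailing commas, and empty lines out of the token stream. The sketch below illustrates that idea over a deliberately simplified, hypothetical token type. `Tok`, `Trivia`, and `extract_trivia` are assumptions made for illustration and are not the RustPython lexer's tokens or the types in `trivia.rs`.

```rust
/// Hypothetical, simplified token type; a real lexer's tokens carry locations and more kinds.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Tok {
    Name(String),
    Comma,
    LParen,
    RParen,
    Comment(String),
    Newline,
}

/// The three kinds of trivia named in the commit message.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Trivia {
    Comment(String),
    EmptyLine,
    MagicTrailingComma,
}

/// Walk the token stream once and collect `(token index, trivia)` pairs,
/// sorted by token index.
pub fn extract_trivia(tokens: &[Tok]) -> Vec<(usize, Trivia)> {
    let mut trivia = Vec::new();
    // True at the start of input and directly after a `Newline`.
    let mut at_line_start = true;
    // Index of the most recent comma that has only been followed by
    // comments and newlines so far.
    let mut pending_comma: Option<usize> = None;

    for (index, tok) in tokens.iter().enumerate() {
        match tok {
            Tok::Comment(text) => {
                trivia.push((index, Trivia::Comment(text.clone())));
                at_line_start = false;
            }
            Tok::Newline => {
                // A newline with nothing before it on the line is an empty line.
                if at_line_start {
                    trivia.push((index, Trivia::EmptyLine));
                }
                at_line_start = true;
            }
            Tok::Comma => {
                pending_comma = Some(index);
                at_line_start = false;
            }
            Tok::RParen => {
                // A comma directly before the closing bracket is a trailing comma.
                if let Some(comma_index) = pending_comma.take() {
                    trivia.push((comma_index, Trivia::MagicTrailingComma));
                }
                at_line_start = false;
            }
            Tok::Name(_) | Tok::LParen => {
                pending_comma = None;
                at_line_start = false;
            }
        }
    }

    // Trailing commas are recorded when the bracket closes, so restore source order.
    trivia.sort_by_key(|(index, _)| *index);
    trivia
}
```

Working from tokens rather than raw text means a `#` inside a string literal is never mistaken for a comment, which is presumably one reason the real implementation runs after the lexer.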