mirror of
https://github.com/astral-sh/ruff.git
synced 2025-11-20 04:29:47 +00:00
## Summary This PR uses the new `Diagnostic` type for rendering formatter diagnostics. This allows the formatter to inherit all of the output formats already implemented in the linter and ty. For example, here's the new `full` output format, with the formatting diff displayed using the same infrastructure as the linter: <img width="592" height="364" alt="image" src="https://github.com/user-attachments/assets/6d09817d-3f27-4960-aa8b-41ba47fb4dc0" /> <details><summary>Resolved TODOs</summary> <p> ~~There are several limitiations/todos here still, especially around the `OutputFormat` type~~: - [x] A few literal `todo!`s for the remaining `OutputFormat`s without matching `DiagnosticFormat`s - [x] The default output format is `full` instead of something more concise like the current output - [x] Some of the output formats (namely JSON) have information that doesn't make much sense for these diagnostics The first of these is definitely resolved, and I think the other two are as well, based on discussion on the design document. In brief, we're okay inheriting the default `OutputFormat` and can separate the global option into `lint.output-format` and `format.output-format` in the future, if needed; and we're okay including redundant information in the non-human-readable output formats. My last major concern is with the performance of the new code, as discussed in the `Benchmarks` section below. A smaller question is whether we should use `Diagnostic`s for formatting errors too. I think the answer to this is yes, in line with changes we're making in the linter too. I still need to implement that here. </p> </details> <details><summary>Benchmarks</summary> <p> The values in the table are from a large benchmark on the CPython 3.10 code base, which involves checking 2011 files, 1872 of which need to be reformatted. `stable` corresponds to the same code used on `main`, while `preview-full` and `preview-concise` use the new `Diagnostic` code gated behind `--preview` for the `full` and `concise` output formats, respectively. `stable-diff` uses the `--diff` to compare the two diff rendering approaches. See the full hyperfine command below for more details. For a sense of scale, the `stable` output format produces 1873 lines on stdout, compared to 855,278 for `preview-full` and 857,798 for `stable-diff`. | Command | Mean [ms] | Min [ms] | Max [ms] | Relative | |:------------------|--------------:|---------:|---------:|-------------:| | `stable` | 201.2 ± 6.8 | 192.9 | 220.6 | 1.00 | | `preview-full` | 9113.2 ± 31.2 | 9076.1 | 9152.0 | 45.29 ± 1.54 | | `preview-concise` | 214.2 ± 1.4 | 212.0 | 217.6 | 1.06 ± 0.04 | | `stable-diff` | 3308.6 ± 20.2 | 3278.6 | 3341.8 | 16.44 ± 0.56 | In summary, the `preview-concise` diagnostics are ~6% slower than the stable output format, increasing the average runtime from 201.2 ms to 214.2 ms. The `full` preview diagnostics are much more expensive, taking over 9113.2 ms to complete, which is ~3x more expensive even than the stable diffs produced by the `--diff` flag. My main takeaways here are: 1. Rendering `Edit`s is much more expensive than rendering the diffs from `--diff` 2. Constructing `Edit`s actually isn't too bad ### Constructing `Edit`s I also took a closer look at `Edit` construction by modifying the code and repeating the `preview-concise` benchmark and found that the main issue is constructing a `SourceFile` for use in the `Edit` rendering. Commenting out the `Edit` construction itself has basically no effect: | Command | Mean [ms] | Min [ms] | Max [ms] | Relative | |:----------|------------:|---------:|---------:|------------:| | `stable` | 197.5 ± 1.6 | 195.0 | 200.3 | 1.00 | | `no-edit` | 208.9 ± 2.2 | 204.8 | 212.2 | 1.06 ± 0.01 | However, also omitting the source text from the `SourceFile` construction resolves the slowdown compared to `stable`. So it seems that copying the full source text into a `SourceFile` is the main cause of the slowdown for non-`full` diagnostics. | Command | Mean [ms] | Min [ms] | Max [ms] | Relative | |:-----------------|------------:|---------:|---------:|------------:| | `stable` | 202.4 ± 2.9 | 197.6 | 207.9 | 1.00 | | `no-source-text` | 202.7 ± 3.3 | 196.3 | 209.1 | 1.00 ± 0.02 | ### Rendering diffs The main difference between `stable-diff` and `preview-full` seems to be the diffing strategy we use from `similar`. Both versions use the same algorithm, but in the existing [`CodeDiff`](https://github.com/astral-sh/ruff/blob/main/crates/ruff_linter/src/source_kind.rs#L259) rendering for the `--diff` flag, we only do line-level diffing, whereas for `Diagnostic`s we use `TextDiff::iter_inline_changes` to highlight word-level changes too. Skipping the word diff for `Diagnostic`s closes most of the gap: | Command | Mean [s] | Min [s] | Max [s] | Relative | |:---|---:|---:|---:|---:| | `stable-diff` | 3.323 ± 0.015 | 3.297 | 3.341 | 1.00 | | `preview-full` | 3.654 ± 0.019 | 3.618 | 3.682 | 1.10 ± 0.01 | (In some repeated runs, I've seen as small as a ~5% difference, down from 10% in the table) This doesn't actually change any of our snapshots, but it would obviously change the rendered result in a terminal since we wouldn't highlight the specific words that changed within a line. Another much smaller change that we can try is removing the deadline from the `iter_inline_changes` call. It looks like there's a fair amount of overhead from the default 500 ms deadline for computing these, and using `iter_inline_changes(op, None)` (`None` for the optional deadline argument) improves the runtime quite a bit: | Command | Mean [s] | Min [s] | Max [s] | Relative | |:---|---:|---:|---:|---:| | `stable-diff` | 3.322 ± 0.013 | 3.298 | 3.341 | 1.00 | | `preview-full` | 5.296 ± 0.030 | 5.251 | 5.366 | 1.59 ± 0.01 | <hr> <details><summary>hyperfine command</summary> ```shell cargo build --release --bin ruff && hyperfine --ignore-failure --warmup 10 --export-markdown /tmp/table.md \ -n stable -n preview-full -n preview-concise -n stable-diff \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache" \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --preview --output-format=full" \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --preview --output-format=concise" \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --diff" ``` </details> </p> </details> ## Test Plan Some new CLI tests and manual testing
199 lines
6.4 KiB
Rust
199 lines
6.4 KiB
Rust
use crate::GroupId;
|
|
use crate::prelude::TagKind;
|
|
use ruff_text_size::TextRange;
|
|
use std::error::Error;
|
|
|
|
#[derive(Debug, PartialEq, Eq, Copy, Clone)]
|
|
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
|
|
/// Series of errors encountered during formatting
|
|
pub enum FormatError {
|
|
/// In case a node can't be formatted because it either misses a require child element or
|
|
/// a child is present that should not (e.g. a trailing comma after a rest element).
|
|
SyntaxError { message: &'static str },
|
|
/// In case range formatting failed because the provided range was larger
|
|
/// than the formatted syntax tree
|
|
RangeError { input: TextRange, tree: TextRange },
|
|
|
|
/// In case printing the document failed because it has an invalid structure.
|
|
InvalidDocument(InvalidDocumentError),
|
|
|
|
/// Formatting failed because some content encountered a situation where a layout
|
|
/// choice by an enclosing [`crate::Format`] resulted in a poor layout for a child [`crate::Format`].
|
|
///
|
|
/// It's up to an enclosing [`crate::Format`] to handle the error and pick another layout.
|
|
/// This error should not be raised if there's no outer [`crate::Format`] handling the poor layout error,
|
|
/// avoiding that formatting of the whole document fails.
|
|
PoorLayout,
|
|
}
|
|
|
|
impl std::fmt::Display for FormatError {
|
|
fn fmt(&self, fmt: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
|
match self {
|
|
FormatError::SyntaxError { message } => {
|
|
std::write!(fmt, "syntax error: {message}")
|
|
}
|
|
FormatError::RangeError { input, tree } => std::write!(
|
|
fmt,
|
|
"formatting range {input:?} is larger than syntax tree {tree:?}"
|
|
),
|
|
FormatError::InvalidDocument(error) => std::write!(
|
|
fmt,
|
|
"Invalid document: {error}\n\n This is an internal Ruff error. Please report if necessary."
|
|
),
|
|
FormatError::PoorLayout => {
|
|
std::write!(
|
|
fmt,
|
|
"Poor layout: The formatter wasn't able to pick a good layout for your document. This is an internal Ruff error. Please report if necessary."
|
|
)
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
impl Error for FormatError {}
|
|
|
|
impl From<PrintError> for FormatError {
|
|
fn from(error: PrintError) -> Self {
|
|
FormatError::from(&error)
|
|
}
|
|
}
|
|
|
|
impl From<&PrintError> for FormatError {
|
|
fn from(error: &PrintError) -> Self {
|
|
match error {
|
|
PrintError::InvalidDocument(reason) => FormatError::InvalidDocument(*reason),
|
|
}
|
|
}
|
|
}
|
|
|
|
impl FormatError {
|
|
pub fn syntax_error(message: &'static str) -> Self {
|
|
Self::SyntaxError { message }
|
|
}
|
|
}
|
|
|
|
#[derive(Debug, Copy, Clone, Eq, PartialEq)]
|
|
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
|
|
pub enum InvalidDocumentError {
|
|
/// Mismatching start/end kinds
|
|
///
|
|
/// ```plain
|
|
/// StartIndent
|
|
/// ...
|
|
/// EndGroup
|
|
/// ```
|
|
StartEndTagMismatch {
|
|
start_kind: TagKind,
|
|
end_kind: TagKind,
|
|
},
|
|
|
|
/// End tag without a corresponding start tag.
|
|
///
|
|
/// ```plain
|
|
/// Text
|
|
/// EndGroup
|
|
/// ```
|
|
StartTagMissing {
|
|
kind: TagKind,
|
|
},
|
|
|
|
/// Expected a specific start tag but instead is:
|
|
/// - at the end of the document
|
|
/// - at another start tag
|
|
/// - at an end tag
|
|
ExpectedStart {
|
|
expected_start: TagKind,
|
|
actual: ActualStart,
|
|
},
|
|
|
|
UnknownGroupId {
|
|
group_id: GroupId,
|
|
},
|
|
}
|
|
|
|
#[derive(Debug, Copy, Clone, Eq, PartialEq)]
|
|
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
|
|
pub enum ActualStart {
|
|
/// The actual element is not a tag.
|
|
Content,
|
|
|
|
/// The actual element was a start tag of another kind.
|
|
Start(TagKind),
|
|
|
|
/// The actual element is an end tag instead of a start tag.
|
|
End(TagKind),
|
|
|
|
/// Reached the end of the document
|
|
EndOfDocument,
|
|
}
|
|
|
|
impl std::fmt::Display for InvalidDocumentError {
|
|
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
|
match self {
|
|
InvalidDocumentError::StartEndTagMismatch {
|
|
start_kind,
|
|
end_kind,
|
|
} => {
|
|
std::write!(
|
|
f,
|
|
"Expected end tag of kind {start_kind:?} but found {end_kind:?}."
|
|
)
|
|
}
|
|
InvalidDocumentError::StartTagMissing { kind } => {
|
|
std::write!(f, "End tag of kind {kind:?} without matching start tag.")
|
|
}
|
|
InvalidDocumentError::ExpectedStart {
|
|
expected_start,
|
|
actual,
|
|
} => match actual {
|
|
ActualStart::EndOfDocument => {
|
|
std::write!(
|
|
f,
|
|
"Expected start tag of kind {expected_start:?} but at the end of document."
|
|
)
|
|
}
|
|
ActualStart::Start(start) => {
|
|
std::write!(
|
|
f,
|
|
"Expected start tag of kind {expected_start:?} but found start tag of kind {start:?}."
|
|
)
|
|
}
|
|
ActualStart::End(end) => {
|
|
std::write!(
|
|
f,
|
|
"Expected start tag of kind {expected_start:?} but found end tag of kind {end:?}."
|
|
)
|
|
}
|
|
ActualStart::Content => {
|
|
std::write!(
|
|
f,
|
|
"Expected start tag of kind {expected_start:?} but found non-tag element."
|
|
)
|
|
}
|
|
},
|
|
InvalidDocumentError::UnknownGroupId { group_id } => {
|
|
std::write!(
|
|
f,
|
|
"Encountered unknown group id {group_id:?}. Ensure that the group with the id {group_id:?} exists and that the group is a parent of or comes before the element referring to it."
|
|
)
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
#[derive(Debug, Clone, Eq, PartialEq)]
|
|
pub enum PrintError {
|
|
InvalidDocument(InvalidDocumentError),
|
|
}
|
|
|
|
impl Error for PrintError {}
|
|
|
|
impl std::fmt::Display for PrintError {
|
|
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
|
match self {
|
|
PrintError::InvalidDocument(inner) => {
|
|
std::write!(f, "Invalid document: {inner}")
|
|
}
|
|
}
|
|
}
|
|
}
|