mirror of
https://github.com/astral-sh/ruff.git
synced 2025-08-03 10:22:24 +00:00
ruff_python_formatter: support reformatting Markdown code blocks (#9030)
(This is not possible to actually use until https://github.com/astral-sh/ruff/pull/8854 is merged.)

This commit slots in support for formatting Markdown fenced code blocks[1]. With the refactoring done for reStructuredText previously, this ended up being pretty easy to add. Markdown code blocks are also quite a bit easier to parse and recognize correctly.

One point of contention in #8860 is whether to assume that unlabeled Markdown code fences are Python or not by default. In this PR, we make such an assumption. This follows what `rustdoc` does. The mitigation here is that if an unlabeled code block isn't Python, then it probably won't parse as Python. And we'll end up skipping it. So in the vast majority of cases, the worst thing that can happen is a little bit of wasted work.

Closes #8860

[1]: https://spec.commonmark.org/0.30/#fenced-code-blocks
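The unlabeled-fence policy described above can be illustrated with a small sketch. This is not the code from the commit: the function name `fence_is_python_candidate` is hypothetical, the accepted labels `python` and `py` are assumptions chosen for illustration, and the fence detection is simplified relative to the CommonMark rules (for example, it ignores the indentation limit). A block that passes this check would still need to parse as Python before being reformatted.

```rust
/// Returns true if a line that opens a fenced code block should be treated
/// as a candidate for Python formatting.
///
/// Hypothetical helper: the label set below is an assumption, not taken
/// from the commit.
fn fence_is_python_candidate(line: &str) -> bool {
    let trimmed = line.trim_start();
    // A fence opens with a run of backticks or a run of tildes.
    let fence_char = match trimmed.chars().next() {
        Some(c @ ('`' | '~')) => c,
        _ => return false,
    };
    // The run must be at least three characters long.
    let fence_len = trimmed.chars().take_while(|&c| c == fence_char).count();
    if fence_len < 3 {
        return false;
    }
    // Everything after the fence is the info string; its first word is the label.
    // Fence characters are ASCII, so the char count is also a valid byte offset.
    let info = trimmed[fence_len..].trim();
    let label = info.split_whitespace().next().unwrap_or("");
    // An empty label means the fence is unlabeled, which is assumed to be Python.
    matches!(label, "" | "python" | "py")
}
```

Under these assumptions, an opening line of three backticks with no label is accepted, while the same fence labeled `rust` is rejected up front. A mislabeled or non-Python unlabeled block is only caught later, when parsing fails and the block is skipped.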
parent b021ede481
commit 04ec11a73d
4 changed files with 5291 additions and 1 deletions
@@ -82,6 +82,10 @@ impl Transformer for Normalizer {
             // everything after it. Talk about a hammer.
             Regex::new(r#"::(?s:.*)"#).unwrap()
         });
+        static STRIP_MARKDOWN_BLOCKS: Lazy<Regex> = Lazy::new(|| {
+            // This covers more than valid Markdown blocks, but that's OK.
+            Regex::new(r#"(```|~~~)\p{any}*(```|~~~|$)"#).unwrap()
+        });
 
         // Start by (1) stripping everything that looks like a code
         // snippet, since code snippets may be completely reformatted if
@@ -98,6 +102,12 @@ impl Transformer for Normalizer {
                 "<RSTBLOCK-CODE-SNIPPET: Removed by normalizer>\n",
             )
             .into_owned();
+        string_literal.value = STRIP_MARKDOWN_BLOCKS
+            .replace_all(
+                &string_literal.value,
+                "<MARKDOWN-CODE-SNIPPET: Removed by normalizer>\n",
+            )
+            .into_owned();
         // Normalize a string by (2) stripping any leading and trailing space from each
         // line, and (3) removing any blank lines from the start and end of the string.
         string_literal.value = string_literal
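To make the effect of `STRIP_MARKDOWN_BLOCKS` concrete, here is a standard-library-only sketch of what the replacement appears to do. It assumes that the greedy `\p{any}*` runs to the end of the string and that `$` then matches there, so the pattern effectively covers everything from the first fence onward (which would explain the "covers more than valid Markdown blocks" comment). The function `strip_markdown_blocks` is a hypothetical stand-in, not the code in the diff, which uses the `regex` crate.

```rust
/// Placeholder text copied from the diff above.
const PLACEHOLDER: &str = "<MARKDOWN-CODE-SNIPPET: Removed by normalizer>\n";

/// Hypothetical equivalent of the regex replacement: drop everything from the
/// first backtick or tilde fence to the end of the string and append the
/// placeholder. Assumes the regex always matches through the end of the text.
fn strip_markdown_blocks(text: &str) -> String {
    // Find the earliest opening fence of either kind.
    let start = match (text.find("```"), text.find("~~~")) {
        (Some(a), Some(b)) => Some(a.min(b)),
        (a, b) => a.or(b),
    };
    match start {
        // Keep the text before the fence, replace the rest.
        Some(i) => format!("{}{}", &text[..i], PLACEHOLDER),
        // No fence: the string is left unchanged.
        None => text.to_string(),
    }
}
```

The apparent design choice is that the normalizer does not need to find block boundaries precisely: code snippets may be completely reformatted, so anything from the first fence onward is excluded from the comparison, including any prose that follows the block.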