[red-knot] Support for TOML configs in Markdown tests (#14785)

## Summary

This adds support for specifying the target Python version from a
Markdown test. It is a somewhat limited ad-hoc solution, but designed to
be future-compatible. TOML blocks can be added to arbitrary sections in
the Markdown block. They have the following format:

````markdown
```toml
[tool.knot.environment]
target-version = "3.13"
```
````

So far, there is nothing else that can be configured, but it should be
straightforward to extend this to things like a custom typeshed path.

This is in preparation for the statically-known branches feature where
we are going to have to specify the target version for lots of tests.

## Test Plan

- New Markdown test that fails without the explicitly specified
`target-version`.
- Manually tested various error paths when specifying a wrong
`target-version` field.
- Made sure that running tests is as fast as before.
This commit is contained in:
David Peter 2024-12-06 10:22:08 +01:00 committed by GitHub
parent 56afb12ae7
commit b01a651e69
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
8 changed files with 202 additions and 47 deletions

2
Cargo.lock generated
View file

@ -2358,7 +2358,9 @@ dependencies = [
"ruff_text_size",
"rustc-hash 2.1.0",
"salsa",
"serde",
"smallvec",
"toml",
]
[[package]]

View file

@ -0,0 +1,67 @@
This test makes sure that `red_knot_test` correctly parses the TOML configuration blocks and applies
the correct settings hierarchically.
The following configuration will be attached to the *root* section (without any heading):
```toml
[environment]
target-version = "3.10"
```
# Basic
Here, we simply make sure that we pick up the global configuration from the root section:
```py
reveal_type(sys.version_info[:2] == (3, 10)) # revealed: Literal[True]
```
# Inheritance
## Child
### Grandchild
The same should work for arbitrarly nested sections:
```py
reveal_type(sys.version_info[:2] == (3, 10)) # revealed: Literal[True]
```
# Overwriting
Here, we make sure that we can overwrite the global configuration in a child section:
```toml
[environment]
target-version = "3.11"
```
```py
reveal_type(sys.version_info[:2] == (3, 11)) # revealed: Literal[True]
```
# No global state
There is no global state. This section should again use the root configuration:
```py
reveal_type(sys.version_info[:2] == (3, 10)) # revealed: Literal[True]
```
# Overwriting affects children
Children in this section should all use the section configuration:
```toml
[environment]
target-version = "3.12"
```
## Child
### Grandchild
```py
reveal_type(sys.version_info[:2] == (3, 12)) # revealed: Literal[True]
```

View file

@ -1,5 +1,10 @@
# `sys.version_info`
```toml
[environment]
target-version = "3.9"
```
## The type of `sys.version_info`
The type of `sys.version_info` is `sys._version_info`, at least according to typeshed's stubs (which

View file

@ -27,6 +27,8 @@ regex = { workspace = true }
rustc-hash = { workspace = true }
salsa = { workspace = true }
smallvec = { workspace = true }
serde = { workspace = true }
toml = { workspace = true }
[dev-dependencies]

View file

@ -225,6 +225,22 @@ A header-demarcated section must either be a test or a grouping header; it canno
a header section can either contain embedded files (making it a test), or it can contain more
deeply-nested headers (headers with more `#`), but it cannot contain both.
## Configuration
The test framework supports a TOML-based configuration format, which is a subset of the full red-knot
configuration format. This configuration can be specified in fenced code blocks with `toml` as the
language tag:
````markdown
```toml
[environment]
target-version = "3.10"
```
````
This configuration will apply to all tests in the same section, and all nested sections within that
section. Nested sections can override configurations from their parent sections.
## Documentation of tests
Arbitrary Markdown syntax (including of course normal prose paragraphs) is permitted (and ignored by
@ -282,30 +298,6 @@ possible in these files.
A fenced code block with no language will always be an error.
### Configuration
We will add the ability to specify non-default red-knot configurations to use in tests, by including
a TOML code block:
````markdown
```toml
[tool.knot]
warn-on-any = true
```
```py
from typing import Any
def f(x: Any): # error: [use-of-any]
pass
```
````
It should be possible to include a TOML code block in a single test (as shown), or in a grouping
section, in which case it applies to all nested tests within that grouping section. Configurations
at multiple level are allowed and merged, with the most-nested (closest to the test) taking
precedence.
### Running just a single test from a suite
Having each test in a suite always run as a distinct Rust test would require writing our own test
@ -317,11 +309,11 @@ variable.
### Configuring search paths and kinds
The red-knot TOML configuration format hasn't been designed yet, and we may want to implement
The red-knot TOML configuration format hasn't been finalized, and we may want to implement
support in the test framework for configuring search paths before it is designed. If so, we can
define some configuration options for now under the `[tool.knot.tests]` namespace. In the future,
perhaps some of these can be replaced by real red-knot configuration options; some or all may also
be kept long-term as test-specific options.
define some configuration options for now under the `[tests]` namespace. In the future, perhaps
some of these can be replaced by real red-knot configuration options; some or all may also be
kept long-term as test-specific options.
Some configuration options we will want to provide:
@ -339,13 +331,13 @@ non-default value using the `workspace-root` config.
### Specifying a custom typeshed
Some tests will need to override the default typeshed with custom files. The `[tool.knot.tests]`
configuration option `typeshed-root` should be usable for this:
Some tests will need to override the default typeshed with custom files. The `[environment]`
configuration option `typeshed-path` can be used to do this:
````markdown
```toml
[tool.knot.tests]
typeshed-root = "/typeshed"
[environment]
typeshed-path = "/typeshed"
```
This file is importable as part of our custom typeshed, because it is within `/typeshed`, which we

View file

@ -0,0 +1,28 @@
//! TOML-deserializable Red Knot configuration, similar to `knot.toml`, to be able to
//! control some configuration options from Markdown files. For now, this supports the
//! following limited structure:
//!
//! ```toml
//! [environment]
//! target-version = "3.10"
//! ```
use anyhow::Context;
use serde::Deserialize;
#[derive(Deserialize)]
pub(crate) struct MarkdownTestConfig {
pub(crate) environment: Environment,
}
impl MarkdownTestConfig {
pub(crate) fn from_str(s: &str) -> anyhow::Result<Self> {
toml::from_str(s).context("Error while parsing Markdown TOML config")
}
}
#[derive(Deserialize)]
pub(crate) struct Environment {
#[serde(rename = "target-version")]
pub(crate) target_version: String,
}

View file

@ -2,14 +2,17 @@ use camino::Utf8Path;
use colored::Colorize;
use parser as test_parser;
use red_knot_python_semantic::types::check_types;
use red_knot_python_semantic::Program;
use ruff_db::diagnostic::{Diagnostic, ParseDiagnostic};
use ruff_db::files::{system_path_to_file, File, Files};
use ruff_db::parsed::parsed_module;
use ruff_db::system::{DbWithTestSystem, SystemPathBuf};
use ruff_source_file::LineIndex;
use ruff_text_size::TextSize;
use salsa::Setter;
mod assertion;
mod config;
mod db;
mod diagnostic;
mod matcher;
@ -26,7 +29,7 @@ pub fn run(path: &Utf8Path, long_title: &str, short_title: &str, test_name: &str
let suite = match test_parser::parse(short_title, &source) {
Ok(suite) => suite,
Err(err) => {
panic!("Error parsing `{path}`: {err}")
panic!("Error parsing `{path}`: {err:?}")
}
};
@ -39,6 +42,10 @@ pub fn run(path: &Utf8Path, long_title: &str, short_title: &str, test_name: &str
continue;
}
Program::get(&db)
.set_target_version(&mut db)
.to(test.target_version());
// Remove all files so that the db is in a "fresh" state.
db.memory_file_system().remove_all();
Files::sync_all(&mut db);

View file

@ -1,6 +1,8 @@
use std::sync::LazyLock;
use anyhow::{bail, Context};
use memchr::memchr2;
use red_knot_python_semantic::PythonVersion;
use regex::{Captures, Match, Regex};
use rustc_hash::{FxHashMap, FxHashSet};
@ -8,6 +10,8 @@ use ruff_index::{newtype_index, IndexVec};
use ruff_python_trivia::Cursor;
use ruff_text_size::{TextLen, TextSize};
use crate::config::MarkdownTestConfig;
/// Parse the Markdown `source` as a test suite with given `title`.
pub(crate) fn parse<'s>(title: &'s str, source: &'s str) -> anyhow::Result<MarkdownTestSuite<'s>> {
let parser = Parser::new(title, source);
@ -69,6 +73,10 @@ impl<'m, 's> MarkdownTest<'m, 's> {
pub(crate) fn files(&self) -> impl Iterator<Item = &'m EmbeddedFile<'s>> {
self.files.iter()
}
pub(crate) fn target_version(&self) -> PythonVersion {
self.section.target_version
}
}
/// Iterator yielding all [`MarkdownTest`]s in a [`MarkdownTestSuite`].
@ -117,6 +125,7 @@ struct Section<'s> {
title: &'s str,
level: u8,
parent_id: Option<SectionId>,
target_version: PythonVersion,
}
#[newtype_index]
@ -174,7 +183,7 @@ impl SectionStack {
popped
}
fn parent(&mut self) -> SectionId {
fn top(&mut self) -> SectionId {
*self
.0
.last()
@ -201,6 +210,9 @@ struct Parser<'s> {
/// Names of embedded files in current active section.
current_section_files: Option<FxHashSet<&'s str>>,
/// Whether or not the current section has a config block.
current_section_has_config: bool,
}
impl<'s> Parser<'s> {
@ -210,6 +222,7 @@ impl<'s> Parser<'s> {
title,
level: 0,
parent_id: None,
target_version: PythonVersion::default(),
});
Self {
sections,
@ -218,6 +231,7 @@ impl<'s> Parser<'s> {
source_len: source.text_len(),
stack: SectionStack::new(root_section_id),
current_section_files: None,
current_section_has_config: false,
}
}
@ -285,12 +299,13 @@ impl<'s> Parser<'s> {
self.pop_sections_to_level(header_level);
let parent = self.stack.parent();
let parent = self.stack.top();
let section = Section {
title,
level: header_level.try_into()?,
parent_id: Some(parent),
target_version: self.sections[parent].target_version,
};
if self.current_section_files.is_some() {
@ -305,13 +320,14 @@ impl<'s> Parser<'s> {
self.stack.push(section_id);
self.current_section_files = None;
self.current_section_has_config = false;
Ok(())
}
fn parse_code_block(&mut self, captures: &Captures<'s>) -> anyhow::Result<()> {
// We never pop the implicit root section.
let parent = self.stack.parent();
let section = self.stack.top();
let mut config: FxHashMap<&'s str, &'s str> = FxHashMap::default();
@ -333,16 +349,24 @@ impl<'s> Parser<'s> {
let path = config.get("path").copied().unwrap_or("test.py");
// CODE_RE can't match without matches for 'lang' and 'code'.
let lang = captures
.name("lang")
.as_ref()
.map(Match::as_str)
.unwrap_or_default();
let code = captures.name("code").unwrap().into();
if lang == "toml" {
return self.parse_config(code);
}
self.files.push(EmbeddedFile {
path,
section: parent,
lang: captures
.name("lang")
.as_ref()
.map(Match::as_str)
.unwrap_or_default(),
// CODE_RE can't match without matches for 'lang' and 'code'.
code: captures.name("code").unwrap().into(),
section,
lang,
code,
md_offset: self.offset(),
});
@ -354,12 +378,12 @@ impl<'s> Parser<'s> {
"Test `{}` has duplicate files named `{path}`. \
(This is the default filename; \
consider giving some files an explicit name with `path=...`.)",
self.sections[parent].title
self.sections[section].title
));
}
return Err(anyhow::anyhow!(
"Test `{}` has duplicate files named `{path}`.",
self.sections[parent].title
self.sections[section].title
));
};
} else {
@ -369,8 +393,36 @@ impl<'s> Parser<'s> {
Ok(())
}
fn parse_config(&mut self, code: &str) -> anyhow::Result<()> {
if self.current_section_has_config {
bail!("Multiple TOML configuration blocks in the same section are not allowed.");
}
let config = MarkdownTestConfig::from_str(code)?;
let target_version = config.environment.target_version;
let parts = target_version
.split('.')
.map(str::parse)
.collect::<Result<Vec<_>, _>>()
.context(format!(
"Invalid 'target-version' component: '{target_version}'"
))?;
if parts.len() != 2 {
bail!("Invalid 'target-version': expected MAJOR.MINOR, got '{target_version}'.",);
}
let current_section = &mut self.sections[self.stack.top()];
current_section.target_version = PythonVersion::from((parts[0], parts[1]));
self.current_section_has_config = true;
Ok(())
}
fn pop_sections_to_level(&mut self, level: usize) {
while level <= self.sections[self.stack.parent()].level.into() {
while level <= self.sections[self.stack.top()].level.into() {
self.stack.pop();
// We would have errored before pushing a child section if there were files, so we know
// no parent section can have files.