[red-knot] Support custom typeshed Markdown tests (#15683)

## Summary

- Add feature to specify a custom typeshed from within Markdown-based
  tests
- Port "builtins" unit tests from `infer.rs` to Markdown tests, part of
  #13696

## Test Plan

- Tests for the custom typeshed feature
- New Markdown tests for deleted Rust unit tests
David Peter 2025-01-23 12:36:38 +01:00 committed by GitHub
parent 84301a7300
commit 7855f03735
13 changed files with 275 additions and 150 deletions

@ -322,7 +322,7 @@ where
.search_paths
.extra_paths
.iter()
.chain(program_settings.search_paths.typeshed.as_ref())
.chain(program_settings.search_paths.custom_typeshed.as_ref())
{
std::fs::create_dir_all(path.as_std_path())
.with_context(|| format!("Failed to create search path `{path}`"))?;

@ -92,7 +92,7 @@ impl Options {
.map(|path| path.absolute(project_root, system))
.collect(),
src_roots,
typeshed: typeshed.map(|path| path.absolute(project_root, system)),
custom_typeshed: typeshed.map(|path| path.absolute(project_root, system)),
site_packages: python
.map(|venv_path| SitePackages::Derived {
venv_path: venv_path.absolute(project_root, system),

@ -1,8 +1,70 @@
# Importing builtin module
# Builtins
## Importing builtin module
Builtin symbols can be explicitly imported:
```py
import builtins
x = builtins.chr
reveal_type(x) # revealed: Literal[chr]
reveal_type(builtins.chr) # revealed: Literal[chr]
```
## Implicit use of builtin
Or used implicitly:
```py
reveal_type(chr) # revealed: Literal[chr]
reveal_type(str) # revealed: Literal[str]
```
## Builtin symbol from custom typeshed
If we specify a custom typeshed, we can use the builtin symbol from it, and no longer access the
builtins from the "actual" vendored typeshed:
```toml
[environment]
typeshed = "/typeshed"
```
```pyi path=/typeshed/stdlib/builtins.pyi
class Custom: ...
custom_builtin: Custom
```
```pyi path=/typeshed/stdlib/typing_extensions.pyi
def reveal_type(obj, /): ...
```
```py
reveal_type(custom_builtin) # revealed: Custom
# error: [unresolved-reference]
reveal_type(str) # revealed: Unknown
```
## Unknown builtin (later defined)
`foo` has a type of `Unknown` in this example, as it relies on `bar` which has not been defined at
that point:
```toml
[environment]
typeshed = "/typeshed"
```
```pyi path=/typeshed/stdlib/builtins.pyi
foo = bar
bar = 1
```
```pyi path=/typeshed/stdlib/typing_extensions.pyi
def reveal_type(obj, /): ...
```
```py
reveal_type(foo) # revealed: Unknown
```

@ -0,0 +1,95 @@
# Custom typeshed
The `environment.typeshed` configuration option can be used to specify a custom typeshed directory
for Markdown-based tests. Custom typeshed stubs can then be placed in the specified directory using
fenced code blocks with language `pyi`, and will be used instead of the vendored copy of typeshed.
A fenced code block with language `text` can be used to provide a `stdlib/VERSIONS` file in the
custom typeshed root. If no such file is created explicitly, it will be automatically created with
entries enabling all specified `<typeshed-root>/stdlib` files for all supported Python versions.
## Basic example (auto-generated `VERSIONS` file)
First, we specify `/typeshed` as the custom typeshed directory:
```toml
[environment]
typeshed = "/typeshed"
```
We can then place custom stub files in `/typeshed/stdlib`, for example:
```pyi path=/typeshed/stdlib/builtins.pyi
class BuiltinClass: ...
builtin_symbol: BuiltinClass
```
```pyi path=/typeshed/stdlib/sys/__init__.pyi
version = "my custom Python"
```
And finally write a normal Python code block that makes use of the custom stubs:
```py
b: BuiltinClass = builtin_symbol
class OtherClass: ...
o: OtherClass = builtin_symbol # error: [invalid-assignment]
# Make sure that 'sys' has a proper entry in the auto-generated 'VERSIONS' file
import sys
```
## Custom `VERSIONS` file
If we want to specify a custom `VERSIONS` file, we can do so by creating a fenced code block with
language `text`. In the following test, we set the Python version to `3.10` and then make sure that
we can *not* import `new_module` with a version requirement of `3.11-`:
```toml
[environment]
python-version = "3.10"
typeshed = "/typeshed"
```
```pyi path=/typeshed/stdlib/old_module.pyi
class OldClass: ...
```
```pyi path=/typeshed/stdlib/new_module.pyi
class NewClass: ...
```
```text path=/typeshed/stdlib/VERSIONS
old_module: 3.0-
new_module: 3.11-
```
```py
from old_module import OldClass
# error: [unresolved-import] "Cannot resolve import `new_module`"
from new_module import NewClass
```
## Using `reveal_type` with a custom typeshed
When providing a custom typeshed directory, basic things like `reveal_type` will stop working
because we rely on being able to import it from `typing_extensions`. The actual definition of
`reveal_type` in typeshed is slightly involved (depends on generics, `TypeVar`, etc.), but a very
simple untyped definition is enough to make `reveal_type` work in tests:
```toml
[environment]
typeshed = "/typeshed"
```
```pyi path=/typeshed/stdlib/typing_extensions.pyi
def reveal_type(obj, /): ...
```
```py
reveal_type(()) # revealed: tuple[()]
```
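The version gating exercised by the custom `VERSIONS` test above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not red-knot's actual resolver: `parse_version`, `parse_entry` and `is_available` are hypothetical helpers, and the sketch ignores the rule that submodules inherit their parent package's entry.

```rust
/// A `(major, minor)` Python version, e.g. `(3, 10)`.
type Version = (u8, u8);

/// Parses a string such as "3.11" into `(3, 11)`.
fn parse_version(s: &str) -> Option<Version> {
    let (major, minor) = s.trim().split_once('.')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}

/// Parses one `VERSIONS` line such as `new_module: 3.11-` or `legacy: 3.0-3.9`.
/// Returns the module name, the first supported version, and the last
/// supported version if the range is closed.
fn parse_entry(line: &str) -> Option<(&str, Version, Option<Version>)> {
    // Drop a trailing `# comment`, if any.
    let line = line.split('#').next()?.trim();
    let (module, range) = line.split_once(':')?;
    let (first, last) = range.trim().split_once('-')?;
    let last = if last.trim().is_empty() {
        None // open-ended range such as `3.11-`
    } else {
        Some(parse_version(last)?)
    };
    Some((module.trim(), parse_version(first)?, last))
}

/// Returns `true` if `module` has an entry whose range contains `target`.
fn is_available(versions: &str, module: &str, target: Version) -> bool {
    versions
        .lines()
        .filter_map(parse_entry)
        .any(|(name, first, last)| {
            name == module && target >= first && last.map_or(true, |last| target <= last)
        })
}
```

With the `VERSIONS` contents from the test and a target of `(3, 10)`, `old_module` is reported as available and `new_module` is not, which matches the `unresolved-import` diagnostic the test expects.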

@ -136,7 +136,7 @@ pub(crate) mod tests {
/// Target Python platform
python_platform: PythonPlatform,
/// Path to a custom typeshed directory
custom_typeshed: Option<SystemPathBuf>,
typeshed: Option<SystemPathBuf>,
/// Path and content pairs for files that should be present
files: Vec<(&'a str, &'a str)>,
}
@ -146,7 +146,7 @@ pub(crate) mod tests {
Self {
python_version: PythonVersion::default(),
python_platform: PythonPlatform::default(),
custom_typeshed: None,
typeshed: None,
files: vec![],
}
}
@ -156,11 +156,6 @@ pub(crate) mod tests {
self
}
pub(crate) fn with_custom_typeshed(mut self, path: &str) -> Self {
self.custom_typeshed = Some(SystemPathBuf::from(path));
self
}
pub(crate) fn with_file(mut self, path: &'a str, content: &'a str) -> Self {
self.files.push((path, content));
self
@ -176,7 +171,7 @@ pub(crate) mod tests {
.context("Failed to write test files")?;
let mut search_paths = SearchPathSettings::new(vec![src_root]);
search_paths.typeshed = self.custom_typeshed;
search_paths.custom_typeshed = self.typeshed;
Program::from_settings(
&db,

@ -169,7 +169,7 @@ impl SearchPaths {
let SearchPathSettings {
extra_paths,
src_roots,
typeshed,
custom_typeshed: typeshed,
site_packages: site_packages_paths,
} = settings;
@ -1308,7 +1308,7 @@ mod tests {
search_paths: SearchPathSettings {
extra_paths: vec![],
src_roots: vec![src.clone()],
typeshed: Some(custom_typeshed),
custom_typeshed: Some(custom_typeshed),
site_packages: SitePackages::Known(vec![site_packages]),
},
},
@ -1814,7 +1814,7 @@ not_a_directory
search_paths: SearchPathSettings {
extra_paths: vec![],
src_roots: vec![SystemPathBuf::from("/src")],
typeshed: None,
custom_typeshed: None,
site_packages: SitePackages::Known(vec![
venv_site_packages,
system_site_packages,

@ -73,7 +73,7 @@ pub(crate) struct UnspecifiedTypeshed;
///
/// For tests checking that standard-library module resolution is working
/// correctly, you should usually create a [`MockedTypeshed`] instance
/// and pass it to the [`TestCaseBuilder::with_custom_typeshed`] method.
/// and pass it to the [`TestCaseBuilder::with_mocked_typeshed`] method.
/// If you need to check something that involves the vendored typeshed stubs
/// we include as part of the binary, you can instead use the
/// [`TestCaseBuilder::with_vendored_typeshed`] method.
@ -238,7 +238,7 @@ impl TestCaseBuilder<MockedTypeshed> {
search_paths: SearchPathSettings {
extra_paths: vec![],
src_roots: vec![src.clone()],
typeshed: Some(typeshed.clone()),
custom_typeshed: Some(typeshed.clone()),
site_packages: SitePackages::Known(vec![site_packages.clone()]),
},
},

@ -108,7 +108,7 @@ pub struct SearchPathSettings {
/// Optional path to a "custom typeshed" directory on disk for us to use for standard-library types.
/// If this is not provided, we will fallback to our vendored typeshed stubs for the stdlib,
/// bundled as a zip file in the binary
pub typeshed: Option<SystemPathBuf>,
pub custom_typeshed: Option<SystemPathBuf>,
/// The path to the user's `site-packages` directory, where third-party packages from ``PyPI`` are installed.
pub site_packages: SitePackages,
@ -119,7 +119,7 @@ impl SearchPathSettings {
Self {
src_roots,
extra_paths: vec![],
typeshed: None,
custom_typeshed: None,
site_packages: SitePackages::Known(vec![]),
}
}

@ -6003,14 +6003,13 @@ fn perform_membership_test_comparison<'db>(
#[cfg(test)]
mod tests {
use crate::db::tests::{setup_db, TestDb, TestDbBuilder};
use crate::db::tests::{setup_db, TestDb};
use crate::semantic_index::definition::Definition;
use crate::semantic_index::symbol::FileScopeId;
use crate::semantic_index::{global_scope, semantic_index, symbol_table, use_def_map};
use crate::types::check_types;
use crate::{HasType, SemanticModel};
use ruff_db::files::{system_path_to_file, File};
use ruff_db::parsed::parsed_module;
use ruff_db::system::DbWithTestSystem;
use ruff_db::testing::assert_function_query_was_not_run;
@ -6281,56 +6280,6 @@ mod tests {
Ok(())
}
#[test]
fn builtin_symbol_vendored_stdlib() -> anyhow::Result<()> {
let mut db = setup_db();
db.write_file("/src/a.py", "c = chr")?;
assert_public_type(&db, "/src/a.py", "c", "Literal[chr]");
Ok(())
}
#[test]
fn builtin_symbol_custom_stdlib() -> anyhow::Result<()> {
let db = TestDbBuilder::new()
.with_custom_typeshed("/typeshed")
.with_file("/src/a.py", "c = copyright")
.with_file(
"/typeshed/stdlib/builtins.pyi",
"def copyright() -> None: ...",
)
.with_file("/typeshed/stdlib/VERSIONS", "builtins: 3.8-")
.build()?;
assert_public_type(&db, "/src/a.py", "c", "Literal[copyright]");
Ok(())
}
#[test]
fn unknown_builtin_later_defined() -> anyhow::Result<()> {
let db = TestDbBuilder::new()
.with_custom_typeshed("/typeshed")
.with_file("/src/a.py", "x = foo")
.with_file("/typeshed/stdlib/builtins.pyi", "foo = bar; bar = 1")
.with_file("/typeshed/stdlib/VERSIONS", "builtins: 3.8-")
.build()?;
assert_public_type(&db, "/src/a.py", "x", "Unknown");
Ok(())
}
#[test]
fn str_builtin() -> anyhow::Result<()> {
let mut db = setup_db();
db.write_file("/src/a.py", "x = str")?;
assert_public_type(&db, "/src/a.py", "x", "Literal[str]");
Ok(())
}
#[test]
fn deferred_annotation_builtin() -> anyhow::Result<()> {
let mut db = setup_db();

@ -8,8 +8,8 @@ under a certain directory as test suites.
A Markdown test suite can contain any number of tests. A test consists of one or more embedded
"files", each defined by a triple-backticks fenced code block. The code block must have a tag string
specifying its language; currently only `py` (Python files) and `pyi` (type stub files) are
supported.
specifying its language. We currently support `py` (Python files) and `pyi` (type stub files), as
well as [typeshed `VERSIONS`] files and `toml` for configuration.
The simplest possible test suite consists of just a single test, with a single embedded file:
@ -243,6 +243,20 @@ section. Nested sections can override configurations from their parent sections.
See [`MarkdownTestConfig`](https://github.com/astral-sh/ruff/blob/main/crates/red_knot_test/src/config.rs) for the full list of supported configuration options.
### Specifying a custom typeshed
Some tests will need to override the default typeshed with custom files. The `[environment]`
configuration option `typeshed` can be used to do this:
````markdown
```toml
[environment]
typeshed = "/typeshed"
```
````
For more details, take a look at the [custom-typeshed Markdown test].
## Documentation of tests
Arbitrary Markdown syntax (including of course normal prose paragraphs) is permitted (and ignored by
@ -294,36 +308,6 @@ The column assertion `6` on the ending line should be optional.
In cases of overlapping such assertions, resolve ambiguity using more angle brackets: `<<<<` begins
an assertion ended by `>>>>`, etc.
### Non-Python files
Some tests may need to specify non-Python embedded files: typeshed `stdlib/VERSIONS`, `pth` files,
`py.typed` files, `pyvenv.cfg` files...
We will allow specifying any of these using the `text` language in the code block tag string:
````markdown
```text path=/third-party/foo/py.typed
partial
```
````
We may want to also support testing Jupyter notebooks as embedded files; exact syntax for this is
yet to be determined.
Of course, red-knot is only run directly on `py` and `pyi` files, and assertion comments are only
possible in these files.
A fenced code block with no language will always be an error.
### Running just a single test from a suite
Having each test in a suite always run as a distinct Rust test would require writing our own test
runner or code-generating tests in a build script; neither of these is planned.
We could still allow running just a single test from a suite, for debugging purposes, either via
some "focus" syntax that could be easily temporarily added to a test, or via an environment
variable.
### Configuring search paths and kinds
The red-knot TOML configuration format hasn't been finalized, and we may want to implement
@ -346,38 +330,6 @@ Paths for `workspace-root` and `third-party-root` must be absolute.
Relative embedded-file paths are relative to the workspace root, even if it is explicitly set to a
non-default value using the `workspace-root` config.
### Specifying a custom typeshed
Some tests will need to override the default typeshed with custom files. The `[environment]`
configuration option `typeshed-path` can be used to do this:
````markdown
```toml
[environment]
typeshed-path = "/typeshed"
```
This file is importable as part of our custom typeshed, because it is within `/typeshed`, which we
configured above as our custom typeshed root:
```py path=/typeshed/stdlib/builtins.pyi
I_AM_THE_ONLY_BUILTIN = 1
```
This file is written to `/src/test.py`, because the default workspace root is `/src/ and the default
file path is `test.py`:
```py
reveal_type(I_AM_THE_ONLY_BUILTIN) # revealed: Literal[1]
```
````
A fenced code block with language `text` can be used to provide a `stdlib/VERSIONS` file in the
custom typeshed root. If no such file is created explicitly, one should be created implicitly
including entries enabling all specified `<typeshed-root>/stdlib` files for all supported Python
versions.
### I/O errors
We could use an `error=` configuration option in the tag string to make an embedded file cause an
@ -480,3 +432,6 @@ cold, to validate equivalence of cold and incremental check results.
[^extensions]: `typing-extensions` is a third-party module, but typeshed, and thus type checkers
also, treat it as part of the standard library.
[custom-typeshed markdown test]: ../red_knot_python_semantic/resources/mdtest/mdtest_custom_typeshed.md
[typeshed `versions`]: https://github.com/python/typeshed/blob/c546278aae47de0b2b664973da4edb613400f6ce/stdlib/VERSIONS#L1-L18

@ -34,6 +34,12 @@ impl MarkdownTestConfig {
.as_ref()
.and_then(|env| env.python_platform.clone())
}
pub(crate) fn typeshed(&self) -> Option<&str> {
self.environment
.as_ref()
.and_then(|env| env.typeshed.as_deref())
}
}
#[derive(Deserialize, Debug, Default, Clone)]
@ -44,6 +50,9 @@ pub(crate) struct Environment {
/// Target platform to assume when resolving types.
pub(crate) python_platform: Option<PythonPlatform>,
/// Path to a custom typeshed directory.
pub(crate) typeshed: Option<String>,
}
#[derive(Deserialize, Debug, Clone)]
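The new `typeshed()` accessor follows the same `Option` chaining pattern as the other configuration getters. The sketch below reproduces it in isolation. It is a simplified stand-in: the structs here carry only the one field and omit the `serde` `Deserialize` derive used by the real types.

```rust
/// Stand-in for the `[environment]` table; only the field relevant here.
#[derive(Debug, Default, Clone)]
struct Environment {
    /// Path to a custom typeshed directory.
    typeshed: Option<String>,
}

/// Stand-in for the per-section test configuration.
#[derive(Debug, Default, Clone)]
struct MarkdownTestConfig {
    environment: Option<Environment>,
}

impl MarkdownTestConfig {
    /// Returns the configured typeshed path, or `None` if either the
    /// `[environment]` table or its `typeshed` key is absent.
    fn typeshed(&self) -> Option<&str> {
        self.environment
            .as_ref()
            .and_then(|env| env.typeshed.as_deref())
    }
}
```

`as_deref` turns `&Option<String>` into `Option<&str>`, so callers can borrow the path without cloning it.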

@ -3,7 +3,7 @@ use camino::Utf8Path;
use colored::Colorize;
use parser as test_parser;
use red_knot_python_semantic::types::check_types;
use red_knot_python_semantic::Program;
use red_knot_python_semantic::{Program, ProgramSettings, SearchPathSettings, SitePackages};
use ruff_db::diagnostic::{Diagnostic, ParseDiagnostic};
use ruff_db::files::{system_path_to_file, File, Files};
use ruff_db::panic::catch_unwind;
@ -12,7 +12,7 @@ use ruff_db::system::{DbWithTestSystem, SystemPathBuf};
use ruff_db::testing::{setup_logging, setup_logging_with_filter};
use ruff_source_file::{LineIndex, OneIndexed};
use ruff_text_size::TextSize;
use salsa::Setter;
use std::fmt::Write;
mod assertion;
mod config;
@ -50,13 +50,6 @@ pub fn run(path: &Utf8Path, long_title: &str, short_title: &str, test_name: &str
Log::Filter(filter) => setup_logging_with_filter(filter),
});
Program::get(&db)
.set_python_version(&mut db)
.to(test.configuration().python_version().unwrap_or_default());
Program::get(&db)
.set_python_platform(&mut db)
.to(test.configuration().python_platform().unwrap_or_default());
// Remove all files so that the db is in a "fresh" state.
db.memory_file_system().remove_all();
Files::sync_all(&mut db);
@ -98,6 +91,10 @@ pub fn run(path: &Utf8Path, long_title: &str, short_title: &str, test_name: &str
fn run_test(db: &mut db::Db, test: &parser::MarkdownTest) -> Result<(), Failures> {
let project_root = db.project_root().to_path_buf();
let src_path = SystemPathBuf::from("/src");
let custom_typeshed_path = test.configuration().typeshed().map(SystemPathBuf::from);
let mut typeshed_files = vec![];
let mut has_custom_versions_file = false;
let test_files: Vec<_> = test
.files()
@ -107,11 +104,33 @@ fn run_test(db: &mut db::Db, test: &parser::MarkdownTest) -> Result<(), Failures
}
assert!(
matches!(embedded.lang, "py" | "pyi"),
"Non-Python files not supported yet."
matches!(embedded.lang, "py" | "pyi" | "text"),
"Supported file types are: py, pyi, text"
);
let full_path = project_root.join(embedded.path);
let full_path = if embedded.path.starts_with('/') {
SystemPathBuf::from(embedded.path)
} else {
project_root.join(embedded.path)
};
if let Some(ref typeshed_path) = custom_typeshed_path {
if let Ok(relative_path) = full_path.strip_prefix(typeshed_path.join("stdlib")) {
if relative_path.as_str() == "VERSIONS" {
has_custom_versions_file = true;
} else if relative_path.extension().is_some_and(|ext| ext == "pyi") {
typeshed_files.push(relative_path.to_path_buf());
}
}
}
db.write_file(&full_path, embedded.code).unwrap();
if !full_path.starts_with(&src_path) || embedded.lang == "text" {
// These files need to be written to the file system (above), but we don't run any checks on them.
return None;
}
let file = system_path_to_file(db, full_path).unwrap();
Some(TestFile {
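The hunk above does two things with each embedded file: it resolves the path (absolute paths are used as-is, relative ones are joined to the project root) and, when a custom typeshed is configured, sorts files under `<typeshed>/stdlib` into the `VERSIONS` file and stub files. A rough standalone sketch of that logic follows. It uses `std::path` in place of the `SystemPathBuf` type from the diff, and `TypeshedEntry`, `resolve` and `classify` are hypothetical names introduced for illustration.

```rust
use std::path::{Path, PathBuf};

/// How an embedded file relates to the custom typeshed's `stdlib` directory.
#[derive(Debug, PartialEq)]
enum TypeshedEntry {
    /// The user supplied `stdlib/VERSIONS` explicitly.
    Versions,
    /// A stub file, stored relative to `stdlib` (e.g. `sys/__init__.pyi`).
    Stub(PathBuf),
    /// Anything else, inside or outside the typeshed.
    Other,
}

/// Absolute embedded paths are kept; relative ones live under the project root.
fn resolve(project_root: &Path, embedded: &str) -> PathBuf {
    if embedded.starts_with('/') {
        PathBuf::from(embedded)
    } else {
        project_root.join(embedded)
    }
}

/// Classifies a resolved path against `<typeshed_root>/stdlib`.
fn classify(typeshed_root: &Path, full_path: &Path) -> TypeshedEntry {
    match full_path.strip_prefix(typeshed_root.join("stdlib")) {
        Ok(rel) if rel == Path::new("VERSIONS") => TypeshedEntry::Versions,
        Ok(rel) if rel.extension().is_some_and(|ext| ext == "pyi") => {
            TypeshedEntry::Stub(rel.to_path_buf())
        }
        _ => TypeshedEntry::Other,
    }
}
```

In this model, only files classified as `Stub` would contribute entries to an auto-generated `VERSIONS` file, and a `Versions` result would suppress the auto-generation entirely.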
@ -121,6 +140,42 @@ fn run_test(db: &mut db::Db, test: &parser::MarkdownTest) -> Result<(), Failures
})
.collect();
// Create a custom typeshed `VERSIONS` file if none was provided.
if let Some(ref typeshed_path) = custom_typeshed_path {
if !has_custom_versions_file {
let versions_file = typeshed_path.join("stdlib/VERSIONS");
let contents = typeshed_files
.iter()
.fold(String::new(), |mut content, path| {
// This is intentionally kept simple:
let module_name = path
.as_str()
.trim_end_matches(".pyi")
.trim_end_matches("/__init__")
.replace('/', ".");
let _ = writeln!(content, "{module_name}: 3.8-");
content
});
db.write_file(&versions_file, contents).unwrap();
}
}
Program::get(db)
.update_from_settings(
db,
ProgramSettings {
python_version: test.configuration().python_version().unwrap_or_default(),
python_platform: test.configuration().python_platform().unwrap_or_default(),
search_paths: SearchPathSettings {
src_roots: vec![src_path],
extra_paths: vec![],
custom_typeshed: custom_typeshed_path,
site_packages: SitePackages::Known(vec![]),
},
},
)
.expect("Failed to update Program settings in TestDb");
let failures: Failures = test_files
.into_iter()
.filter_map(|test_file| {
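The `VERSIONS` auto-generation added in this file can be tried out on its own. The sketch below lifts the string manipulation from the hunk above into two free functions that operate on plain `&str` paths. `module_name` and `generate_versions` are names chosen for this illustration and do not appear in the commit.

```rust
use std::fmt::Write;

/// Derives a dotted module name from a stub path relative to `stdlib/`,
/// using the same simple suffix trimming as the diff.
fn module_name(relative_stub_path: &str) -> String {
    relative_stub_path
        .trim_end_matches(".pyi")
        .trim_end_matches("/__init__")
        .replace('/', ".")
}

/// Builds `VERSIONS` contents that enable every given stub from 3.8 onwards.
fn generate_versions(stub_paths: &[&str]) -> String {
    stub_paths.iter().fold(String::new(), |mut content, path| {
        // Writing to a `String` cannot fail, so the result is ignored.
        let _ = writeln!(content, "{}: 3.8-", module_name(path));
        content
    })
}
```

For the stubs in the basic example (`builtins.pyi` and `sys/__init__.pyi`), this yields one line per module, `builtins: 3.8-` and `sys: 3.8-`. Note that `trim_end_matches` strips a suffix repeatedly, so an unusual name like `a.pyi.pyi` would collapse to `a`. The original comment calls the approach "intentionally kept simple".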

@ -133,12 +133,17 @@ struct EmbeddedFileId;
/// A single file embedded in a [`Section`] as a fenced code block.
///
/// Currently must be a Python file (`py` language) or type stub (`pyi`). In the future we plan
/// support other kinds of files as well (TOML configuration, typeshed VERSIONS, `pth` files...).
/// Currently must be a Python file (`py` language), a type stub (`pyi`) or a [typeshed `VERSIONS`]
/// file.
///
/// TOML configuration blocks are also supported, but are not stored as `EmbeddedFile`s. In the
/// future we plan to support `pth` files as well.
///
/// A Python embedded file makes its containing [`Section`] into a [`MarkdownTest`], and will be
/// type-checked and searched for inline-comment assertions to match against the diagnostics from
/// type checking.
///
/// [typeshed `VERSIONS`]: https://github.com/python/typeshed/blob/c546278aae47de0b2b664973da4edb613400f6ce/stdlib/VERSIONS#L1-L18
#[derive(Debug)]
pub(crate) struct EmbeddedFile<'s> {
section: SectionId,