ruff/crates/red_knot_python_semantic/src/module_resolver/module.rs
David Peter 03adae80dc
[red-knot] Initial support for dataclasses (#17353)
## Summary

Add very early support for dataclasses. This is mostly to make sure that
we do not emit false positives on dataclass construction, but it also
lies some foundations for future extensions.

This seems like a good initial step to merge to me, as it basically
removes all false positives on dataclass constructor calls. This allows
us to use the ecosystem checks for making sure we don't introduce new
false positives as we continue to work on dataclasses.

## Ecosystem analysis

I re-ran the mypy_primer evaluation of [the `__init__`
PR](https://github.com/astral-sh/ruff/pull/16512) locally with our
current mypy_primer version and project selection. It introduced 1597
new diagnostics. Filtering those by searching for `__init__` and
rejecting those that contain `invalid-argument-type` (those could not
possibly be solved by this PR) leaves 1281 diagnostics. The current
version of this PR removes 1171 diagnostics, which leaves 110
unaccounted for. I extracted the lint + file path for all of these
diagnostics and generated a diff (of diffs), to see which
`__init__`-diagnostics remain. I looked at a subset of these: There are
a lot of `SomeClass(*args)` calls where we don't understand the
unpacking yet (this is not even related to `__init__`). Some others are
related to `NamedTuple`, which we also don't support yet. And then there
are some errors related to `@attrs.define`-decorated classes, which
would probably require support for `dataclass_transform`, which I made
no attempt to include in this PR.

## Test Plan

New Markdown tests.
2025-04-15 10:39:21 +02:00

205 lines
5.2 KiB
Rust

use std::fmt::Formatter;
use std::str::FromStr;
use std::sync::Arc;
use ruff_db::files::File;
use super::path::SearchPath;
use crate::module_name::ModuleName;
/// Representation of a Python module.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct Module {
inner: Arc<ModuleInner>,
}
impl Module {
pub(crate) fn new(
name: ModuleName,
kind: ModuleKind,
search_path: SearchPath,
file: File,
) -> Self {
let known = KnownModule::try_from_search_path_and_name(&search_path, &name);
Self {
inner: Arc::new(ModuleInner {
name,
kind,
search_path,
file,
known,
}),
}
}
/// The absolute name of the module (e.g. `foo.bar`)
pub fn name(&self) -> &ModuleName {
&self.inner.name
}
/// The file to the source code that defines this module
pub fn file(&self) -> File {
self.inner.file
}
/// Is this a module that we special-case somehow? If so, which one?
pub fn known(&self) -> Option<KnownModule> {
self.inner.known
}
/// Does this module represent the given known module?
pub fn is_known(&self, known_module: KnownModule) -> bool {
self.known() == Some(known_module)
}
/// The search path from which the module was resolved.
pub(crate) fn search_path(&self) -> &SearchPath {
&self.inner.search_path
}
/// Determine whether this module is a single-file module or a package
pub fn kind(&self) -> ModuleKind {
self.inner.kind
}
}
impl std::fmt::Debug for Module {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.debug_struct("Module")
.field("name", &self.name())
.field("kind", &self.kind())
.field("file", &self.file())
.field("search_path", &self.search_path())
.finish()
}
}
#[derive(PartialEq, Eq, Hash)]
struct ModuleInner {
name: ModuleName,
kind: ModuleKind,
search_path: SearchPath,
file: File,
known: Option<KnownModule>,
}
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub enum ModuleKind {
/// A single-file module (e.g. `foo.py` or `foo.pyi`)
Module,
/// A python package (`foo/__init__.py` or `foo/__init__.pyi`)
Package,
}
impl ModuleKind {
pub const fn is_package(self) -> bool {
matches!(self, ModuleKind::Package)
}
pub const fn is_module(self) -> bool {
matches!(self, ModuleKind::Module)
}
}
/// Enumeration of various core stdlib modules in which important types are located
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, strum_macros::EnumString)]
#[cfg_attr(test, derive(strum_macros::EnumIter))]
#[strum(serialize_all = "snake_case")]
pub enum KnownModule {
Builtins,
Enum,
Types,
#[strum(serialize = "_typeshed")]
Typeshed,
TypingExtensions,
Typing,
Sys,
#[allow(dead_code)]
Abc, // currently only used in tests
Dataclasses,
Collections,
Inspect,
KnotExtensions,
}
impl KnownModule {
pub const fn as_str(self) -> &'static str {
match self {
Self::Builtins => "builtins",
Self::Enum => "enum",
Self::Types => "types",
Self::Typing => "typing",
Self::Typeshed => "_typeshed",
Self::TypingExtensions => "typing_extensions",
Self::Sys => "sys",
Self::Abc => "abc",
Self::Dataclasses => "dataclasses",
Self::Collections => "collections",
Self::Inspect => "inspect",
Self::KnotExtensions => "knot_extensions",
}
}
pub fn name(self) -> ModuleName {
ModuleName::new_static(self.as_str())
.unwrap_or_else(|| panic!("{self} should be a valid module name!"))
}
pub(crate) fn try_from_search_path_and_name(
search_path: &SearchPath,
name: &ModuleName,
) -> Option<Self> {
if search_path.is_standard_library() {
Self::from_str(name.as_str()).ok()
} else {
None
}
}
pub const fn is_builtins(self) -> bool {
matches!(self, Self::Builtins)
}
pub const fn is_typing(self) -> bool {
matches!(self, Self::Typing)
}
pub const fn is_knot_extensions(self) -> bool {
matches!(self, Self::KnotExtensions)
}
pub const fn is_inspect(self) -> bool {
matches!(self, Self::Inspect)
}
pub const fn is_enum(self) -> bool {
matches!(self, Self::Enum)
}
}
impl std::fmt::Display for KnownModule {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.write_str(self.as_str())
}
}
#[cfg(test)]
mod tests {
use super::*;
use strum::IntoEnumIterator;
#[test]
fn known_module_roundtrip_from_str() {
let stdlib_search_path = SearchPath::vendored_stdlib();
for module in KnownModule::iter() {
let module_name = module.name();
assert_eq!(
KnownModule::try_from_search_path_and_name(&stdlib_search_path, &module_name),
Some(module),
"The strum `EnumString` implementation appears to be incorrect for `{module_name}`"
);
}
}
}