uv/crates/uv-build-backend/src/settings.rs
konsti 7316bd01a3
Build backend: Support namespace packages (#13833)
Unlike regular packages, specifying all `__init__.py` directories for a
namespace package would be very verbose There is e.g.
https://github.com/python-poetry/poetry/tree/main/src/poetry, which has
18 modules, or https://github.com/googleapis/api-common-protos which is
inconsistently nested. For both the Google Cloud SDK, there are both
packages with a single module and those with complex structures, with
many having multiple modules due to versioning through `<module>_v1`
versioning. The Azure SDK seems to use one module per package (it's not
explicitly documented but seems to follow from the process in
https://azure.github.io/azure-sdk/python_design.html#azure-sdk-distribution-packages
and
ccb0e03a3d/doc/dev/packaging.md).

For simplicity with complex projects, we add a `namespace = true` switch
which disabled checking for an `__init__.py`. We only check that there's
no `<module_root>/<module_name>/__init__.py` and otherwise add the whole
`<module_root>/<module_name>` folder. This comes at the cost of
`namespace = true` effectively creating an opt-out from our usual checks
that allows creating an almost entirely arbitrary package.

For simple projects with only a single module, the module name can be
dotted to point to the target module, so the build still gets checked:

```toml
[tool.uv.build-backend]
module-name = "poetry.core"
```

## Alternatives

### Declare all packages

We could make `module-name` a list and allow or require declaring all
packages:

```toml
[tool.uv.build-backend]
module-name = ["cloud_sdk.service.storage", "cloud_sdk.service.storage_v1", "cloud_sdk.billing.storage"]
```

Or for Poetry:

```toml
[tool.uv.build-backend]
module-name = [
    "poetry.config",
    "poetry.console",
    "poetry.inspection",
    "poetry.installation",
    "poetry.json",
    "poetry.layouts",
    "poetry.masonry",
    "poetry.mixology",
    "poetry.packages",
    "poetry.plugins",
    "poetry.publishing",
    "poetry.puzzle",
    "poetry.pyproject",
    "poetry.repositories",
    "poetry.toml",
    "poetry.utils",
    "poetry.vcs",
    "poetry.version"
]
```

### Support multiple namespaces

We could also allow namespace packages with multiple root level module:

```toml
[tool.uv.build-backend]
module-name = ["cloud_sdk.my_ext", "local_sdk.my_ext"]
```

For lack of use cases, we delegate this to creating a workspace with one
package per module.

## Implementation

Due to the more complex options for the module name, I'm moving
verification on deserialization later, dropping the source span we'd get
from serde. We also don't show similarly named directories anymore.

---------

Co-authored-by: Andrew Gallant <andrew@astral.sh>
2025-06-12 17:23:58 +00:00

216 lines
8.2 KiB
Rust

use serde::{Deserialize, Serialize};
use std::path::PathBuf;
use uv_macros::OptionsMetadata;
/// Settings for the uv build backend (`uv_build`).
///
/// !!! note
///
/// The uv build backend is currently in preview and may change in any future release.
///
/// Note that those settings only apply when using the `uv_build` backend, other build backends
/// (such as hatchling) have their own configuration.
///
/// All options that accept globs use the portable glob patterns from
/// [PEP 639](https://packaging.python.org/en/latest/specifications/glob-patterns/).
#[derive(Deserialize, Serialize, OptionsMetadata, Debug, Clone, PartialEq, Eq)]
#[serde(default, rename_all = "kebab-case")]
#[cfg_attr(feature = "schemars", derive(schemars::JsonSchema))]
pub struct BuildBackendSettings {
/// The directory that contains the module directory.
///
/// Common values are `src` (src layout, the default) or an empty path (flat layout).
#[option(
default = r#""src""#,
value_type = "str",
example = r#"module-root = """#
)]
pub module_root: PathBuf,
/// The name of the module directory inside `module-root`.
///
/// The default module name is the package name with dots and dashes replaced by underscores.
///
/// Package names need to be valid Python identifiers, and the directory needs to contain a
/// `__init__.py`. An exception are stubs packages, whose name ends with `-stubs`, with the stem
/// being the module name, and which contain a `__init__.pyi` file.
///
/// For namespace packages with a single module, the path can be dotted, e.g., `foo.bar` or
/// `foo-stubs.bar`.
///
/// Note that using this option runs the risk of creating two packages with different names but
/// the same module names. Installing such packages together leads to unspecified behavior,
/// often with corrupted files or directory trees.
#[option(
default = r#"None"#,
value_type = "str",
example = r#"module-name = "sklearn""#
)]
pub module_name: Option<String>,
/// Glob expressions which files and directories to additionally include in the source
/// distribution.
///
/// `pyproject.toml` and the contents of the module directory are always included.
#[option(
default = r#"[]"#,
value_type = "list[str]",
example = r#"source-include = ["tests/**"]"#
)]
pub source_include: Vec<String>,
/// If set to `false`, the default excludes aren't applied.
///
/// Default excludes: `__pycache__`, `*.pyc`, and `*.pyo`.
#[option(
default = r#"true"#,
value_type = "bool",
example = r#"default-excludes = false"#
)]
pub default_excludes: bool,
/// Glob expressions which files and directories to exclude from the source distribution.
#[option(
default = r#"[]"#,
value_type = "list[str]",
example = r#"source-exclude = ["*.bin"]"#
)]
pub source_exclude: Vec<String>,
/// Glob expressions which files and directories to exclude from the wheel.
#[option(
default = r#"[]"#,
value_type = "list[str]",
example = r#"wheel-exclude = ["*.bin"]"#
)]
pub wheel_exclude: Vec<String>,
/// Build a namespace package.
///
/// Build a PEP 420 implicit namespace package, allowing more than one root `__init__.py`.
///
/// Use this option when the namespace package contains multiple root `__init__.py`, for
/// namespace packages with a single root `__init__.py` use a dotted `module-name` instead.
///
/// To compare dotted `module-name` and `namespace = true`, the first example below can be
/// expressed with `module-name = "cloud.database"`: There is one root `__init__.py` `database`.
/// In the second example, we have three roots (`cloud.database`, `cloud.database_pro`,
/// `billing.modules.database_pro`), so `namespace = true` is required.
///
/// ```text
/// src
/// └── cloud
/// └── database
/// ├── __init__.py
/// ├── query_builder
/// │ └── __init__.py
/// └── sql
/// ├── parser.py
/// └── __init__.py
/// ```
///
/// ```text
/// src
/// ├── cloud
/// │ ├── database
/// │ │ ├── __init__.py
/// │ │ ├── query_builder
/// │ │ │ └── __init__.py
/// │ │ └── sql
/// │ │ ├── __init__.py
/// │ │ └── parser.py
/// │ └── database_pro
/// │ ├── __init__.py
/// │ └── query_builder.py
/// └── billing
/// └── modules
/// └── database_pro
/// ├── __init__.py
/// └── sql.py
/// ```
#[option(
default = r#"false"#,
value_type = "bool",
example = r#"namespace = true"#
)]
pub namespace: bool,
/// Data includes for wheels.
///
/// Each entry is a directory, whose contents are copied to the matching directory in the wheel
/// in `<name>-<version>.data/(purelib|platlib|headers|scripts|data)`. Upon installation, this
/// data is moved to its target location, as defined by
/// <https://docs.python.org/3.12/library/sysconfig.html#installation-paths>. Usually, small
/// data files are included by placing them in the Python module instead of using data includes.
///
/// - `scripts`: Installed to the directory for executables, `<venv>/bin` on Unix or
/// `<venv>\Scripts` on Windows. This directory is added to `PATH` when the virtual
/// environment is activated or when using `uv run`, so this data type can be used to install
/// additional binaries. Consider using `project.scripts` instead for Python entrypoints.
/// - `data`: Installed over the virtualenv environment root.
///
/// Warning: This may override existing files!
///
/// - `headers`: Installed to the include directory. Compilers building Python packages
/// with this package as build requirement use the include directory to find additional header
/// files.
/// - `purelib` and `platlib`: Installed to the `site-packages` directory. It is not recommended
/// to uses these two options.
// TODO(konsti): We should show a flat example instead.
// ```toml
// [tool.uv.build-backend.data]
// headers = "include/headers",
// scripts = "bin"
// ```
#[option(
default = r#"{}"#,
value_type = "dict[str, str]",
example = r#"data = { "headers": "include/headers", "scripts": "bin" }"#
)]
pub data: WheelDataIncludes,
}
impl Default for BuildBackendSettings {
fn default() -> Self {
Self {
module_root: PathBuf::from("src"),
module_name: None,
source_include: Vec::new(),
default_excludes: true,
source_exclude: Vec::new(),
wheel_exclude: Vec::new(),
namespace: false,
data: WheelDataIncludes::default(),
}
}
}
/// Data includes for wheels.
///
/// See `BuildBackendSettings::data`.
#[derive(Default, Deserialize, Serialize, OptionsMetadata, Debug, Clone, PartialEq, Eq)]
// `deny_unknown_fields` to catch typos such as `header` vs `headers`.
#[serde(default, rename_all = "kebab-case", deny_unknown_fields)]
#[cfg_attr(feature = "schemars", derive(schemars::JsonSchema))]
pub struct WheelDataIncludes {
purelib: Option<String>,
platlib: Option<String>,
headers: Option<String>,
scripts: Option<String>,
data: Option<String>,
}
impl WheelDataIncludes {
/// Yield all data directories name and corresponding paths.
pub fn iter(&self) -> impl Iterator<Item = (&'static str, &str)> {
[
("purelib", self.purelib.as_deref()),
("platlib", self.platlib.as_deref()),
("headers", self.headers.as_deref()),
("scripts", self.scripts.as_deref()),
("data", self.data.as_deref()),
]
.into_iter()
.filter_map(|(name, value)| Some((name, value?)))
}
}