uv/crates/uv-globfilter
konsti dfcceb6a1d
Build backend: Revamp include/exclude (#9525)
When building the source distribution, we always need to include
`pyproject.toml` and the module, when building the wheel, we always
include the module but nothing else at top level. Since we only allow a
single module per wheel, that means that there are no specific wheel
includes. This means we have source includes, source excludes, wheel
excludes, but no wheel includes: This is defined by the module root,
plus the metadata files and data directories separately.

Extra source dist includes are currently unused (they can't end up in
the wheel currently), but it makes sense to model them here, they will
be needed for any sort of procedural build step.

This results in the following fields being relevant for inclusions and
exclusion:

* `pyproject.toml` (always included in the source dist)
* project.readme: PEP 621
* project.license-files: PEP 639
* module_root: `Path`
* source_include: `Vec<Glob>`
* source_exclude: `Vec<Glob>`
* wheel_exclude: `Vec<Glob>`
* data: `Map<KnownDataName, Path>`

An opinionated choice is that that wheel excludes always contain the
source excludes: Otherwise you could have a path A in the source tree
that gets included when building the wheel directly from the source
tree, but not when going through the source dist as intermediary,
because A is in source excludes, but not in the wheel excludes. This has
been a source of errors previously.

In the process, I fixed a bug where we would skip directories and only
include the files and were missing license due to absolute globs.
2024-12-01 11:32:35 +00:00
..
src Build backend: Revamp include/exclude (#9525) 2024-12-01 11:32:35 +00:00
Cargo.toml Align tempfile workspace dependencies with root project (#9524) 2024-11-29 12:05:10 -05:00
README.md Build backend: Switch to custom glob-walkdir implementation (#9013) 2024-11-14 13:14:58 +00:00

globfilter

Portable directory walking with includes and excludes.

Motivating example: You want to allow the user to select paths within a project.

include = ["src", "License.txt", "resources/icons/*.svg"]
exclude = ["target", "/dist", ".cache", "*.tmp"]

When traversing the directory, you can use GlobDirFilter::from_globs(...)?.match_directory(&relative) skip directories that never match in WalkDirs filter_entry.

Syntax

This crate supports the cross-language, restricted glob syntax from PEP 639:

  • Alphanumeric characters, underscores (_), hyphens (-) and dots (.) are matched verbatim.
  • The special glob characters are:
    • *: Matches any number of characters except path separators
    • ?: Matches a single character except the path separator
    • **: Matches any number of characters including path separators
    • [], containing only the verbatim matched characters: Matches a single of the characters contained. Within [...], the hyphen indicates a locale-agnostic range (e.g. a-z, order based on Unicode code points). Hyphens at the start or end are matched literally.
  • The path separator is the forward slash character (/). Patterns are relative to the given directory, a leading slash character for absolute paths is not supported.
  • Parent directory indicators (..) are not allowed.

These rules mean that matching the backslash (\) is forbidden, which avoid collisions with the windows path separator.