closes#7365
Summary
This pull request adds support for additional file extension aliases in
the SourceDistExtension and ExtensionError enums. The newly supported
file extensions include .tbz, .tgz, .txz, .tar.lz, .tar.lzma. These
changes align the extensions supported by the SourceDistExtension with
those used in Python packaging tools, enhancing compatibility with a
broader range of source distribution formats.
Test Plan
should be added or updated to verify that the new extensions are
correctly recognized as valid source distributions and that errors are
correctly raised when unsupported extensions are provided.
## Summary
Use a dedicated source type for non-package requirements. Also enables
us to support non-package `path` dependencies _and_ removes the need to
have the member `pyproject.toml` files available when we sync _and_
makes it explicit which dependencies are virtual vs. not (as evidenced
by the snapshot changes). All good things!
For users who were using absolute paths in the `pyproject.toml`
previously, this is a behavior change: We now convert all absolute paths
in `path` entries to relative paths. Since i assume that no-one relies
on absolute path in their lockfiles - they are intended to be portable -
I'm tagging this as a bugfix.
Closes https://github.com/astral-sh/uv/pull/6438
Fixes https://github.com/astral-sh/uv/issues/6371
## Summary
Normalize all `python_version` markers to their equivalent
`python_full_version` form. This avoids false positives in forking
because we currently cannot detect any relationships between the two
forms. It also avoids subtle bugs due to the truncating semantics of
`python_version`. For example, given `requires-python = ">3.12"`, we
currently simplify the marker `python_version <= 3.12` to `false`.
However, the version `3.12.1` will be truncated to `3.12` for
`python_version` comparisons, and thus it satisfies the python
requirement and evaluates to `true`.
It is possible to simplify back to `python_version` when writing markers
to the lockfile. However, the equivalent `python_full_version` markers
are often clearer and easier to simplify, so I lean towards leaving them
as `python_full_version`.
There are *a lot* of snapshot updates from this change. I'd like more
eyes on the transformation logic in `python_version_to_full_version` to
ensure that they are all correct.
Resolves https://github.com/astral-sh/uv/issues/6125.
## Summary
This PR rewrites the `MarkerTree` type to use algebraic decision
diagrams (ADD). This has many benefits:
- The diagram is canonical for a given marker function. It is impossible
to create two functionally equivalent marker trees that don't refer to
the same underlying ADD. This also means that any trivially true or
unsatisfiable markers are represented by the same constants.
- The diagram can handle complex operations (conjunction/disjunction) in
polynomial time, as well as constant-time negation.
- The diagram can be converted to a simplified DNF form for user-facing
output.
The new representation gives us a lot more confidence in our marker
operations and simplification, which is proving to be very important
(see https://github.com/astral-sh/uv/pull/5733 and
https://github.com/astral-sh/uv/pull/5163).
Unfortunately, it is not easy to split this PR into multiple commits
because it is a large rewrite of the `marker` module. I'd suggest
reading through the `marker/algebra.rs`, `marker/simplify.rs`, and
`marker/tree.rs` files for the new implementation, as well as the
updated snapshots to verify how the new simplification rules work in
practice. However, a few other things were changed:
- [We now use release-only comparisons for `python_full_version`, where
we previously only did for
`python_version`](https://github.com/astral-sh/uv/blob/ibraheem/canonical-markers/crates/pep508-rs/src/marker/algebra.rs#L522).
I'm unsure how marker operations should work in the presence of
pre-release versions if we decide that this is incorrect.
- [Meaningless marker expressions are now
ignored](https://github.com/astral-sh/uv/blob/ibraheem/canonical-markers/crates/pep508-rs/src/marker/parse.rs#L502).
This means that a marker such as `'x' == 'x'` will always evaluate to
`true` (as if the expression did not exist), whereas we previously
treated this as always `false`. It's negation however, remains `false`.
- [Unsatisfiable markers are written as `python_version <
'0'`](https://github.com/astral-sh/uv/blob/ibraheem/canonical-markers/crates/pep508-rs/src/marker/tree.rs#L1329).
- The `PubGrubSpecifier` type has been moved to the new `uv-pubgrub`
crate, shared by `pep508-rs` and `uv-resolver`. `pep508-rs` also depends
on the `pubgrub` crate for the `Range` type, we probably want to move
`pubgrub::Range` into a separate crate to break this, but I don't think
that should block this PR (cc @konstin).
There is still some remaining work here that I decided to leave for now
for the sake of unblocking some of the related work on the resolver.
- We still use `Option<MarkerTree>` throughout uv, which is unnecessary
now that `MarkerTree::TRUE` is canonical.
- The `MarkerTree` type is now interned globally and can potentially
implement `Copy`. However, it's unclear if we want to add more
information to marker trees that would make it `!Copy`. For example, we
may wish to attach extra and requires-python environment information to
avoid simplifying after construction.
- We don't currently combine `python_full_version` and `python_version`
markers.
- I also have not spent too much time investigating performance and
there is probably some low-hanging fruit. Many of the test cases I did
run actually saw large performance improvements due to the markers being
simplified internally, reducing the stress on the old `normalize`
routine, especially for the extremely large markers seen in
`transformers` and other projects.
Resolves https://github.com/astral-sh/uv/issues/5660,
https://github.com/astral-sh/uv/issues/5179.
## Summary
This PR adds a `DistExtension` field to some of our distribution types,
which requires that we validate that the file type is known and
supported when parsing (rather than when attempting to unzip). It
removes a bunch of extension parsing from the code too, in favor of
doing it once upfront.
Closes https://github.com/astral-sh/uv/issues/5858.
## Summary
We currently accept `--index-url /path/to/index` on the command line,
but confusingly, not in `requirements.txt`. This PR just brings the two
in sync.
## Test Plan
New snapshot tests.
## Summary
This is what I consider to be the "real" fix for #8072. We now treat
directory and path URLs as separate `ParsedUrl` types and
`RequirementSource` types. This removes a lot of `.is_dir()` forking
within the `ParsedUrl::Path` arms and makes some states impossible
(e.g., you can't have a `.whl` path that is editable). It _also_ fixes
the `direct_url.json` for direct URLs that refer to files. Previously,
we wrote out to these as if they were installed as directories, which is
just wrong.
By splitting `path` into a lockable, relative (or absolute) and an
absolute installable path and by splitting between urls and paths by
dist type, we can store relative paths in the lockfile.
## Summary
Given `install -e dagster`, we need to assume that the user meant
`install -e ./dagster`, even though `install dagster` should _not_ be
treated as `install ./dagster`. I suspect pip will change this in the
future (since `pip install dagster` does _not_ meant `pip install
./dagster`) but for now it's what users expect.
Closes https://github.com/astral-sh/uv/issues/3994.
## Summary
This will help prevent bugs like #3934 by unifying the implementations
for editables and non-editable unnamed requirements. Specifically, both
of these now go through the same parsing paths and use the same struct
representations (with the exception that the editable flag is flipped in
the first case):
```
-e ./foo/bar
./foo/bar
```
We also now support PEP 508 in editable URLs. It turns out this is just
a limitation in pip, so it's correct to support it. For example, this
now works:
```
-e black[d] @ file://${PROJECT_ROOT}/scripts/packages/black_editable
```
Closes#3941.
Closes#3942.
With the change, we remove the special casing of workspace dependencies
and resolve `tool.uv` for all git and directory distributions. This
gives us support for non-editable workspace dependencies and path
dependencies in other workspaces. It removes a lot of special casing
around workspaces. These changes are the groundwork for supporting
`tool.uv` with dynamic metadata.
The basis for this change is moving `Requirement` from
`distribution-types` to `pypi-types` and the lowering logic from
`uv-requirements` to `uv-distribution`. This changes should be split out
in separate PRs.
I've included an example workspace `albatross-root-workspace2` where
`bird-feeder` depends on `a` from another workspace `ab`. There's a
bunch of failing tests and regressed error messages that still need
fixing. It does fix the audited package count for the workspace tests.
## Summary
There are a few behavior changes in here:
- We now enforce `--require-hashes` for editables, like pip. So if you
use `--require-hashes` with an editable requirement, we'll reject it. I
could change this if it seems off.
- We now treat source tree requirements, editable or not (e.g., both `-e
./black` and `./black`) as if `--refresh` is always enabled. This
doesn't mean that we _always_ rebuild them; but if you pass
`--reinstall`, then yes, we always rebuild them. I think this is an
improvement and is close to how editables work today.
Closes#3844.
Closes#2695.
When parsing requirements from any source, directly parse the url parts
(and reject unsupported urls) instead of parsing url parts at a later
stage. This removes a bunch of error branches and concludes the work
parsing url parts once and passing them around everywhere.
Many usages of the assembled `VerbatimUrl` remain, but these can be
removed incrementally.
Please review commit-by-commit.
## Summary
Closes https://github.com/astral-sh/uv/issues/3715.
## Test Plan
```
❯ echo "/../test" | cargo run pip compile -
error: Couldn't parse requirement in `-` at position 0
Caused by: path could not be normalized: /../test
/../test
^^^^^^^^
❯ echo "-e /../test" | cargo run pip compile -
error: Invalid URL in `-`: `/../test`
Caused by: path could not be normalized: /../test
Caused by: cannot normalize a relative path beyond the base directory
```
## Summary
If a user includes markers after an editable, we now ignore them (rather
than including them in the parsed URL). This matches pip's behavior. In
the future, we could further improve by respecting them, but that
_would_ be a deviation from pip.
For example, given:
```
-e ./scripts/packages/black_editable ; python_version >= "3.9" and python_ver
```
We now split at the first whitespace (just before the `;`), parse
everything before, and throw out everything after.
This logic also extends to extras. So given:
```
-e ./scripts/packages/black_editable[dev, colorama]
```
We'll now parse this as the URL
`./scripts/packages/black_editable[dev,`, and throw out ` colorama]`.
Instead, you need to do:
```
-e ./scripts/packages/black_editable[dev,colorama]
```
(I.e., remove the space.)
This _also_ matches pip's behavior. I could "fix" this but I'm unsure if
I should -- it means requirements files will be parseable by uv that
won't work with pip. Open to input. My gut reaction is that we _should_
properly support `-e ./scripts/packages/black_editable[dev, colorama]`
even if pip would reject it, but `requirements.txt` is
implementation-defined so it'd be a "deviation".
Closes https://github.com/astral-sh/uv/issues/3604.
## Summary
Ensures that we track the origins for requirements regardless of whether
they come from `pyproject.toml` or `setup.py` or `setup.cfg`.
Closes#3480.
## Summary
Fixes https://github.com/astral-sh/uv/issues/1343. This is kinda a first
draft at the moment, but does at least mostly work locally (barring some
bits of the test suite that seem to not work for me in general).
## Test Plan
Mostly running the existing tests and checking the revised output is
sane
## Outstanding issues
Most of these come down to "AFAIK, the existing tools don't support
these patterns, but `uv` does" and so I'm not sure there's an existing
good answer here! Most of the answers so far are "whatever was easiest
to build"
- [x] ~~Is "-r pyproject.toml" correct? Should it show something else or
get skipped entirely~~ No it wasn't. Fixed in
3044fa8b86
- [ ] If the requirements file is stdin, that just gets skipped. Should
it be recorded?
- [ ] Overrides get shown as "--override<override.txt>". Correct?
- [x] ~~Some of the tests (e.g.
`dependency_excludes_non_contiguous_range_of_compatible_versions`) make
assumptions about the order of package versions being outputted, which
this PR breaks. I'm not sure if the text is fairly arbitrary and can be
replaced or whether the behaviour needs fixing?~~ - fixed by removing
the custom pubgrub PartialEq/Hash
- [ ] Are all the `TrackedFromStr` et al changes needed, or is there an
easier way? I don't think so, I think it's necessary to track these sort
of things fairly comprehensively to make this feature work, and this
sort of invasive change feels necessary, but happy to be proved wrong
there :)
- [x] ~~If you have a requirement coming in from two or more different
requirements files only one turns up. I've got a closed-source example
for this (can go into more detail if needed), mostly consisting of a
complicated set of common deps creating a larger set. It's a rarer case,
but worth considering.~~ 042432b200
- [ ] Doesn't add annotations for `setup.py` yet
- This is pretty hard, as the correct location to insert the path is
`crates/pypi-types/src/metadata.rs`'s `parse_pkg_info`, which as it's
based off a source distribution has entirely thrown away such matters as
"where did this package requirement get built from". Could add "`built
package name`" as a dep, but that's a little odd.
## Introduction
PEP 621 is limited. Specifically, it lacks
* Relative path support
* Editable support
* Workspace support
* Index pinning or any sort of index specification
The semantics of urls are a custom extension, PEP 440 does not specify
how to use git references or subdirectories, instead pip has a custom
stringly format. We need to somehow support these while still stying
compatible with PEP 621.
## `tool.uv.source`
Drawing inspiration from cargo, poetry and rye, we add `tool.uv.sources`
or (for now stub only) `tool.uv.workspace`:
```toml
[project]
name = "albatross"
version = "0.1.0"
dependencies = [
"tqdm >=4.66.2,<5",
"torch ==2.2.2",
"transformers[torch] >=4.39.3,<5",
"importlib_metadata >=7.1.0,<8; python_version < '3.10'",
"mollymawk ==0.1.0"
]
[tool.uv.sources]
tqdm = { git = "https://github.com/tqdm/tqdm", rev = "cc372d09dcd5a5eabdc6ed4cf365bdb0be004d44" }
importlib_metadata = { url = "https://github.com/python/importlib_metadata/archive/refs/tags/v7.1.0.zip" }
torch = { index = "torch-cu118" }
mollymawk = { workspace = true }
[tool.uv.workspace]
include = [
"packages/mollymawk"
]
[tool.uv.indexes]
torch-cu118 = "https://download.pytorch.org/whl/cu118"
```
See `docs/specifying_dependencies.md` for a detailed explanation of the
format. The basic gist is that `project.dependencies` is what ends up on
pypi, while `tool.uv.sources` are your non-published additions. We do
support the full range or PEP 508, we just hide it in the docs and
prefer the exploded table for easier readability and less confusing with
actual url parts.
This format should eventually be able to subsume requirements.txt's
current use cases. While we will continue to support the legacy `uv pip`
interface, this is a piece of the uv's own top level interface. Together
with `uv run` and a lockfile format, you should only need to write
`pyproject.toml` and do `uv run`, which generates/uses/updates your
lockfile behind the scenes, no more pip-style requirements involved. It
also lays the groundwork for implementing index pinning.
## Changes
This PR implements:
* Reading and lowering `project.dependencies`,
`project.optional-dependencies` and `tool.uv.sources` into a new
requirements format, including:
* Git dependencies
* Url dependencies
* Path dependencies, including relative and editable
* `pip install` integration
* Error reporting for invalid `tool.uv.sources`
* Json schema integration (works in pycharm, see below)
* Draft user-level docs (see `docs/specifying_dependencies.md`)
It does not implement:
* No `pip compile` testing, deprioritizing towards our own lockfile
* Index pinning (stub definitions only)
* Development dependencies
* Workspace support (stub definitions only)
* Overrides in pyproject.toml
* Patching/replacing dependencies
One technically breaking change is that we now require user provided
pyproject.toml to be valid wrt to PEP 621. Included files still fall
back to PEP 517. That means `pip install -r requirements.txt` requires
it to be valid while `pip install -r requirements.txt` with `-e .` as
content falls back to PEP 517 as before.
## Implementation
The `pep508` requirement is replaced by a new `UvRequirement` (name up
for bikeshedding, not particularly attached to the uv prefix). The still
existing `pep508_rs::Requirement` type is a url format copied from pip's
requirements.txt and doesn't appropriately capture all features we
want/need to support. The bulk of the diff is changing the requirement
type throughout the codebase.
We still use `VerbatimUrl` in many places, where we would expect a
parsed/decomposed url type, specifically:
* Reading core metadata except top level pyproject.toml files, we fail a
step later instead if the url isn't supported.
* Allowed `Urls`.
* `PackageId` with a custom `CanonicalUrl` comparison, instead of
canonicalizing urls eagerly.
* `PubGrubPackage`: We eventually convert the `VerbatimUrl` back to a
`Dist` (`Dist::from_url`), instead of remembering the url.
* Source dist types: We use verbatim url even though we know and require
that these are supported urls we can and have parsed.
I tried to make improve the situation be replacing `VerbatimUrl`, but
these changes would require massive invasive changes (see e.g.
https://github.com/astral-sh/uv/pull/3253). A main problem is the ref
`VersionOrUrl` and applying overrides, which assume the same
requirement/url type everywhere. In its current form, this PR increases
this tech debt.
I've tried to split off PRs and commits, but the main refactoring is
still a single monolith commit to make it compile and the tests pass.
## Demo
Adding
d1ae3b85d5/pyproject.json
as json schema (v7) to pycharm for `pyproject.toml`, you can try the IDE
support already:

[dove.webm](c293c272-c80b-459d-8c95-8c46a8d198a1)
In *some* places in our crates, `serde` (and `rkyv`) are optional
dependencies. I believe this was done out of reasons of "good sense,"
that is, it follows a Rust ecosystem pattern where serde integration
tends to be an opt-in crate feature. (And similarly for `rkyv`.)
However, ultimately, `uv` itself requires `serde` and `rkyv` to
function. Since our crates are strictly internal, there are limited
consumers for our crates without `serde` (and `rkyv`) enabled. I think
one possibility is that optional `serde` (and `rkyv`) integration means
that someone can do this:
cargo test -p pep440_rs
And this will run tests _without_ `serde` or `rkyv` enabled. That in
turn could lead to faster iteration time by reducing compile times. But,
I'm not sure this is worth supporting. The iterative compilation times
of
individual crates are probably fast enough in debug mode, even with
`serde` and `rkyv` enabled. Namely, `serde` and `rkyv` themselves
shouldn't need to be re-compiled in most cases. On `main`:
```
from-scratch: `cargo test -p pep440_rs --lib` 0.685
incremental: `cargo test -p pep440_rs --lib` 0.278s
from-scratch: `cargo test -p pep440_rs --features serde,rkyv --lib` 3.948s
incremental: `cargo test -p pep440_rs --features serde,rkyv --lib` 0.321s
```
So while a from-scratch build does take significantly longer, an
incremental build is about the same.
The benefit of doing this change is two-fold:
1. It brings out crates into alignment with "reality." In particular,
some crates were _implicitly_ relying on `serde` being enabled
without explicitly declaring it. This technically means that our
`Cargo.toml`s were wrong in some cases, but it is hard to observe it
because of feature unification in a Cargo workspace.
2. We no longer need to deal with the cognitive burden of writing
`#[cfg_attr(feature = "serde", ...)]` everywhere.
Previously, uv-auth would fail to compile due to a missing process
feature. I chose to make all tokio features we use top level features,
so we can share the tokio cache between all test invocations.
## Summary
I'm surprised we haven't hit this before, but apparently we don't allow
comments after `--index-url`, `-e` entries, etc., in the
requirements.txt parser.
Closes#3011.
## Test Plan
`cargo test`
Needed to prevent circular dependencies in my toolchain work (#2931). I
think this is probably a reasonable change as we move towards persistent
configuration too?
Unfortunately `BuildIsolation` needs to be in `uv-types` to avoid
circular dependencies still. We might be able to resolve that in the
future.