Downstack PR: #4515 Upstack PR: #4481
Consider these two cases:
A:
```
werkzeug==2.0.0
werkzeug @ 960bb4017c/Werkzeug-2.0.0-py3-none-any.whl
```
B:
```toml
dependencies = [
"iniconfig == 1.1.1 ; python_version < '3.12'",
"iniconfig @ git+https://github.com/pytest-dev/iniconfig@93f5930e668c0d1ddf4597e38dd0dea4e2665e7a ; python_version >= '3.12'",
]
```
In the first case, `werkzeug==2.0.0` should be overridden by the url. In
the second case `iniconfig == 1.1.1` is in a different fork and must
remain a registry distribution.
That means the conversion from `Requirement` to `PubGrubPackage` is
dependent on the other requirements of the package. We can either look
into the other packages immediately, or we can move the forking before
the conversion to `PubGrubDependencies` instead of after. Either version
requires a flat list of `Requirement`s to use. This refactoring gives us
this list.
I'll add support for both of the above cases in the forking urls branch
before merging this PR. I also have to move constraints over to this.
## Summary
This is an intermediary change in enabling universal resolution for
`requirements.txt` files. To start, we need to be able to preserve
markers in the `requirements.txt` output _and_ propagate those markers,
such that if you have a dependency that's only included with a given
marker, the transitive dependencies respect that marker too.
Closes#1429.
When a fork is created from a list of dependencies, we were previously
adding all other sibling dependencies to every fork created. But this
isn't actually quite right, since the fork created is always created by
some marker expression. And while it is definitively disjoint from any
directly conflicting dependency specification, it is also possibly
disjoint with other dependencies. For example, as reported in #4414:
```toml
dependencies = [
"anyio==4.4.0 ; python_version >= '3.12'",
"anyio==4.3.0 ; python_version < '3.12'",
"b1 ; python_version >= '3.12'",
"b2 ; python_version < '3.12'",
]
```
The first two `anyio` requirements are conflicting with non-overlapping
marker expressions, and so a fork is created. Prior to this commit,
*both* `b1` and `b2` would be added to each fork. But of course, `b2` is
impossible in the `anyio==4.4.0` fork because of disjoint marker
expressions.
So in this commit, we specifically filter out any sibling dependencies
that could find their way into a fork that have disjoint markers with
that fork. We are careful to do this both when a new fork is created
from an existing set of dependencies, and when adding new dependencies
to a fork.
Fixes#4414
To support diverging urls, we have to check urls when adding
dependencies (after forking). To prepare for this, i've moved adding
dependencies for the current version to
`SolveState::add_package_version_dependencies` and removed the
duplication when checking for self-dependencies.
This changed is joined with a change in pubgrub
(https://github.com/astral-sh/pubgrub/pull/27) that simplifies the same
code path.
This commit adds marker expressions to our `Fork` type, which are in
turn passed down into `PubGrubDependencies::from_requirements` to filter
our any dependencies with markers that are disjoint from the fork's
marker expression.
This is necessary to avoid visiting packages in the dependency graph
that can never actually be installed. This is because when a fork is
created in the resolver, it always happens when there are two sibling
dependency specifications on a package with the same name, but with
non-overlapping marker expressions. Each fork corresponds to each
such conflicting dependency specification, and each fork assumes the
the corresponding marker expression as a pre-condition for any future
dependencies considered by it. That is, since the fork represents an
installation path that can only be taken when the corresponding
dependency specification (and its marker expression) is actually used,
it also therefore follows that the marker expression is true. Therefore,
any dependency visited in that fork with a marker expression that cannot
possibly be true when the markers of the fork are true can and ought to
be completely ignored.
There are some key invariants that I had to re-learn by reading the
code. This hopefully makes those invariants easier to discover by future
me (and others).
In the time before universal resolving, we would always pass a
`MarkerEnvironment`, and this environment would capture any relevant
`Requires-Python` specifier (including if `-p/--python` was provided on
the CLI).
But in universal resolution, we very specifically do not use a
`MarkerEnvironment` because we want to produce a resolution that is
compatible across potentially multiple environments. This in turn meant
that we lost `Requires-Python` filtering.
This PR adds it back. We do this by converting our `PythonRequirement`
into a `MarkerTree` that encodes the version specifiers in a
`Requires-Python` specifier. We then ask whether that `MarkerTree` is
disjoint with a dependency specification's `MarkerTree`. If it is, then
we know it's impossible for that dependency specification to every be
true, and we can completely ignore it.
## Summary
Similar to how we abstracted the dependencies into
`ResolutionDependencyNames`, I think it makes sense to abstract the base
packages into a `ResolutionPackage`. This also avoids leaking details
about the various `PubGrubPackage` enum variants to `ResolutionGraph`.
## Summary
If a package lacks a source distribution, and we can't find a compatible
wheel for the current platform, we need to just _assume_ that the
package will have a valid wheel on all platforms on which it's
requested; if not, we raise an error at install time.
It's possible that we can be smarter about this over time. For example,
if the package was requested _only_ for macOS, we could verify that
there's at least one macOS-compatible wheel. See the linked issue for
more details.
Closes https://github.com/astral-sh/uv/issues/4139.
The basic idea here is to make it so forking can only ever result in a
resolution that, for a particular marker environment, will only install
at most one version of a package. We can guarantee this by ensuring we
only fork on conflicting dependency specifications only when their
corresponding markers are completely disjoint. If they aren't, then
resolution _must_ find a single version of the package in the
intersection of the two dependency specifications.
A test for this case has been added to packse here:
https://github.com/astral-sh/packse/pull/182. Previously, that test
would result in a resolution with two different unconditional versions
of the same package. With this change, resolution fails (as it should).
A commit-by-commit review should be helpful here, since the first commit
is a refactor to make the second commit a bit more digestible.
## Summary
I think we should be able to model PubGrub such that this isn't
necessary (at least for the case described in the issue), but for now,
let's just avoid attempting to build very old distributions in
prefetching.
Closes https://github.com/astral-sh/uv/issues/4136.
## Summary
This PR adds a lowering similar to that seen in
https://github.com/astral-sh/uv/pull/3100, but this time, for markers.
Like `PubGrubPackageInner::Extra`, we now have
`PubGrubPackageInner::Marker`. The dependencies of the `Marker` are
`PubGrubPackageInner::Package` with and without the marker.
As an example of why this is useful: assume we have `urllib3>=1.22.0` as
a direct dependency. Later, we see `urllib3 ; python_version > '3.7'` as
a transitive dependency. As-is, we might (for some reason) pick a very
old version of `urllib3` to satisfy `urllib3 ; python_version > '3.7'`,
then attempt to fetch its dependencies, which could even involve
building a very old version of `urllib3 ; python_version > '3.7'`. Once
we fetch the dependencies, we would see that `urllib3` at the same
version is _also_ a dependency (because we tack it on). In the new
scheme though, as soon as we "choose" the very old version of `urllib3 ;
python_version > '3.7'`, we'd then see that `urllib3` (the base package)
is also a dependency; so we see a conflict before we even fetch the
dependencies of the old variant.
With this, I can successfully resolve the case in #4099.
Closes https://github.com/astral-sh/uv/issues/4099.
## Summary
This PR modifies our `Requires-Python` handling to treat
`Requires-Python` as a lower bound. There's extensive discussion around
this in https://github.com/astral-sh/uv/issues/4022 and the references
linked therein. I think it's an experiment worth trying. Even in my own
small projects, I'm running into issues whereby I'm being "forced" to
add a `<4` upper bound to my `Requires-Python` due to these caps.
Separately, we should explore adding a mechanism that's distinct from
`Requires-Python` to enable users to declare a supported range for
locking.
Closes https://github.com/astral-sh/uv/issues/4022.
## Summary
Externally, development dependencies are currently structured as a flat
list of PEP 580-compatible requirements:
```toml
[tool.uv]
dev-dependencies = ["werkzeug"]
```
When locking, we lock all development dependencies; when syncing, users
can provide `--dev`.
Internally, though, we model them as dependency groups, similar to
Poetry, PDM, and [PEP 735](https://peps.python.org/pep-0735). This
enables us to change out the user-facing frontend without changing the
internal implementation, once we've decided how these should be exposed
to users.
A few important decisions encoded in the implementation (which we can
change later):
1. Groups are enabled globally, for all dependencies. This differs from
extras, which are enabled on a per-requirement basis. Note, however,
that we'll only discover groups for uv-enabled packages anyway.
2. Installing a group requires installing the base package. We rely on
this in PubGrub to ensure that we resolve to the same version (even
though we only expect groups to come from workspace dependencies anyway,
which are unique). But anyway, that's encoded in the resolver right now,
just as it is for extras.
## Summary
This PR adds the `Requires-Python` range to the user's lockfile. This
will enable us to validate it when installing.
For now, we repeat the `Requires-Python` back to the user;
alternatively, though, we could detect the supported Python range
automatically.
See: https://github.com/astral-sh/uv/issues/4052
## Summary
Thankfully this is pretty rare since `pip sync` is usually run on `pip
compile` output, and `pip compile` never outputs markers.
Closes https://github.com/astral-sh/uv/issues/4044
## Summary
Instead of checking if the target and installed version are the same, we
model the data such that the target version is only present if it was
specified by the user. This also means that we correctly say "requested
version" even if the two happen to be the same.
## Summary
I believe this is no longer necessary. Part of the problem here is that
we can't _know_ the full set of available Python versions, especially
once we start resolving against a `Requires-Python` rather than a fixed
set of two versions.
## Summary
Once we use a _range_ rather than a precise version, it won't actually
make sense to return a version here. It's no longer required, so I'm
removing it.
## Summary
Running a resolution that required forking was failing due to breaking
an invariant in PubGrub. It looks like we were adding the same
incompatibility multiple times, or something like that. The issue
appears to be that when forking, we modify the current state, then clone
it as the "next state", then push to the "forked states" -- but that
means we're cloning the _modified_ state.
This PR changes the order of operations such that we clone, then modify.
It shouldn't introduce any additional clones though.
## Summary
This PR removes the static resolver map:
```rust
static RESOLVED_GIT_REFS: Lazy<Mutex<FxHashMap<RepositoryReference, GitSha>>> =
Lazy::new(Mutex::default);
```
With a `GitResolver` struct that we now pass around on the
`BuildContext`. There should be no behavior changes here; it's purely an
internal refactor with an eye towards making it cleaner for us to
"pre-populate" the list of resolved SHAs.
With the change, we remove the special casing of workspace dependencies
and resolve `tool.uv` for all git and directory distributions. This
gives us support for non-editable workspace dependencies and path
dependencies in other workspaces. It removes a lot of special casing
around workspaces. These changes are the groundwork for supporting
`tool.uv` with dynamic metadata.
The basis for this change is moving `Requirement` from
`distribution-types` to `pypi-types` and the lowering logic from
`uv-requirements` to `uv-distribution`. This changes should be split out
in separate PRs.
I've included an example workspace `albatross-root-workspace2` where
`bird-feeder` depends on `a` from another workspace `ab`. There's a
bunch of failing tests and regressed error messages that still need
fixing. It does fix the audited package count for the workspace tests.
We significantly regressed performance in some cases because we were
cloning the resolver state one more time than we needed to. That doesn't
sound like a lot, but in the case where there are no forks, it implies
we were cloning the state for every `get_dependencies` called when we
shouldn't have been cloning it at all.
Avoiding the clone results in somewhat tortured code. This can probably
be refactored by moving bits out to a helper routine, but that also
seemed non-trivial. So we let this suffice for now.
This addresses the lack of marker support in prior commits.
Specifically, we add them as a new field to `AnnotatedDist`, and from
there, they get added to a `Distribution` in a `Lock`.
This commit is a pretty invasive change that implements the merging
of resolutions created by each fork of the resolver.
The main idea here is that each `SolveState` is converted into a
`Resolution` (a new type) and stored on the heap after its fork
completes. When all forks complete, they are all merged into a single
`Resolution`. This `Resolution` is then used to build a `ResolutionGraph`.
Construction of `ResolutionGraph` mostly stays the same (despite the
gnarly diff due to an indent change) with one exception: the code to
extract dependency edges out of PubGrub's state has been moved to
`SolveState::into_resolution`. The idea here is that once a fork
completes, we extract what we need from the PubGrub state and then
throw it away. We store these edges in our own intermediate type which
is then converted into petgraph edges in the `ResolutionGraph`
constructor.
One interesting change we make here is that our edge
data is now a `Version` instead of a `Range<Version>`. I don't think
`Range<Version>` was actually being used anywhere, so this seems okay?
In any case, I think `Version` here is correct because a resolution
corresponds to specific dependencies of each package. Moreover, I didn't
see an easy way to make things work with `Range<Version>`. Notably,
since we no longer have the guarantee that there is only one version of
each package, we need to use `(PackageName, Version)` instead of just
`PackageName` for inverted lookups in `ResolutionGraph::from_state`.
Finally, the main resolver loop itself is changed a bit to track all
forked resolutions and then merge them at the end.
Note that we don't really have any dealings with markers in this commit.
We'll get to that in a subsequent commit.
This changes the constructor to just take an `InMemoryIndex`
directly instead of the constituent parts. No real reason other
than it seems a little simpler.
There are still some TODOs/FIXMEs here, but this makes represents a
chunk of the resolver refactoring to enable forking. We don't do any
merging of resolutions yet, so crucially, this code is broken when no
marker environment is provided. But when a marker environment is
provided, this should behave the same as a non-forking resolver. In
particular, `get_dependencies_forking` is just `get_dependencies`
whenever there's a marker environment.
## Summary
This PR adds extras to the lockfile, and enables users to selectively
sync extras in `uv sync` and `uv run`. The end result here was fairly
simple, though it required a few refactors to get here. The basic idea
is that `DistributionId` now includes `extra: Option<ExtraName>`, so we
effectively treat extras as separate packages. Generating the lockfile,
and generating the resolution from the lockfile, fall out of this
naturally with no special-casing or additional changes.
The main downside here is that it bloats the lockfile significantly.
Specifically:
- We include _all_ distribution URLs and hashes for _every_ extra
variant.
- We include all dependencies for the extra variant, even though that
are dependencies of the base package.
We could normalize this representation by changing each distribution
have an `optional-dependencies` hash map that keys on extras, but we
actually don't have the information we need to create that right now
(specifically, we can't differentiate between dependencies that
_require_ the extra and dependencies on the base package).
Closes#3700.
## Summary
There are a few behavior changes in here:
- We now enforce `--require-hashes` for editables, like pip. So if you
use `--require-hashes` with an editable requirement, we'll reject it. I
could change this if it seems off.
- We now treat source tree requirements, editable or not (e.g., both `-e
./black` and `./black`) as if `--refresh` is always enabled. This
doesn't mean that we _always_ rebuild them; but if you pass
`--reinstall`, then yes, we always rebuild them. I think this is an
improvement and is close to how editables work today.
Closes#3844.
Closes#2695.
When parsing requirements from any source, directly parse the url parts
(and reject unsupported urls) instead of parsing url parts at a later
stage. This removes a bunch of error branches and concludes the work
parsing url parts once and passing them around everywhere.
Many usages of the assembled `VerbatimUrl` remain, but these can be
removed incrementally.
Please review commit-by-commit.
Pubgrub stores incompatibilities as (package name, version range)
tuples, meaning it needs to clone the package name for each
incompatibility, and each non-borrowed operation on incompatibilities.
https://github.com/astral-sh/uv/pull/3673 made me realize that
`PubGrubPackage` has gotten large (expensive to copy), so like `Version`
and other structs, i've added an `Arc` wrapper around it.
It's a pity clippy forbids `.deref()`, it's less opaque than `&**` and
has IDE support (clicking on `.deref()` jumps to the right impl).
## Benchmarks
It looks like this matters most for complex resolutions which, i assume
because they carry larger `PubGrubPackageInner::Package` and
`PubGrubPackageInner::Extra` types.
```bash
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/jupyter.in" "./uv-branch pip compile -q ./scripts/requirements/jupyter.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/airflow.in" "./uv-branch pip compile -q ./scripts/requirements/airflow.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/boto3.in" "./uv-branch pip compile -q ./scripts/requirements/boto3.in"
```
```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/jupyter.in
Time (mean ± σ): 18.2 ms ± 1.6 ms [User: 14.4 ms, System: 26.0 ms]
Range (min … max): 15.8 ms … 22.5 ms 181 runs
Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/jupyter.in
Time (mean ± σ): 17.8 ms ± 1.4 ms [User: 14.4 ms, System: 25.3 ms]
Range (min … max): 15.4 ms … 23.1 ms 159 runs
Summary
./uv-branch pip compile -q ./scripts/requirements/jupyter.in ran
1.02 ± 0.12 times faster than ./uv-main pip compile -q ./scripts/requirements/jupyter.in
```
```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/airflow.in
Time (mean ± σ): 153.7 ms ± 3.5 ms [User: 165.2 ms, System: 157.6 ms]
Range (min … max): 150.4 ms … 163.0 ms 19 runs
Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/airflow.in
Time (mean ± σ): 123.9 ms ± 4.6 ms [User: 152.4 ms, System: 133.8 ms]
Range (min … max): 118.4 ms … 138.1 ms 24 runs
Summary
./uv-branch pip compile -q ./scripts/requirements/airflow.in ran
1.24 ± 0.05 times faster than ./uv-main pip compile -q ./scripts/requirements/airflow.in
```
```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/boto3.in
Time (mean ± σ): 327.0 ms ± 3.8 ms [User: 344.5 ms, System: 71.6 ms]
Range (min … max): 322.7 ms … 334.6 ms 10 runs
Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/boto3.in
Time (mean ± σ): 311.2 ms ± 3.1 ms [User: 339.3 ms, System: 63.1 ms]
Range (min … max): 307.8 ms … 317.0 ms 10 runs
Summary
./uv-branch pip compile -q ./scripts/requirements/boto3.in ran
1.05 ± 0.02 times faster than ./uv-main pip compile -q ./scripts/requirements/boto3.in
```
<!--
Thank you for contributing to uv! To help us out with reviewing, please
consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->