Previously, the batch prefetcher was part of the solver loop and shared
across forks. This led to each preference in a fork being counted as a
tried version, so that after 5 forks with the identical version, we
would start batch prefetching. The same inflated counts also showed up
in the reported numbers of tried versions. By tracking the batch
prefetcher per fork, the numbers are corrected.
An alternative would be tracking the actually tried versions, but that
would add overhead to the top-level solver loop, while the current
heuristic works.
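As a sketch of the fix (illustrative types and threshold, not uv's exact API), the tally now lives on the fork state:
```rust
use std::collections::HashMap;

/// After this many genuine tries of a package within one fork, start
/// batch prefetching its remaining versions.
const PREFETCH_THRESHOLD: usize = 5;

#[derive(Default)]
struct BatchPrefetcher {
    /// Package name -> number of versions tried in this fork.
    tried: HashMap<String, usize>,
}

impl BatchPrefetcher {
    /// Record a try and report whether the threshold was crossed.
    fn on_tried(&mut self, package: &str) -> bool {
        let count = self.tried.entry(package.to_string()).or_insert(0);
        *count += 1;
        *count >= PREFETCH_THRESHOLD
    }
}

#[derive(Default)]
struct ForkState {
    /// Each fork owns its own prefetcher, so preferences shared across
    /// forks are no longer double-counted as tries.
    prefetcher: BatchPrefetcher,
}
```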
In `ecosystem/transformers`:
```
$ hyperfine --runs 10 --prepare "rm -f uv.lock" "../../target/release/uv lock --exclude-newer 2024-08-08T00:00:00Z" "uv lock --exclude-newer 2024-08-08T00:00:00Z"
Benchmark 1: ../../target/release/uv lock --exclude-newer 2024-08-08T00:00:00Z
Time (mean ± σ): 386.2 ms ± 6.1 ms [User: 396.0 ms, System: 144.5 ms]
Range (min … max): 378.5 ms … 397.9 ms 10 runs
Benchmark 2: uv lock --exclude-newer 2024-08-08T00:00:00Z
Time (mean ± σ): 422.0 ms ± 5.5 ms [User: 459.6 ms, System: 190.3 ms]
Range (min … max): 415.0 ms … 430.5 ms 10 runs
Summary
../../target/release/uv lock --exclude-newer 2024-08-08T00:00:00Z ran
1.09 ± 0.02 times faster than uv lock --exclude-newer 2024-08-08T00:00:00Z
```
This updates the surrounding code to use the new ResolverEnvironment
type. In some cases, this simplifies caller code by removing case
analysis. There *shouldn't* be any behavior changes here. Some test
snapshots were updated to account for some minor tweaks to error
messages.
I didn't split this up into separate commits because it would have been
too difficult/costly.
## Summary
This PR lifts the restriction that a package must come from a single
index. For example, you can now do:
```toml
[project]
name = "project"
version = "0.1.0"
readme = "README.md"
requires-python = ">=3.12"
dependencies = ["jinja2"]

[tool.uv.sources]
jinja2 = [
  { index = "torch-cu118", marker = "sys_platform == 'darwin'" },
  { index = "torch-cu124", marker = "sys_platform != 'darwin'" },
]

[[tool.uv.index]]
name = "torch-cu118"
url = "https://download.pytorch.org/whl/cu118"

[[tool.uv.index]]
name = "torch-cu124"
url = "https://download.pytorch.org/whl/cu124"
```
The construction is very similar to the way we handle URLs today: you
can have multiple URLs for a given package, but they must appear in
disjoint forks. So most of the code is just adding that abstraction to
the resolver, following our handling of URLs.
Closes #7761.
## Summary
We now track the discovered `IndexCapabilities` for each `IndexUrl`. If
we learn that an index doesn't support range requests, we avoid doing
any batch prefetching.
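Roughly, the bookkeeping looks like this (a sketch; fields and signatures are guesses, not the real `IndexCapabilities`):
```rust
use std::collections::HashMap;

#[derive(Default)]
struct IndexCapabilities {
    /// Index URL -> does the server honor HTTP `Range` requests?
    supports_range_requests: HashMap<String, bool>,
}

impl IndexCapabilities {
    /// Record what we discovered while talking to the index.
    fn set_supports_range_requests(&mut self, index_url: &str, supported: bool) {
        self.supports_range_requests
            .insert(index_url.to_string(), supported);
    }

    /// Err on the side of prefetching until we learn otherwise.
    fn allows_prefetch(&self, index_url: &str) -> bool {
        *self.supports_range_requests.get(index_url).unwrap_or(&true)
    }
}
```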
Closes https://github.com/astral-sh/uv/issues/7221.
## Summary
Right now, we have slightly different `requires-python` semantics for
`-p 3.11` vs. `-p 3.11 --universal`, and slightly different (wrong)
semantics for how we compare against the _installed_ Python version
(which doesn't ignore upper bounds, but should).
This PR rips it all out and replaces it with consistent semantics across
`uv lock`, `uv pip compile -p 3.11`, and `uv pip compile -p 3.11
--universal`. We now always ignore upper bounds.
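As a toy illustration of that rule (bare `(major, minor)` tuples stand in for real version and specifier types):
```rust
// `requires-python = ">=3.11, <3.13"` is treated as just ">=3.11": an
// interpreter satisfies the requirement if it meets the lower bound; the
// upper bound is deliberately ignored.
fn satisfies_requires_python(interpreter: (u64, u64), lower_bound: (u64, u64)) -> bool {
    interpreter >= lower_bound
}

fn main() {
    assert!(satisfies_requires_python((3, 13), (3, 11))); // above the upper bound: still fine
    assert!(!satisfies_requires_python((3, 10), (3, 11))); // below the lower bound: rejected
}
```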
Closes https://github.com/astral-sh/uv/issues/6859.
Closes https://github.com/astral-sh/uv/issues/5045.
Downstack PR: #4481
## Introduction
We support forking the dependency resolution to handle conflicting
registry requirements for different platforms: say, one package range is
required for an older Python version while a newer one is required for
newer Python versions, or dependencies differ per platform. We need to
extend this support to direct URL requirements.
```toml
dependencies = [
"iniconfig @ 62565a6e1c/iniconfig-2.0.0-py3-none-any.whl ; python_version >= '3.12'",
"iniconfig @ b3c12c6d70/iniconfig-1.1.1-py2.py3-none-any.whl ; python_version < '3.12'"
]
```
This did not work because `Urls` was built on the assumption that there
is a single allowed URL per package. We collect all allowed URLs ahead
of resolution by following direct URL dependencies (including path
dependencies) transitively, i.e., a registry distribution can't require
a URL.
## The same package can have Registry and URL requirements
Consider the following two cases:
requirements.in:
```text
werkzeug==2.0.0
werkzeug @ 960bb4017c/Werkzeug-2.0.0-py3-none-any.whl
```
pyproject.toml:
```toml
dependencies = [
"iniconfig == 1.1.1 ; python_version < '3.12'",
"iniconfig @ git+https://github.com/pytest-dev/iniconfig@93f5930e668c0d1ddf4597e38dd0dea4e2665e7a ; python_version >= '3.12'",
]
```
In the first case, we want the URL to override the registry dependency;
in the second case, we want to fork and have one branch use the registry
and the other the URL. We have to know about this in
`PubGrubRequirement::from_registry_requirement`, but forking only
happens after the current method.
Consider the following case too:
a:
```
c==1.0.0
b @ https://b.zip
```
b:
```
c @ https://c_new.zip ; python_version >= '3.12'",
c @ https://c_old.zip ; python_version < '3.12'",
```
When we convert the requirements of `a`, we can't know the URL of `c`
yet. The solution is to remove the `Url` from `PubGrubPackage`: the
`Url` is redundant with `PackageName`, since there can be only one URL
per package name per fork. We now do the following: we track the URLs
from requirements in `PubGrubDependency`. After forking, we call
`add_package_version_dependencies`, where we apply override URLs, check
whether the URL is allowed, and check whether the URL is unique in this
fork. When we request a distribution, we ask the fork URLs for the real
URL. Since we prioritize URL dependencies over registry dependencies and
skip packages with `Urls` entries in pre-visiting, we know whether a
package has a URL by the time we fetch it.
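Roughly, the dependency edge now carries the URL (a sketch, fields abridged and illustrative):
```rust
struct PubGrubDependency {
    package: String,
    /// Stand-in for a real version-range type.
    version_range: String,
    /// The URL from the requirement, if any; validated per fork in
    /// `add_package_version_dependencies` and resolved via the fork URLs.
    url: Option<String>,
}
```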
## URL conflicts
pyproject.toml (invalid):
```toml
dependencies = [
"iniconfig @ e96292c7f7/iniconfig-1.1.0.tar.gz",
"iniconfig @ b3c12c6d70/iniconfig-1.1.1-py2.py3-none-any.whl ; python_version < '3.12'",
"iniconfig @ 62565a6e1c/iniconfig-2.0.0-py3-none-any.whl ; python_version >= '3.12'",
]
```
On the fork state, we keep `ForkUrls`, which checks for conflicts after
forking, rejecting this case because we added two packages of the same
name with different URLs.
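A minimal sketch of that check (illustrative types, not the real `ForkUrls`):
```rust
use std::collections::HashMap;

#[derive(Default)]
struct ForkUrls {
    /// Package name -> the single URL allowed in this fork.
    urls: HashMap<String, String>,
}

impl ForkUrls {
    /// Insert a URL for a package, erroring if this fork already has a
    /// different URL for the same name.
    fn insert(&mut self, package: &str, url: &str) -> Result<(), String> {
        if let Some(existing) = self.urls.get(package) {
            if existing != url {
                return Err(format!(
                    "conflicting URLs for `{package}`: `{existing}` vs. `{url}`"
                ));
            }
            return Ok(()); // Same URL again: fine.
        }
        self.urls.insert(package.to_string(), url.to_string());
        Ok(())
    }
}
```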
We need to flatten out the requirements before transforming them into
PubGrub requirements, so that we get the full list of other requirements
which may contain a URL; this was changed in a previous PR: #4430.
## Complex Example
a:
```toml
dependencies = [
# Force a split
"anyio==4.3.0 ; python_version >= '3.12'",
"anyio==4.2.0 ; python_version < '3.12'",
# Include URLs transitively
"b"
]
```
b:
```toml
dependencies = [
# Only one is used in each split.
"b1 ; python_version < '3.12'",
"b2 ; python_version >= '3.12'",
"b3 ; python_version >= '3.12'",
]
```
b1:
```toml
dependencies = [
"iniconfig @ b3c12c6d70/iniconfig-1.1.1-py2.py3-none-any.whl",
]
```
b2:
```toml
dependencies = [
"iniconfig @ 62565a6e1c/iniconfig-2.0.0-py3-none-any.whl",
]
```
b3:
```toml
dependencies = [
"iniconfig @ e96292c7f7/iniconfig-1.1.0.tar.gz",
]
```
In this example, all packages are URL requirements (directory
requirements) and the root package is `a`. We first split on `a`, with
`b` in each split. In the first fork, we reach `b1`: the fork URLs are
empty, we insert the iniconfig 1.1.1 URL, and then we skip over `b2` and
`b3` since their markers are disjoint with the fork markers. In the
second fork, we skip over `b1`, visit `b2`, insert the iniconfig 2.0.0
URL into the again-empty fork URLs, then visit `b3` and try to insert
the iniconfig 1.1.0 URL. At this point, we find a conflict for the
iniconfig URL and error.
## Closing
The git tests are slow, but they make the best example for different URL
types I could find.
Part of #3927. This PR does not handle `Locals` or pre-releases yet.
## Summary
I think we should be able to model PubGrub such that this isn't
necessary (at least for the case described in the issue), but for now,
let's just avoid attempting to build very old distributions in
prefetching.
Closes https://github.com/astral-sh/uv/issues/4136.
## Summary
This PR adds a lowering similar to that seen in
https://github.com/astral-sh/uv/pull/3100, but this time, for markers.
Like `PubGrubPackageInner::Extra`, we now have
`PubGrubPackageInner::Marker`. The dependencies of the `Marker` are
`PubGrubPackageInner::Package` with and without the marker.
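Sketch of the lowering (variants abridged, fields illustrative): `Marker` is a proxy package, like `Extra`. Its two dependencies are the base `Package` with and without the marker, at the same version, so choosing a version for the marker variant immediately constrains the base package.
```rust
enum PubGrubPackageInner {
    Root,
    Package {
        name: String,
        /// e.g. Some("python_version > '3.7'") for a marker-qualified package.
        marker: Option<String>,
    },
    Extra {
        name: String,
        extra: String,
    },
    /// New: a marker-qualified proxy whose dependencies are the base
    /// package both with and without the marker.
    Marker {
        name: String,
        marker: String,
    },
}
```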
As an example of why this is useful: assume we have `urllib3>=1.22.0` as
a direct dependency. Later, we see `urllib3 ; python_version > '3.7'` as
a transitive dependency. As-is, we might (for some reason) pick a very
old version of `urllib3` to satisfy `urllib3 ; python_version > '3.7'`,
then attempt to fetch its dependencies, which could even involve
building a very old version of `urllib3 ; python_version > '3.7'`. Once
we fetch the dependencies, we would see that `urllib3` at the same
version is _also_ a dependency (because we tack it on). In the new
scheme though, as soon as we "choose" the very old version of `urllib3 ;
python_version > '3.7'`, we'd then see that `urllib3` (the base package)
is also a dependency; so we see a conflict before we even fetch the
dependencies of the old variant.
With this, I can successfully resolve the case in #4099.
Closes https://github.com/astral-sh/uv/issues/4099.
## Summary
Externally, development dependencies are currently structured as a flat
list of PEP 508-compatible requirements:
```toml
[tool.uv]
dev-dependencies = ["werkzeug"]
```
When locking, we lock all development dependencies; when syncing, users
can provide `--dev`.
Internally, though, we model them as dependency groups, similar to
Poetry, PDM, and [PEP 735](https://peps.python.org/pep-0735). This
enables us to change out the user-facing frontend without changing the
internal implementation, once we've decided how these should be exposed
to users.
A few important decisions encoded in the implementation (which we can
change later):
1. Groups are enabled globally, for all dependencies. This differs from
extras, which are enabled on a per-requirement basis. Note, however,
that we'll only discover groups for uv-enabled packages anyway.
2. Installing a group requires installing the base package. We rely on
this in PubGrub to ensure that we resolve to the same version (even
though we only expect groups to come from workspace dependencies anyway,
which are unique). But anyway, that's encoded in the resolver right now,
just as it is for extras.
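A guessed sketch of the internal model (names illustrative; the real types differ): dev-dependencies are stored as one named group rather than a special flat list, so a different frontend (e.g. PEP 735 dependency-groups) can map onto the same model later.
```rust
use std::collections::BTreeMap;

struct Manifest {
    /// Regular `project.dependencies`.
    dependencies: Vec<String>,
    /// Dependency groups, keyed by name; `dev` is just one such group.
    dependency_groups: BTreeMap<String, Vec<String>>,
}

/// The `dev` group, if present; locking considers all groups, syncing
/// enables them selectively (today, via `--dev`).
fn dev_group(manifest: &Manifest) -> &[String] {
    manifest
        .dependency_groups
        .get("dev")
        .map(Vec::as_slice)
        .unwrap_or(&[])
}
```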
This addresses the lack of marker support in prior commits.
Specifically, we add them as a new field to `AnnotatedDist`, and from
there, they get added to a `Distribution` in a `Lock`.
PubGrub stores incompatibilities as (package name, version range)
tuples, meaning it needs to clone the package name for each
incompatibility and for each non-borrowed operation on
incompatibilities. https://github.com/astral-sh/uv/pull/3673 made me
realize that `PubGrubPackage` has gotten large (expensive to copy), so,
like `Version` and other structs, I've added an `Arc` wrapper around it.
It's a pity clippy forbids `.deref()`; it's less opaque than `&**` and
has IDE support (clicking on `.deref()` jumps to the right impl).
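The pattern is the usual `Arc` newtype plus `Deref` (a sketch; the real inner enum has more variants):
```rust
use std::ops::Deref;
use std::sync::Arc;

/// Cloning is now a cheap pointer bump instead of a deep copy.
#[derive(Clone)]
pub struct PubGrubPackage(Arc<PubGrubPackageInner>);

pub enum PubGrubPackageInner {
    Root,
    Package { name: String, extra: Option<String> },
}

impl Deref for PubGrubPackage {
    type Target = PubGrubPackageInner;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}
```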
## Benchmarks
It looks like this matters most for complex resolutions, which I assume
is because they carry larger `PubGrubPackageInner::Package` and
`PubGrubPackageInner::Extra` values.
```bash
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/jupyter.in" "./uv-branch pip compile -q ./scripts/requirements/jupyter.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/airflow.in" "./uv-branch pip compile -q ./scripts/requirements/airflow.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/boto3.in" "./uv-branch pip compile -q ./scripts/requirements/boto3.in"
```
```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/jupyter.in
Time (mean ± σ): 18.2 ms ± 1.6 ms [User: 14.4 ms, System: 26.0 ms]
Range (min … max): 15.8 ms … 22.5 ms 181 runs
Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/jupyter.in
Time (mean ± σ): 17.8 ms ± 1.4 ms [User: 14.4 ms, System: 25.3 ms]
Range (min … max): 15.4 ms … 23.1 ms 159 runs
Summary
./uv-branch pip compile -q ./scripts/requirements/jupyter.in ran
1.02 ± 0.12 times faster than ./uv-main pip compile -q ./scripts/requirements/jupyter.in
```
```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/airflow.in
Time (mean ± σ): 153.7 ms ± 3.5 ms [User: 165.2 ms, System: 157.6 ms]
Range (min … max): 150.4 ms … 163.0 ms 19 runs
Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/airflow.in
Time (mean ± σ): 123.9 ms ± 4.6 ms [User: 152.4 ms, System: 133.8 ms]
Range (min … max): 118.4 ms … 138.1 ms 24 runs
Summary
./uv-branch pip compile -q ./scripts/requirements/airflow.in ran
1.24 ± 0.05 times faster than ./uv-main pip compile -q ./scripts/requirements/airflow.in
```
```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/boto3.in
Time (mean ± σ): 327.0 ms ± 3.8 ms [User: 344.5 ms, System: 71.6 ms]
Range (min … max): 322.7 ms … 334.6 ms 10 runs
Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/boto3.in
Time (mean ± σ): 311.2 ms ± 3.1 ms [User: 339.3 ms, System: 63.1 ms]
Range (min … max): 307.8 ms … 317.0 ms 10 runs
Summary
./uv-branch pip compile -q ./scripts/requirements/boto3.in ran
1.05 ± 0.02 times faster than ./uv-main pip compile -q ./scripts/requirements/boto3.in
```
## Summary
This PR introduces parallelism to the resolver. Specifically, we can
perform PubGrub resolution on a separate thread, while keeping all I/O
on the tokio thread. We already have the infrastructure set up for this
with the channel and `OnceMap`, which makes this change relatively
simple. The big change needed to make this possible is removing the
lifetimes on some of the types that need to be shared between the
resolver and pubgrub thread.
A related PR, https://github.com/astral-sh/uv/pull/1163, found that
adding `yield_now` calls improved throughput. With optimal scheduling we
might be able to get away with everything on the same thread here.
However, in the ideal pipeline with perfect prefetching, the resolution
and prefetching can run completely in parallel without depending on one
another. While this would be very difficult to achieve, even with our
current prefetching pattern we see a consistent performance improvement
from parallelism.
This does also require reverting a few of the changes from
https://github.com/astral-sh/uv/pull/3413, but not all of them. The
sharing is isolated to the resolver task.
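A condensed sketch of the split (hypothetical message types and channel protocol, not uv's actual code): the CPU-bound solver runs on its own OS thread and talks to the async I/O side through channels, so solving and fetching can overlap.
```rust
use tokio::sync::mpsc::unbounded_channel;

#[derive(Debug)]
struct MetadataRequest(String); // package name to fetch

#[derive(Debug)]
struct MetadataResponse(String, Vec<String>); // name, dependencies

#[tokio::main]
async fn main() {
    let (req_tx, mut req_rx) = unbounded_channel::<MetadataRequest>();
    let (resp_tx, resp_rx) = std::sync::mpsc::channel::<MetadataResponse>();

    // Solver thread: CPU-bound resolution loop, blocking on responses.
    let solver = std::thread::spawn(move || {
        req_tx.send(MetadataRequest("anyio".into())).ok();
        if let Ok(MetadataResponse(name, deps)) = resp_rx.recv() {
            println!("solved {name} with deps {deps:?}");
        }
        // Dropping `req_tx` here lets the I/O loop below terminate.
    });

    // I/O side: fulfill metadata requests on the tokio runtime.
    while let Some(MetadataRequest(name)) = req_rx.recv().await {
        let deps = vec!["idna".to_string()]; // stand-in for a registry fetch
        if resp_tx.send(MetadataResponse(name, deps)).is_err() {
            break;
        }
    }
    solver.join().unwrap();
}
```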
## Test Plan
On smaller tasks performance is mixed with ~2% improvements/regressions
on both sides. However, on medium-large resolution tasks we see the
benefits of parallelism, with improvements anywhere from 10-50%.
```
./scripts/requirements/jupyter.in
Benchmark 1: ./target/profiling/baseline (resolve-warm)
Time (mean ± σ): 29.2 ms ± 1.8 ms [User: 20.3 ms, System: 29.8 ms]
Range (min … max): 26.4 ms … 36.0 ms 91 runs
Benchmark 2: ./target/profiling/parallel (resolve-warm)
Time (mean ± σ): 25.5 ms ± 1.0 ms [User: 19.5 ms, System: 25.5 ms]
Range (min … max): 23.6 ms … 27.8 ms 99 runs
Summary
./target/profiling/parallel (resolve-warm) ran
1.15 ± 0.08 times faster than ./target/profiling/baseline (resolve-warm)
```
```
./scripts/requirements/boto3.in
Benchmark 1: ./target/profiling/baseline (resolve-warm)
Time (mean ± σ): 487.1 ms ± 6.2 ms [User: 464.6 ms, System: 61.6 ms]
Range (min … max): 480.0 ms … 497.3 ms 10 runs
Benchmark 2: ./target/profiling/parallel (resolve-warm)
Time (mean ± σ): 430.8 ms ± 9.3 ms [User: 529.0 ms, System: 77.2 ms]
Range (min … max): 417.1 ms … 442.5 ms 10 runs
Summary
./target/profiling/parallel (resolve-warm) ran
1.13 ± 0.03 times faster than ./target/profiling/baseline (resolve-warm)
```
```
./scripts/requirements/airflow.in
Benchmark 1: ./target/profiling/baseline (resolve-warm)
Time (mean ± σ): 478.1 ms ± 18.8 ms [User: 482.6 ms, System: 205.0 ms]
Range (min … max): 454.7 ms … 508.9 ms 10 runs
Benchmark 2: ./target/profiling/parallel (resolve-warm)
Time (mean ± σ): 308.7 ms ± 11.7 ms [User: 428.5 ms, System: 209.5 ms]
Range (min … max): 287.8 ms … 323.1 ms 10 runs
Summary
./target/profiling/parallel (resolve-warm) ran
1.55 ± 0.08 times faster than ./target/profiling/baseline (resolve-warm)
```
Our current flow of data from "simple registry package" to "final
resolved distribution" goes through a number of types:
* `SimpleMetadata` is the API response from a registry that includes all
published versions for a package. Each version has an assortment of
metadata associated with it.
* `VersionFiles` is the aforementioned metadata. It is split in two: a
group of files for source distributions and a group of files for wheels.
* `PrioritizedDist` collects a subset of the files from `VersionFiles`
to form a selection of the "best" sdist and the "best" wheel for the
current environment.
* `CompatibleDist` is created from a borrowed `PrioritizedDist` that,
perhaps among other things, encapsulates the decision of whether to pick
an sdist or a wheel. (This decision depends both on compatibility and
the action being performed. e.g., When doing installation, a
`CompatibleDist` will sometimes select an sdist over a wheel.)
* `ResolvedDistRef` is like a `ResolvedDist`, but borrows a `Dist`.
* `ResolvedDist` is the almost-final-form of a distribution in a
resolution and is created from a `ResolvedDistRef`.
* `AnnotatedResolvedDist` is a new data type that is the actual final
form of a distribution that a universal lock file cares about. It
bundles a `ResolvedDist` with some metadata needed to generate a lock
file.
One of the requirements of a universal lock file is that we include all
wheels (and maybe all source distributions? but at least one if it's
present) associated with a distribution. But the above flow of data (in
the step from `VersionFiles` to `PrioritizedDist`) drops all wheels
except for the best one.
To remedy this, in this PR, we rejigger `PrioritizedDist`,
`CompatibleDist` and `ResolvedDistRef` so that all wheel data is
preserved. And when a `ResolvedDistRef` is finally turned into a
`ResolvedDist`, we copy all of the wheel data. And finally, we adjust
the `Lock` constructor to read this new data and include it in the lock
file. To make this work, we also modify `RegistryBuiltDist` so that it
can contain one or more wheels instead of just one.
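Roughly, the shape change looks like this (a sketch; field names are my guesses):
```rust
pub struct RegistryBuiltWheel {
    // filename, URL, hashes, ...
}

/// `RegistryBuiltDist` now carries every wheel for the version, plus
/// which one is "best" for the current environment, instead of a single
/// wheel.
pub struct RegistryBuiltDist {
    /// All wheels for this version, so the lock file can list each of them.
    pub wheels: Vec<RegistryBuiltWheel>,
    /// Index into `wheels` of the best wheel for the current environment.
    pub best_wheel_index: usize,
}
```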
One shortcoming here (called out in the code as a FIXME) is that if a
source distribution is selected as the "best" thing to use (perhaps
there are no compatible wheels), then the wheels won't end up in the
lock file. I plan to fix this in a follow-up PR.
We also aren't totally consistent on source distribution naming.
Sometimes we use `sdist`. Sometimes `source`. Sometimes `source_dist`.
I think it'd be nice to just use `sdist` everywhere, but I do prefer
the type names to be `SourceDist`. And sometimes you want function
names to match the type names (i.e., `from_source_dist`), which in turn
leads to an appearance of inconsistency. I'm open to ideas.
Closes #3351
## Summary
This PR enables `--require-hashes` with unnamed requirements. The key
change is that `PackageId` becomes `VersionId` (since it refers to a
package at a specific version), and the new `PackageId` consists of
_either_ a package name _or_ a URL. The hashes are keyed by `PackageId`,
so we can generate the `RequiredHashes` before we have names for all
packages, and enforce them throughout.
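In sketch form (abridged): `VersionId` pins a package at a version, while the new `PackageId` is either a package name or, for unnamed requirements, the URL itself, so hashes can be keyed before every package has a name.
```rust
enum PackageId {
    /// A named package, once we know its name.
    Name(String),
    /// An unnamed requirement, identified by its URL alone.
    Url(String),
}

struct VersionId {
    name: String,
    version: String,
}
```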
Closes #2979.
## Summary
The prefetcher tallies the number of times we tried a given package, and
then once we hit a threshold, grabs the version map, assuming it's
already been fetched. For direct URL distributions, though, we don't
have a version map! And there's no need to prefetch.
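Conceptually, the guard is just (illustrative types):
```rust
enum Dist {
    Registry { name: String, version: String },
    /// A direct URL (or path/git) distribution: exactly one "version",
    /// so there is no version map to page through.
    DirectUrl { url: String },
}

/// Only registry distributions are worth prefetching.
fn should_prefetch(dist: &Dist) -> bool {
    matches!(dist, Dist::Registry { .. })
}
```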
Closes https://github.com/astral-sh/uv/issues/2941.
With PubGrub being fast for complex ranges, we can now compute the next
n candidates without taking a performance hit. This speeds up the
cold-cache `urllib3<1.25.4` `boto3` resolution from maybe 40s–50s to
~2s. See the docstrings for details on the heuristics.
**Before**

**After**

---
The prefetching needs two parts: first looking for compatible versions,
then falling back to flat next versions. After we've selected a boto3
version, there is only one compatible botocore version remaining, so we
won't find other compatible candidates for prefetching. We see this as a
pattern where we only prefetch boto3 (stacked bars), but not botocore
(sequential requests between the stacked bars).

The risk is that we're completely wrong with the guess and cause a lot
of useless network requests. I think this is acceptable, since this
mechanism only triggers when we're already on a bad path, and we should
simply have fetched all versions after a few seconds (assuming a fast
index like PyPI).
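In sketch form, the two phases look something like this (illustrative signatures, not the real candidate selector):
```rust
/// Pick up to `n` versions to prefetch: prefer candidates still
/// compatible with the current range; once those run out, fall back to
/// simply walking the next versions in order.
fn prefetch_candidates(
    all_versions: &[u64], // newest-first; stand-in for real versions
    is_compatible: impl Fn(u64) -> bool,
    n: usize,
) -> Vec<u64> {
    // Phase 1: compatible versions (e.g. the remaining botocore candidates).
    let compatible: Vec<u64> = all_versions
        .iter()
        .copied()
        .filter(|&v| is_compatible(v))
        .take(n)
        .collect();
    if !compatible.is_empty() {
        return compatible;
    }
    // Phase 2: no compatible candidates left; guess the flat next versions.
    all_versions.iter().copied().take(n).collect()
}
```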
---
It would be even better if the PubGrub state were copy-on-write so we
could simulate more progress than we actually have; currently, we're
guessing what the next version is, which could be completely wrong, but
I think this is still a valuable heuristic.
Fixes #170.