Background reading: https://github.com/astral-sh/uv/issues/8157
Companion PR: https://github.com/astral-sh/pubgrub/pull/36
Requires for test coverage: https://github.com/astral-sh/packse/pull/230
When two packages A and B conflict, we have the option to choose a lower
version of A, or a lower version of B. Currently, we determine this by
the order we saw a package (assuming equal specificity of the
requirement): If we saw A before B, we pin A until all versions of B are
exhausted. This can lead to undesirable outcomes, from cases where it's
just slow (sentry) to others cases without lower bounds where be
backtrack to a very old version of B. This old version may fail to build
(terminating the resolution), or it's a version so old that it doesn't
depend on A (or the shared conflicting package) anymore - but also is
too old for the user's application (fastapi). #8157 collects such cases,
and the `wrong-backtracking` packse scenario contains a minimized
example.
We try to solve this by tracking which packages are "A"s, culprits, and
"B"s, affected, and manually interfering with project selection and
backtracking. Whenever a version we just chose is rejected, we give the
current package a counter for being affected, and the package it
conflicted with a counter for being a culprit. If a package accumulates
more counts than a threshold, we reprioritize: Undecided after the
culprits, after the affected, after packages that only have a single
version (URLs, `==<version>`). We then ask pubgrub to backtrack just
before the culprit. Due to the changed priorities, we now select package
B, the affected, instead of package A, the culprit.
To do this efficiently, we ask pubgrub for the incompatibility that
caused backtracking, or just the last version to be discarded (due to
its dependencies). For backtracking, we use the last incompatibility
from unit propagation as a heuristic. When a version is discarded
because one of its dependencies conflicts with the partial solution, the
incompatibility tells us the package in the partial solution that
conflicted.
We only backtrack once per package, on the first time it passes the
threshold. This prevents backtracking loops in which we make the same
decisions over and over again. But we also changed the priority, so that
we shouldn't take the same path even after the one time we backtrack (it
would defeat the purpose of this change).
There are some parameters that can be tweaked: Currently, the threshold
is set to 5, which feels not too eager with so me of the conflicts that
we want to tolerate but also changes strategies quickly. The relative
order of the new priorities can also be changed, as for each (A, B) pair
the priority of B is afterwards lower than that for A. Currently,
culprits capture conflict for the whole package, but we could limit that
to a specific version. We could discard conflict counters after
backtracking instead of keeping them eternally as we do now. Note that
we're always taking about pairs (A, B), but in practice we track
individual packages, not pairs.
A case that we wouldn't capture is when B is only introduced to the
dependency graph after A, but I think that would require cyclical
dependency for A and B to conflict? There may also be cases where
looking at the last incompatibility is insufficient.
Another example that we can't repair with prioritization is
urllib3/boto3/botocore: We actually have to check all the newer versions
of boto3 and botocore to identify the version that allows with the older
urllib3, no shortcuts allowed.
```
urllib3<1.25.4
boto3
```
All examples I tested were cases with two packages where we only had to
switch the order, so I've abstracted them into a single packse case.
This PR changes the resolution for certain paths, and there is the risk
for regressions.
Fixes#8157
---
All tested examples improved.
Input fastapi:
```text
starlette<=0.36.0
fastapi<=0.115.2
```
```
# BEFORE
$ uv pip --no-progress compile -p 3.11 --exclude-newer 2024-10-01 --no-annotate debug/fastapi.txt
annotated-types==0.7.0
anyio==4.6.0
fastapi==0.1.17
idna==3.10
pydantic==2.9.2
pydantic-core==2.23.4
sniffio==1.3.1
starlette==0.36.0
typing-extensions==4.12.2
# AFTER
$ cargo run --profile fast-build --no-default-features pip compile -p 3.11 --no-progress --exclude-newer 2024-10-01 --no-annotate debug/fastapi.txt
annotated-types==0.7.0
anyio==4.6.0
fastapi==0.109.1
idna==3.10
pydantic==2.9.2
pydantic-core==2.23.4
sniffio==1.3.1
starlette==0.35.1
typing-extensions==4.12.2
```
Input xarray:
```text
xarray[accel]
```
```
# BEFORE
$ uv pip --no-progress compile -p 3.11 --exclude-newer 2024-10-01 --no-annotate debug/xarray-accel.txt
bottleneck==1.4.0
flox==0.9.13
llvmlite==0.36.0
numba==0.53.1
numbagg==0.8.2
numpy==2.1.1
numpy-groupies==0.11.2
opt-einsum==3.4.0
packaging==24.1
pandas==2.2.3
python-dateutil==2.9.0.post0
pytz==2024.2
scipy==1.14.1
setuptools==75.1.0
six==1.16.0
toolz==0.12.1
tzdata==2024.2
xarray==2024.9.0
# AFTER
$ cargo run --profile fast-build --no-default-features pip compile -p 3.11 --no-progress --exclude-newer 2024-10-01 --no-annotate debug/xarray-accel.txt
bottleneck==1.4.0
flox==0.9.13
llvmlite==0.43.0
numba==0.60.0
numbagg==0.8.2
numpy==2.0.2
numpy-groupies==0.11.2
opt-einsum==3.4.0
packaging==24.1
pandas==2.2.3
python-dateutil==2.9.0.post0
pytz==2024.2
scipy==1.14.1
six==1.16.0
toolz==0.12.1
tzdata==2024.2
xarray==2024.9.0
```
Input sentry: The resolution is identical, but arrived at much faster:
main tries 69 versions (sentry-kafka-schemas: 63), PR tries 12 versions
(sentry-kafka-schemas: 6; 5 times conflicting, then once the right
version).
```text
python-rapidjson<=1.20,>=1.4
sentry-kafka-schemas<=0.1.113,>=0.1.50
```
```
# BEFORE
$ uv pip --no-progress compile -p 3.11 --exclude-newer 2024-10-01 --no-annotate debug/sentry.txt
fastjsonschema==2.20.0
msgpack==1.1.0
python-rapidjson==1.8
pyyaml==6.0.2
sentry-kafka-schemas==0.1.111
typing-extensions==4.12.2
# AFTER
$ cargo run --profile fast-build --no-default-features pip compile -p 3.11 --no-progress --exclude-newer 2024-10-01 --no-annotate debug/sentry.txt
fastjsonschema==2.20.0
msgpack==1.1.0
python-rapidjson==1.8
pyyaml==6.0.2
sentry-kafka-schemas==0.1.111
typing-extensions==4.12.2
```
Input apache-beam
```text
# Run on Python 3.10
dill<0.3.9,>=0.2.2
apache-beam<=2.49.0
```
```
# BEFORE
$ uv pip --no-progress compile -p 3.10 --exclude-newer 2024-10-01 --no-annotate debug/apache-beam.txt
× Failed to download and build `apache-beam==2.0.0`
╰─▶ Build backend failed to determine requirements with `build_wheel()` (exit status: 1)
# AFTER
$ cargo run --profile fast-build --no-default-features pip compile -p 3.10 --no-progress --exclude-newer 2024-10-01 --no-annotate debug/apache-beam.txt
apache-beam==2.49.0
certifi==2024.8.30
charset-normalizer==3.3.2
cloudpickle==2.2.1
crcmod==1.7
dill==0.3.1.1
dnspython==2.6.1
docopt==0.6.2
fastavro==1.9.7
fasteners==0.19
grpcio==1.66.2
hdfs==2.7.3
httplib2==0.22.0
idna==3.10
numpy==1.24.4
objsize==0.6.1
orjson==3.10.7
proto-plus==1.24.0
protobuf==4.23.4
pyarrow==11.0.0
pydot==1.4.2
pymongo==4.10.0
pyparsing==3.1.4
python-dateutil==2.9.0.post0
pytz==2024.2
regex==2024.9.11
requests==2.32.3
six==1.16.0
typing-extensions==4.12.2
urllib3==2.2.3
zstandard==0.23.0
```
## Summary
This PR makes the behavior in https://github.com/astral-sh/uv/pull/9827
the default: we try to select the latest supported package version for
each supported Python version, but we still optimize for choosing fewer
versions when stratifying by platform.
However, you can opt out with `--fork-strategy fewest`.
Closes https://github.com/astral-sh/uv/issues/7190.
## Summary
This PR addresses a significant limitation in the resolver whereby we
avoid choosing the latest versions of packages when the user supports a
wider range.
For example, with NumPy, the latest versions only support Python 3.10
and later. If you lock a project with `requires-python = ">=3.8"`, we
pick the last NumPy version that supported Python 3.8, and use that for
_all_ Python versions. So you get `1.24.4` for all versions, rather than
`2.2.0`. And we'll never upgrade you unless you bump your
`requires-python`. (Even worse, those versions don't have wheels for
Python 3.12, etc., so you end up building from source.)
(As-is, this is intentional. We optimize for minimizing the number of
selected versions, and the current logic does that well!)
Instead, we know recognize when a version has an elevated
`requires-python` specifier and fork. This is a new fork point, since we
need to fork once we have the package metadata, as opposed to when we
see the dependencies.
In this iteration, I've made this behavior the default. I'm sort of
undecided on whether I want to push on that... Previously, I'd suggested
making it opt-in via a setting
(https://github.com/astral-sh/uv/pull/8686).
Closes https://github.com/astral-sh/uv/issues/8492.
## Summary
Very tricky problem whereby `workspace_root.join(path)` returns the
workspace root with a trailing slash if `path` is empty... This caused
us to accidentally _include_ excluded members during workspace
discovery, since (e.g.) `packages/seeds` doesn't match
`packages/seeds/`.
Closes
https://github.com/astral-sh/uv/issues/9832#issuecomment-2539121761.
Since we don't (currently) include conflict markers with our
`resolution-markers` in the lock file, it's possible that we end up
with duplicate markers. This happens when the resolver creates more
than one fork with the same PEP 508 markers but different conflict
markers, _and_ where those PEP 508 markers don't simplify to "always
true" after accounting for `requires-python`.
This change should be a strict improvement on the status quo. We aren't
removing any information. It is possible that we should be writing
conflict markers here (like we do for dependency edges), but I haven't
been able to come up with a case or think through a scenario where they
are necessary.
Fixes#9296
The resolver methods are already too large and complex, especially
`choose_version*`, so i wanted to shrink and simplify them a bit before
adding new methods to them.
I've split `MetadataResponse` into three variants: success, non-fatal
error (reported through pubgrub), fatal error (reported as error trace).
The resulting non-fatal `MetadataUnavailable` type is equivalent to the
`IncompletePackage` type, so they are now merged. (`UnavailableVersion`
is a bit different since, besides the extra `IncompatibleDist` variant,
it have no error source attached). This shows that the missing metadata
variant was unused, which I removed.
Tagging as error messages for the logging format changes.
This PR adds a notion of "conflict markers" to the lock file as an
attempt to address #9289. The idea is to encode a new kind of boolean
expression indicating how to choose dependencies based on which extras
are activated.
As an example of what conflict markers look like, consider one of the
cases
brought up in #9289, where `anyio` had unconditional dependencies on
two different versions of `idna`. Now, those are gated by markers, like
this:
```toml
[[package]]
name = "anyio"
version = "4.3.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "idna", version = "3.5", source = { registry = "https://pypi.org/simple" }, marker = "extra == 'extra-7-project-foo'" },
{ name = "idna", version = "3.6", source = { registry = "https://pypi.org/simple" }, marker = "extra == 'extra-7-project-bar' or extra != 'extra-7-project-foo'" },
{ name = "sniffio" },
]
```
The odd extra values like `extra-7-project-foo` are an encoding of not
just the conflicting extra (`foo`) but also the package it's declared
for (`project`). We need both bits of information because different
packages may have the same extra name, even if they are completely
unrelated. The `extra-` part is a prefix to distinguish it from groups
(which, in this case, would be encoded as `group-7-project-foo` if `foo`
were a dependency group). And the `7` part indicates the length of the
package name which makes it possible to parse out the package and extra
name from this encoding. (We don't actually utilize that property, but
it seems like good sense to do it in case we do need to extra
information from these markers.)
While this preserves PEP 508 compatibility at a surface level, it does
require utilizing this encoding scheme in order
to evaluate them when they're present (which only occurs when
conflicting extras/groups are declared).
My sense is that the most complex part of this change is not just adding
conflict markers, but their simplification. I tried to address this in
the code comments and commit messages.
Reviewers should look at this commit-by-commit.
Fixes#9289, Fixes#9546, Fixes#9640, Fixes#9622, Fixes#9498, Fixes
#9701, Fixes#9734
## Summary
Sort of ridiculous, but today this passes, when it should fail:
```toml
[project]
name = "foo"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13.0"
dependencies = []
[project.optional-dependencies]
async = [
"foo[async]==0.2.0",
]
```
Instead of modifying the error to replace a dummy derivation chain from
construction with the real one, build the error with the real derivation
chain directly.
This came up when trying to improve the build error reporting.
Introduces `DistErrorKind` to avoid error variants for each case that
are only different in one line of the message.
In https://github.com/astral-sh/uv/issues/8155#issuecomment-2508969900,
resolution lowest was complaining about missing lower bounds for a
pacakge, even though the package had a URL, too:
```
uv pip install dist/pymatgen-2024.10.3.tar.gz pymatgen[ci,optional] --resolution=lowest
```
The error was raised from `pymatgen[ci,optional]`, because we were
looking at it before looking at the "URL"
`dist/pymatgen-2024.10.3.tar.gz`.
I've also added constraints and overrides to the bounds lookup, since
they are missing from the dependency graph.
Fixes#8155 (again)
When encountering `dynamic = ["version"]` in the pyproject.toml of a
source dist, we can ignore that and treat it as a statically known
metadata distribution, since the filename tells us the version and that
version must not change on build.
This fixed locking PyGObject 3.50.0 from `pygobject-3.50.0.tar.gz`
(minimized):
```toml
[project]
name = "PyGObject"
description = "Python bindings for GObject Introspection"
requires-python = ">=3.9, <4.0"
dependencies = [
"pycairo>=1.16"
]
dynamic = ["version"]
```
Afterwards, `uv add --no-sync toga` passes on Ubuntu 24.04 without the
pygobject build deps, when previously it needed `{ name = "pygobject",
version = "3.50.0", requires-dist = [], requires-python = ">=3.9" }`.
I've added a check that source distribution versions are respected after
build.
Fixes#9548
## Summary
Today, our dependency group implementation is a little awkward... For
each package `P`, we check if `P` contains dependencies for each enabled
group, then add a dependency on `P` with the group enabled. There are a
few issues here:
1. It's sort of backwards... We add a dependency from the base package
`P` to `P` with the group enabled. Then `P` with the group enabled adds
a dependency on the base package.
2. We can't, e.g., enable different groups for different packages. (We
don't have a way for users to specify this on the CLI, but there's no
reason that it should be _impossible_ in the resolver.)
3. It's inconsistent with how extras work, which leads to confusing
differences in the resolver.
Instead, our internal requirement type can now include dependency
groups, which makes dependency groups look much, much more like extras
in the resolver.
## Summary
Discovered while working on https://github.com/astral-sh/uv/issues/9516.
In the linked repo, the root uses a `../dependency` path for the
workspace member, which we weren't normalizing.
This _partially_ unwinds the optimization in #9540 by adding back the
base package dependency as a sibling to the extra package dependency
in some cases. Specifically, this occurs when _any_ of the extras are
declared as conflicting.
This is believed to be necessary (until another method is found) to
handle the forking logic based on conflicts. Namely, the forking logic
depends on the base and extra packages being sibling dependencies. If
only the extra is present, then it won't be included in the fork that
excludes all conflicting extras. And that means the base package won't
either, even though it should be included in that fork in some cases. If
the base package dependency is deferred, then it will never be reached.
This also adds another test and updates the snapshots that would have
caught the regression in #9540 if the conflict tests had been enabled.
## Summary
Previously, when we encountered `foo[bar]`, we'd add a dependency on
`PubGrubPackage::Package` for `foo`, and then `PubGrubPackage::Extra`
for `foo[bar]`.
Later, when we ask for the dependencies of the `PubGrubPackage::Extra`,
we add `PubGrubPackage::Package` for `foo`, and
`PubGrubPackage::Package` for `foo[bar]`. This is an intentional
strategy because it ensures that PubGrub "knows" that these have to be
solved to the same version as early as possible.
It turns out that the first part here ("add a dependency on
`PubGrubPackage::Package` for `foo`") is suboptimal, because it means
PubGrub might try to solve _just_ `foo` without realizing that it also
has to accommodate all the constraints from the extra.
Instead, we now add _just_ `PubGrubPackage::Extra` for `foo[bar]`, and
defer adding the base package. It looks like this leads to a far more
efficient solve for Airflow.
## Summary
When we serialize and deserialize the lockfile, we remove the conflict
markers. So in the linked case, the edges for the `tqdm` entries are
like:
```
complexified_marker: UniversalMarker {
pep508_marker: python_full_version >= '3.9.0',
conflict_marker: true,
},
```
However... when we evaluate in-memory, the conflict markers are still
there...
```
complexified_marker: UniversalMarker {
pep508_marker: true,
conflict_marker: extra == 't1' and extra != 't2',
},
```
So if `uv run` creates the lockfile, we evaluate this as `false`.
We should make this consistent, and I expect @BurntSushi is aware. But
for now, it's reasonable / correct to pass the extra when evaluating at
this specific point, since we know the dependency was enabled by the
marker.
Closes
https://github.com/astral-sh/uv/issues/9533#issuecomment-2508908591.
## Summary
A lot of good new lints, and most importantly, error stabilizations. I
tried to find a few usages of the new stabilizations, but I'm sure there
are more.
IIUC, this _does_ require bumping our MSRV.