We significantly regressed performance in some cases because we were
cloning the resolver state one more time than we needed to. That doesn't
sound like a lot, but in the case where there are no forks, it implies
we were cloning the state for every `get_dependencies` called when we
shouldn't have been cloning it at all.
Avoiding the clone results in somewhat tortured code. This can probably
be refactored by moving bits out to a helper routine, but that also
seemed non-trivial. So we let this suffice for now.
This addresses the lack of marker support in prior commits.
Specifically, we add them as a new field to `AnnotatedDist`, and from
there, they get added to a `Distribution` in a `Lock`.
This commit is a pretty invasive change that implements the merging
of resolutions created by each fork of the resolver.
The main idea here is that each `SolveState` is converted into a
`Resolution` (a new type) and stored on the heap after its fork
completes. When all forks complete, they are all merged into a single
`Resolution`. This `Resolution` is then used to build a `ResolutionGraph`.
Construction of `ResolutionGraph` mostly stays the same (despite the
gnarly diff due to an indent change) with one exception: the code to
extract dependency edges out of PubGrub's state has been moved to
`SolveState::into_resolution`. The idea here is that once a fork
completes, we extract what we need from the PubGrub state and then
throw it away. We store these edges in our own intermediate type which
is then converted into petgraph edges in the `ResolutionGraph`
constructor.
One interesting change we make here is that our edge
data is now a `Version` instead of a `Range<Version>`. I don't think
`Range<Version>` was actually being used anywhere, so this seems okay?
In any case, I think `Version` here is correct because a resolution
corresponds to specific dependencies of each package. Moreover, I didn't
see an easy way to make things work with `Range<Version>`. Notably,
since we no longer have the guarantee that there is only one version of
each package, we need to use `(PackageName, Version)` instead of just
`PackageName` for inverted lookups in `ResolutionGraph::from_state`.
Finally, the main resolver loop itself is changed a bit to track all
forked resolutions and then merge them at the end.
Note that we don't really have any dealings with markers in this commit.
We'll get to that in a subsequent commit.
This changes the constructor to just take an `InMemoryIndex`
directly instead of the constituent parts. No real reason other
than it seems a little simpler.
There are still some TODOs/FIXMEs here, but this makes represents a
chunk of the resolver refactoring to enable forking. We don't do any
merging of resolutions yet, so crucially, this code is broken when no
marker environment is provided. But when a marker environment is
provided, this should behave the same as a non-forking resolver. In
particular, `get_dependencies_forking` is just `get_dependencies`
whenever there's a marker environment.
## Summary
Ensures that we avoid upgrading packages unless `--upgrade` or similar
is passed.
For now, the resolver only respects these for registry distributions.
Closes https://github.com/astral-sh/uv/issues/3918.
## Summary
This PR adds extras to the lockfile, and enables users to selectively
sync extras in `uv sync` and `uv run`. The end result here was fairly
simple, though it required a few refactors to get here. The basic idea
is that `DistributionId` now includes `extra: Option<ExtraName>`, so we
effectively treat extras as separate packages. Generating the lockfile,
and generating the resolution from the lockfile, fall out of this
naturally with no special-casing or additional changes.
The main downside here is that it bloats the lockfile significantly.
Specifically:
- We include _all_ distribution URLs and hashes for _every_ extra
variant.
- We include all dependencies for the extra variant, even though that
are dependencies of the base package.
We could normalize this representation by changing each distribution
have an `optional-dependencies` hash map that keys on extras, but we
actually don't have the information we need to create that right now
(specifically, we can't differentiate between dependencies that
_require_ the extra and dependencies on the base package).
Closes#3700.
## Summary
This PR just ensures that when running `uv lock` (or `uv run`), we lock
with all extras. When we later install, we'll also _install_ with all
extras, but that will be changed in a future PR.
## Summary
Today, we represent each package as a single node in the graph, and
combine all the extras. This is helpful for the `requirements.txt`-style
resolution, in which we want to show each a single line for each package
with the extras combined into a single array.
This PR modifies the representation to instead use a separate node for
each (package, extra) pair. We then reduce into the previous format when
printing in the `requirements.txt`-style format, so there shouldn't be
any user-facing changes here.
## Summary
There are a few behavior changes in here:
- We now enforce `--require-hashes` for editables, like pip. So if you
use `--require-hashes` with an editable requirement, we'll reject it. I
could change this if it seems off.
- We now treat source tree requirements, editable or not (e.g., both `-e
./black` and `./black`) as if `--refresh` is always enabled. This
doesn't mean that we _always_ rebuild them; but if you pass
`--reinstall`, then yes, we always rebuild them. I think this is an
improvement and is close to how editables work today.
Closes#3844.
Closes#2695.
## Summary
This PR makes a variety of invalid states unrepresentable by changing
`Preference` to require a `PackageName` and `Version`, rather than
accepting a generic `Requirement`. There should be no meaningful
behavior changes.
## Summary
We actually _already_ ignore these (preferences only apply to versions,
not URLs), it just happens later on. This PR thus just avoids crashing.
The behavior is unchanged.
Closes#3822.
## Summary
Related to https://github.com/astral-sh/uv/issues/3818. We should
_always_ include the package name if we know it's not a file path, even
if it starts with an environment variable.
## Summary
It turns out that in the
[spec](https://packaging.python.org/en/latest/specifications/binary-distribution-format/#file-name-convention),
if a wheel filename includes a build tag, then we need to use it to
break ties. This PR implements that behavior. (Previously, we dropped
the build tag entirely.)
Closes#3779.
## Test Plan
Run: `cargo run pip install -i https://pypi.anaconda.org/intel/simple
mkl_fft==1.3.8 --python-platform linux --python-version 3.10`. This now
resolves without error. Previously, we selected build tag 63 of
`mkl_fft==1.3.8`, which led to an incompatibility with NumPy. Now, we
select build tag 70.
When parsing requirements from any source, directly parse the url parts
(and reject unsupported urls) instead of parsing url parts at a later
stage. This removes a bunch of error branches and concludes the work
parsing url parts once and passing them around everywhere.
Many usages of the assembled `VerbatimUrl` remain, but these can be
removed incrementally.
Please review commit-by-commit.
## Summary
We now show yanks as part of the resolution diagnostics, so they now
appear for `sync`, `install`, `compile`, and any other operations.
Further, they'll also appear for cached packages (but not packages that
are _already_ installed).
Closes https://github.com/astral-sh/uv/issues/3768.
Closes#3766.
## Summary
This PR takes the functions used in `pip install`, moves them into a
common module, and then replaces all the `pip sync` logic with calls
into those functions. The net effect is that `pip install` and `pip
sync` share far more code and demonstrate much more consistent behavior.
Closes https://github.com/astral-sh/uv/issues/3555.
## Summary
This PR adds editables using a new source type (`editable+...`), and
then extracts the editables from the lockfile in `uv sync`.
Closes https://github.com/astral-sh/uv/issues/3695.
Updates our Python interpreter discovery to conform to the rules
described in #2386, please see that issue for a full description of the
behavior. Briefly, we now will search for interpreters that satisfy a
requested version without stopping at the first Python executable.
Additionally, if retrieving information about an interpreter fails we
will continue to search for a working interpreter. We also add the
plumbing necessary to request Python implementations other than CPython,
though we do not add support for other implementations at this time.
A major internal goal of this work is to prepare for user-facing managed
toolchains i.e. fetching a requested version during `uv run`. These APIs
are not introduced, but there is some managed toolchain handling as
required for our test suite.
Some noteworthy implementation changes:
- The `uv_interpreter::find_python` module has been removed in favor of
a `uv_interpreter::discovery` module.
- There are new types to help structure interpreter requests and track
sources
- Executable discovery is implemented as a big lazy iterator and is a
central authority for source precedence
- `uv_interpreter::Error` variants were split into scoped types in each
module
- There's much more unit test coverage, but not for Windows yet
Remaining work:
- [x] Write new test cases
- [x] Determine correct behavior around executables in the current
directory
- _Future_: Combine `PythonVersion` and `VersionRequest`
- _Future_: Consider splitting `ManagedToolchain` into local and remote
variants
- _Future_: Add Windows unit test coverage
- _Future_: Explore behavior around implementation precedence (i.e.
CPython over PyPy)
Refactors split into:
- #3329
- #3330
- #3331
- #3332Closes#2386
Instead of saying
> we can conclude that you require==0a0.dev0 and
pandas-stubs==2.0.3.230814 are incompatible.
we'll say
> we can conclude that your requirements and pandas-stubs==2.0.3.230814
are incompatible.
Closes#3710
I'm not sure how to get unit test coverage for this, might look into
that. Ideally we'd skip this branch entirely?
Pubgrub stores incompatibilities as (package name, version range)
tuples, meaning it needs to clone the package name for each
incompatibility, and each non-borrowed operation on incompatibilities.
https://github.com/astral-sh/uv/pull/3673 made me realize that
`PubGrubPackage` has gotten large (expensive to copy), so like `Version`
and other structs, i've added an `Arc` wrapper around it.
It's a pity clippy forbids `.deref()`, it's less opaque than `&**` and
has IDE support (clicking on `.deref()` jumps to the right impl).
## Benchmarks
It looks like this matters most for complex resolutions which, i assume
because they carry larger `PubGrubPackageInner::Package` and
`PubGrubPackageInner::Extra` types.
```bash
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/jupyter.in" "./uv-branch pip compile -q ./scripts/requirements/jupyter.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/airflow.in" "./uv-branch pip compile -q ./scripts/requirements/airflow.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/boto3.in" "./uv-branch pip compile -q ./scripts/requirements/boto3.in"
```
```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/jupyter.in
Time (mean ± σ): 18.2 ms ± 1.6 ms [User: 14.4 ms, System: 26.0 ms]
Range (min … max): 15.8 ms … 22.5 ms 181 runs
Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/jupyter.in
Time (mean ± σ): 17.8 ms ± 1.4 ms [User: 14.4 ms, System: 25.3 ms]
Range (min … max): 15.4 ms … 23.1 ms 159 runs
Summary
./uv-branch pip compile -q ./scripts/requirements/jupyter.in ran
1.02 ± 0.12 times faster than ./uv-main pip compile -q ./scripts/requirements/jupyter.in
```
```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/airflow.in
Time (mean ± σ): 153.7 ms ± 3.5 ms [User: 165.2 ms, System: 157.6 ms]
Range (min … max): 150.4 ms … 163.0 ms 19 runs
Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/airflow.in
Time (mean ± σ): 123.9 ms ± 4.6 ms [User: 152.4 ms, System: 133.8 ms]
Range (min … max): 118.4 ms … 138.1 ms 24 runs
Summary
./uv-branch pip compile -q ./scripts/requirements/airflow.in ran
1.24 ± 0.05 times faster than ./uv-main pip compile -q ./scripts/requirements/airflow.in
```
```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/boto3.in
Time (mean ± σ): 327.0 ms ± 3.8 ms [User: 344.5 ms, System: 71.6 ms]
Range (min … max): 322.7 ms … 334.6 ms 10 runs
Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/boto3.in
Time (mean ± σ): 311.2 ms ± 3.1 ms [User: 339.3 ms, System: 63.1 ms]
Range (min … max): 307.8 ms … 317.0 ms 10 runs
Summary
./uv-branch pip compile -q ./scripts/requirements/boto3.in ran
1.05 ± 0.02 times faster than ./uv-main pip compile -q ./scripts/requirements/boto3.in
```
<!--
Thank you for contributing to uv! To help us out with reviewing, please
consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR falls back to writing an unnamed requirement if it appears to be
a relative URL. pip is way more flexible when providing an unnamed
requirement than when providing a PEP 508 requirement. For example,
_only_ this works:
```
black @ file:///Users/crmarsh/workspace/uv/scripts/packages/black_editable
```
Any other form will fail.
Meanwhile, _all_ of these work:
```
file:///Users/crmarsh/workspace/uv/scripts/packages/black_editable
scripts/packages/black_editable
./scripts/packages/black_editable
file:./scripts/packages/black_editable
file:scripts/packages/black_editable
```
Closes https://github.com/astral-sh/uv/issues/3180.
This makes use of the newly added `Ord` impl on `PubGrubPackage` to make
the output of `format_terms` independent of hashmap iteration order.
This was already collecting the terms into an intermediate `Vec`, so
sorting probably isn't going to add any significant overhead here.
(Plus, this is only running when formatting an error message after a
solution could not be found, so an extra sort doesn't seem like a big
deal here.)
Note that some tests are updated in this commit as a result of this
change. As far as I can tell, the semantic meaning of the output remains
the same. But the order of the listed packages does not.
Specific thing motivating this change is, in a subsequent, I added
`Option<MarkerTree>` to `PubGrubPackage::Package`, and this caused
similar changes in test output. So I backtracked and isolated this
change from the addition of `Option<MarkerTree>`.
It turns out that we use PubGrubPackage as the key in hashmaps in a fair
few places. And when we iterate over hashmaps, the order is unspecified.
This can in turn result in changes in output as a result of changes in
the PubGrubPackage definition, purely as a function of its changing
hash. This is confusing as there should be no semantic difference.
Thus, this is a precursor to introducing some more determinism to places
I found in the error reporting whose output depending on hashmap
iteration order.
It looks like the last vestiges of `Derivative` were removed in commit
7eaed07f6c, but the then rendered
superfluous `derive(Derivative)` wasn't removed.
This is split out from workspaces support, which needs editables in the
bluejay commands. It consists mainly of refactorings:
* Move the `editable` module one level up.
* Introduce a `BuiltEditableMetadata` type for `(LocalEditable,
Metadata23, Requirements)`.
* Add editables to `InstalledPackagesProvider` so we can use
`EmptyInstalledPackages` for them.
## Summary
The main motivation here is that the `.filename()` method that we
implement on `Url` will do URL decoding for the last segment, which we
were missing here.
The errors are a bit awkward, because in
`crates/uv-resolver/src/lock.rs`, we wrap in `failed to extract filename
from URL: {url}`, so in theory we want the underlying errors to _omit_
the URL? But sometimes they use `#[error(transparent)]`?
## Summary
Uncertain about this, but we don't actually need the full
`SourceDistFilename`, only the name and version -- and we often have
that information already (as in the lockfile routines). So by flattening
the fields onto `RegistrySourceDist`, we can avoid re-parsing for
information we already have.
## Summary
This PR adds lossless deserialization for `GitSourceDist` distributions
in the lockfile. Specifically, we now properly preserve the requested
revision, the subdirectory, and the precise Git commit SHA.
## Test Plan
`cargo test`