Another split out from https://github.com/astral-sh/uv/pull/3263. This
abstracts the copy&pasted check whether an installed distribution
satisfies a requirement used by both plan.rs and site_packages.rs into a
shared module. It's less useful here than with the new requirement but
helps with reducing https://github.com/astral-sh/uv/pull/3263 diff size.
## Summary
It turns out that `normalize_path` (sourced from Cargo) has a subtle
bug. If you pass it a relative path that traverses beyond the root, it
silently drops components. So, e.g., passing `../foo/bar`, it will just
drop the leading `..` and return `foo/bar`.
This PR encodes that behavior as a `Result` and avoids using it in such
cases.
Closes https://github.com/astral-sh/uv/issues/3012.
## Summary
This PR adds support for hash-checking mode in `pip install` and `pip
sync`. It's a large change, both in terms of the size of the diff and
the modifications in behavior, but it's also one that's hard to merge in
pieces (at least, with any test coverage) since it needs to work
end-to-end to be useful and testable.
Here are some of the most important highlights:
- We store hashes in the cache. Where we previously stored pointers to
unzipped wheels in the `archives` directory, we now store pointers with
a set of known hashes. So every pointer to an unzipped wheel also
includes its known hashes.
- By default, we don't compute any hashes. If the user runs with
`--require-hashes`, and the cache doesn't contain those hashes, we
invalidate the cache, redownload the wheel, and compute the hashes as we
go. For users that don't run with `--require-hashes`, there will be no
change in performance. For users that _do_, the only change will be if
they don't run with `--generate-hashes` -- then they may see some
repeated work between resolution and installation, if they use `pip
compile` then `pip sync`.
- Many of the distribution types now include a `hashes` field, like
`CachedDist` and `LocalWheel`.
- Our behavior is similar to pip, in that we enforce hashes when pulling
any remote distributions, and when pulling from our own cache. Like pip,
though, we _don't_ enforce hashes if a distribution is _already_
installed.
- Hash validity is enforced in a few different places:
1. During resolution, we enforce hash validity based on the hashes
reported by the registry. If we need to access a source distribution,
though, we then enforce hash validity at that point too, prior to
running any untrusted code. (This is enforced in the distribution
database.)
2. In the install plan, we _only_ add cached distributions that have
matching hashes. If a cached distribution is missing any hashes, or the
hashes don't match, we don't return them from the install plan.
3. In the downloader, we _only_ return distributions with matching
hashes.
4. The final combination of "things we install" are: (1) the wheels from
the cache, and (2) the downloaded wheels. So this ensures that we never
install any mismatching distributions.
- Like pip, if `--require-hashes` is provided, we require that _all_
distributions are pinned with either `==` or a direct URL. We also
require that _all_ distributions have hashes.
There are a few notable TODOs:
- We don't support hash-checking mode for unnamed requirements. These
should be _somewhat_ rare, though? Since `pip compile` never outputs
unnamed requirements. I can fix this, it's just some additional work.
- We don't automatically enable `--require-hashes` with a hash exists in
the requirements file. We require `--require-hashes`.
Closes#474.
## Test Plan
I'd like to add some tests for registries that report incorrect hashes,
but otherwise: `cargo test`
## Summary
This PR enables the resolver to "accept" URLs, prereleases, and local
version specifiers for direct dependencies of path dependencies. As a
result, `uv pip install .` and `uv pip install -e .` now behave
identically, in that neither has a restriction on URL dependencies and
the like.
Closes https://github.com/astral-sh/uv/issues/2643.
Closes https://github.com/astral-sh/uv/issues/1853.
These are useful for converting lower level marker values types
to their corresponding values from a marker environment.
We'll use these for generating marker expressions based on both
the dependency graph and the current marker environment.
## Summary
Now that we're resolving metadata more aggressively for local sources,
it's worth doing this. We now pull metadata from the `pyproject.toml`
directly if it's statically-defined.
Closes https://github.com/astral-sh/uv/issues/2629.
## Summary
When a user passes a `pyproject.toml` to `pip compile` (e.g., `uv pip
compile pyproject.toml`), we extract the requirements from the
`pyproject.toml` directly. However... that isn't always possible (as
seen in the linked issues). When it's _not_, we instead need to run the
PEP 517 build hooks to identify the metadata.
Closes https://github.com/astral-sh/uv/issues/1624.
Closes https://github.com/astral-sh/uv/issues/1644.
## Test Plan
`cargo test`
## Summary
`uv` was failing to install requirements defined like:
```
file://localhost/Users/crmarsh/Downloads/iniconfig-2.0.0-py3-none-any.whl
```
Closes https://github.com/astral-sh/uv/issues/2652.
## Summary
This PR ensures that if a package is already satisfied by the current
environment, we don't bother resolving the named requirement.
Part of: https://github.com/astral-sh/uv/issues/313.
## Test Plan
- `cargo run pip install ./scripts/editable-installs/black_editable`
- `cargo run pip install black --verbose`
## Summary
First piece of https://github.com/astral-sh/uv/issues/313. In order to
support unnamed requirements, we need to be able to parse them in
`requirements-txt`, which in turn means that we need to introduce a new
type that's distinct from `pep508::Requirement`, given that these
_aren't_ PEP 508-compatible requirements.
Part of: https://github.com/astral-sh/uv/issues/313.
Scott schafer got me the idea: We can avoid repeating the path for
workspaces dependencies everywhere if we declare them in the virtual
package once and treat them as workspace dependencies from there on.
This PR handles the fragment part of the URL path.
It achieves this by splitting the fragment from the path before
normalization and parsing. It then sets the fragment back after the URL
has been parsed.
Resolve#2501
## Summary
This PR adds limited support for PEP 440-compatible local version
testing. Our behavior is _not_ comprehensively in-line with the spec.
However, it does fix by _far_ the biggest practical limitation, and
resolves all the issues that've been raised on uv related to local
versions without introducing much complexity into the resolver, so it
feels like a good tradeoff for me.
I'll summarize the change here, but for more context, see [Andrew's
write-up](https://github.com/astral-sh/uv/issues/1855#issuecomment-1967024866)
in the linked issue.
Local version identifiers are really tricky because of asymmetry.
`==1.2.3` should allow `1.2.3+foo`, but `==1.2.3+foo` should not allow
`1.2.3`. It's very hard to map them to PubGrub, because PubGrub doesn't
think of things in terms of individual specifiers (unlike the PEP 440
spec) -- it only thinks in terms of ranges.
Right now, resolving PyTorch and friends fails, because...
- The user provides requirements like `torch==2.0.0+cu118` and
`torchvision==0.15.1+cu118`.
- We then match those exact versions.
- We then look at the requirements of `torchvision==0.15.1+cu118`, which
includes `torch==2.0.0`.
- Under PEP 440, this is fine, because `torch @ 2.0.0+cu118` should be
compatible with `torch==2.0.0`.
- In our model, though, it's not, because these are different versions.
If we change our comparison logic in various places to allow this, we
risk breaking some fundamental assumptions of PubGrub around version
continuity.
- Thus, we fail to resolve, because we can't accept both `torch @ 2.0.0`
and `torch @ 2.0.0+cu118`.
As compared to the solutions we explored in
https://github.com/astral-sh/uv/issues/1855#issuecomment-1967024866, at
a high level, this approach differs in that we lie about the
_dependencies_ of packages that rely on our local-version-using package,
rather than lying about the versions that exist, or the version we're
returning, etc.
In short:
- When users specify local versions upfront, we keep track of them. So,
above, we'd take note of `torch` and `torchvision`.
- When we convert the dependencies of a package to PubGrub ranges, we
check if the requirement matches `torch` or `torchvision`. If it's
an`==`, we check if it matches (in the above example) for
`torch==2.0.0`. If so, we _change_ the requirement to
`torch==2.0.0+cu118`. (If it's `==` some other version, we return an
incompatibility.)
In other words, we selectively override the declared dependencies by
making them _more specific_ if a compatible local version was specified
upfront.
The net effect here is that the motivating PyTorch resolutions all work.
And, in general, transitive local versions work as expected.
The thing that still _doesn't_ work is: imagine if there were _only_
local versions of `torch` available. Like, `torch @ 2.0.0` didn't exist,
but `torch @ 2.0.0+cpu` did, and `torch @ 2.0.0+gpu` did, and so on.
`pip install torch==2.0.0` would arbitrarily choose one one `2.0.0+cpu`
or `2.0.0+gpu`, and that's correct as per PEP 440 (local version
segments should be completely ignored on `torch==2.0.0`). However, uv
would fail to identify a compatible version. I'd _probably_ prefer to
fix this, although candidly I think our behavior is _ok_ in practice,
and it's never been reported as an issue.
Closes https://github.com/astral-sh/uv/issues/1855.
Closes https://github.com/astral-sh/uv/issues/2080.
Closes https://github.com/astral-sh/uv/issues/2328.
## Summary
Per [PEP 508](https://peps.python.org/pep-0508/), `python_version` is
just major and minor:

Right now, we're using the provided version directly, so if it's, e.g.,
`-p 3.11.8`, we'll inject the wrong marker. This was causing `pandas` to
omit `numpy` when `-p 3.11.8` was provided, since its markers look like:
```
Requires-Dist: numpy<2,>=1.22.4; python_version < "3.11"
Requires-Dist: numpy<2,>=1.23.2; python_version == "3.11"
Requires-Dist: numpy<2,>=1.26.0; python_version >= "3.12"
```
Closes https://github.com/astral-sh/uv/issues/2392.
## Summary
This PR ensures that we expand environment variables _before_ sniffing
for the URL scheme (e.g., `file://` vs. `https://` vs. something else).
Closes https://github.com/astral-sh/uv/issues/2375.
## Summary
This is analogous to #669, but for cases in which the package name is a
filesystem path. In such cases, we'll fail when parsing the _package
name_, since it doesn't start with a valid character, as opposed to
failing when we go to parse the remaining version specifier.
Inspired by https://github.com/astral-sh/uv/issues/2356.
Fix parsing `pytest;'4.0'>=python_version`, where previously the
operator and the variable were incorrectly tokenized as one invalid
operator.
Fixes#2247
## Summary
This PR expands environment variables in `-r` and `-c` paths _within_
requirements files. We already do this for `@` URL references and
others.
Closes https://github.com/astral-sh/uv/issues/1473.
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Expose the uv_normalize types from pep508, so that these can be used
with a crates.io version.
---------
Co-authored-by: konsti <konstin@mailbox.org>
## Summary
We shouldn't be resolving symlinks on the provided interpreter;
otherwise we break `pyenv`, since running `cargo run pip install mypy
--python .venv/bin/python` will immediately resolve to (e.g.)
`/Users/crmarsh/.pyenv/versions/3.10.2/bin/python3.10`, and pyenv relies
on the path to do its lookups.
Instead, the canonicalizing happens when we query the interpreter
metadata.
Closes https://github.com/astral-sh/uv/issues/2068.
## Test Plan
Ran `cargo run pip install mypy --python .venv/bin/python -v -n` with a
virtualenv created using a pyenv Python; verified that Mypy was
installed into the virtual environment, rather than into the global
environment.
## Summary
This also preserves the environment variables in the output file, e.g.:
```
Resolved 1 package in 216ms
# This file was autogenerated by uv via the following command:
# uv pip compile requirements.in --emit-index-url
--index-url https://test.pypi.org/${SUFFIX}
requests==2.5.4.1
```
I'm torn on whether that's correct or undesirable here.
Closes#2035.
Address a few pedantic lints
lints are separated into separate commits so they can be reviewed
individually.
I've not added enforcement for any of these lints, but that could be
added if desirable.
## Summary
Like `platform.python_implementation`, we should support the
`python_implementation` "alias" for `platform_python_implementation`.
Closes https://github.com/astral-sh/uv/issues/1906.
## Test Plan
```shell
❯ cargo run pip install "pynacl==1.4.0"
Finished dev [unoptimized + debuginfo] target(s) in 7.02s
Running `target/debug/uv pip install pynacl==1.4.0`
Resolved 4 packages in 9ms
Built pynacl==1.4.0 Downloaded 1 package in 31.51s
Installed 4 packages in 3ms
+ cffi==1.16.0
+ pycparser==2.21
+ pynacl==1.4.0
+ six==1.16.0
```
PEP 508 requires a space between a URL and the semicolon separating it
from the markers to disambiguate it from a url ending with a semicolon.
This is easy to get wrong because the space is not required after a
plain name of PEP 440 specifier. The new error message explicitly points
out the missing space.
Fixes#1637
## Summary
The main change is that we need to have an explicit list of protocols we
_do_ support (like `https`), so that when we see a Windows absolute path
(`C:\...`), we don't treat the `C` as a protocol itself.
Closes https://github.com/astral-sh/uv/issues/1539.
First, replace all usages in files in-place. I used my editor for this.
If someone wants to add a one-liner that'd be fun.
Then, update directory and file names:
```
# Run twice for nested directories
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
# Update files
find . -type f -print0 | xargs -0 rename s/puffin/uv/g
```
Then add all the files again
```
# Add all the files again
git add crates
git add python/uv
# This one needs a force-add
git add -f crates/uv-trampoline
```
## Summary
See: https://github.com/astral-sh/puffin/issues/1181.
## Test Plan
```
❯ cargo run -- pip install packse@../../zanieb/packse
Finished dev [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/puffin pip install 'packse@../../zanieb/packse'`
error: Distribution not found at: file:///Users/crmarsh/zanieb/packse
```
## Summary
This PR adds support for `--find-links`, `--index-url`, and
`--extra-index-url` arguments when specified in a `requirements.txt`.
It's a mostly-straightforward change. The only uncertain piece is what
to do when multiple files include these flags, and/or when we include
them on the CLI and in other files.
In general:
- If _anything_ specifies `--no-index`, we respect it.
- We combine all `--extra-index-url` and `--find-links` across all
sources, since those are just vectors.
- If we see multiple `--index-url` in requirements files, we error.
- We respect the `--index-url` from the command line over any provided
in a requirements file.
(`pip-compile` seems to just pick one semi-arbitrarily when multiple are
provided.)
Closes https://github.com/astral-sh/puffin/issues/1143.
This PR adds initial support for [rkyv] to puffin. In particular,
the main aim here is to make puffin-client's `SimpleMetadata` type
possible to deserialize from a `&[u8]` without doing any copies. This
PR **stops short of actuallying doing that zero-copy deserialization**.
Instead, this PR is about adding the necessary trait impls to a variety
of types, along with a smattering of small refactorings to make rkyv
possible to use.
For those unfamiliar, rkyv works via the interplay of three traits:
`Archive`, `Serialize` and `Deserialize`. The usual flow of things is
this:
* Make a type `T` implement `Archive`, `Serialize` and `Deserialize`.
rkyv
helpfully provides `derive` macros to make this pretty painless in most
cases.
* The process of implementing `Archive` for `T` *usually* creates an
entirely
new distinct type within the same namespace. One can refer to this type
without naming it explicitly via `Archived<T>` (where `Archived` is a
clever
type alias defined by rkyv).
* Serialization happens from `T` to (conceptually) a `Vec<u8>`. The
serialization format is specifically designed to reflect the in-memory
layout
of `Archived<T>`. Notably, *not* `T`. But `Archived<T>`.
* One can then get an `Archived<T>` with no copying (albeit, we will
likely
need to incur some cost for validation) from the previously created
`&[u8]`.
This is quite literally [implemented as a pointer cast][rkyv-ptr-cast].
* The problem with an `Archived<T>` is that it isn't your `T`. It's
something
else. And while there is limited interoperability between a `T` and an
`Archived<T>`, the main issue is that the surrounding code generally
demands
a `T` and not an `Archived<T>`. **This is at the heart of the tension
for
introducing zero-copy deserialization, and this is mostly an intrinsic
problem to the technique and not an rkyv-specific issue.** For this
reason,
given an `Archived<T>`, one can get a `T` back via an explicit
deserialization step. This step is like any other kind of
deserialization,
although generally faster since no real "parsing" is required. But it
will
allocate and create all necessary objects.
This PR largely proceeds by deriving the three aforementioned traits
for `SimpleMetadata`. And, of course, all of its type dependencies. But
we stop there for now.
The main issue with carrying this work forward so that rkyv is actually
used to deserialize a `SimpleMetadata` is figuring out how to deal
with `DataWithCachePolicy` inside of the cached client. Ideally, this
type would itself have rkyv support, but adding it is difficult. The
main difficulty lay in the fact that its `CachePolicy` type is opaque,
not easily constructable and is internally the tip of the iceberg of
a rat's nest of types found in more crates such as `http`. While one
"dumb"-but-annoying approach would be to fork both of those crates
and add rkyv trait impls to all necessary types, it is my belief that
this is the wrong approach. What we'd *like* to do is not just use
rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to
get an `Archived<DataWithCachePolicy>` and make actual decisions used
the archived type directly. Doing that will require some work to make
`Archived<DataWithCachePolicy>` directly useful.
My suspicion is that, after doing the above, we may want to mush
forward with a similar approach for `SimpleMetadata`. That is, we want
`Archived<SimpleMetadata>` to be as useful as possible. But right
now, the structure of the code demands an eager conversion (and thus
deserialization) into a `SimpleMetadata` and then into a `VersionMap`.
Getting rid of that eagerness is, I think, the next step after dealing
with `DataWithCachePolicy` to unlock bigger wins here.
There are many commits in this PR, but most are tiny. I still encourage
review to happen commit-by-commit.
[rkyv]: https://rkyv.org/
[rkyv-ptr-cast]:
https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68
## Summary
First batch of changes for windows support. Notable changes:
* Fixes all compile errors and added windows specific paths.
* Working venv creation on windows, both from a base interpreter and
from a venv. This requires querying `stdlib` from the sysconfig paths to
find the launcher.
* Basic url/path conversion handling for windows.
* `if cfg!(...)` instead of `#[cfg()]`. This should make it easier to
keep everything compiling across platforms.
## Outlook
Test summary: 402 tests run: 299 passed (15 slow), 103 failed, 1 skipped
There are various reason for the remaining test failure:
* Windows-specific colorama and tzdata dependencies that change the
snapshot slightly. This is by far the biggest batch.
* Some url-path handling issues. I fixed some in the PR, some remain.
* Lack of the latest python patch versions for older pythons on my
machine, since there are no builds for windows and we need to register
them in the registry for them to be picked up for `py --list-paths` (CC
@zanieb RE #1070).
* Lack of entrypoint launchers.
* ... likely more
This PR attempts to fix a common footgun in `requirements.txt` files.
Previously, to provide a file, you had to use `package_name @
file:///Users/crmarsh/...` -- in other words, an absolute path.
Now, these requirements follow the exact same rules as editables, so you
can do:
```
package_name @ ./file.zip
```
And similar.
The way the parsing is setup, this is intentionally _not_ supported when
reading metadata -- only when parsing `requirements.txt` directly.
Closes#984.
## Summary
Based on user feedback. Calling it a "parse error" is misleading, since
this is really something we don't support, but that users can work
around.