freethreaded python reintroduces abiflags since it is incompatible with
regular native modules and abi3.
Tests: None yet! We're lacking cpython 3.13 no-gil builds we can use in
ci.
My test setup:
```
PYTHON_CONFIGURE_OPTS="--enable-shared --disable-gil" pyenv install 3.13.0a5
cargo run -q -- venv -q -p python3.13 .venv3.13 --no-cache-dir && cargo run -q -- pip install -v psutil --no-cache-dir && .venv3.13/bin/python -c "import psutil"
```
Fixes#2429
## Summary
Similar to `Revision`, we now store IDs in the `Archive` entires rather
than absolute paths. This makes the cache robust to moves, etc.
Closes https://github.com/astral-sh/uv/issues/2908.
## Summary
This PR formalizes some of the concepts we use in the cache for
"pointers to things".
In the wheel cache, we have files like
`annotated_types-0.6.0-py3-none-any.http`. This represents an unzipped
wheel, cached alongside an HTTP caching policy. We now have a struct for
this to encapsulate the logic: `HttpArchivePointer`.
Similarly, we have files like `annotated_types-0.6.0-py3-none-any.rev`.
This represents an unzipped local wheel, alongside with a timestamp. We
now have a struct for this to encapsulate the logic:
`LocalArchivePointer`.
We have similar structs for source distributions too.
## Summary
This PR adds support for hash-checking mode in `pip install` and `pip
sync`. It's a large change, both in terms of the size of the diff and
the modifications in behavior, but it's also one that's hard to merge in
pieces (at least, with any test coverage) since it needs to work
end-to-end to be useful and testable.
Here are some of the most important highlights:
- We store hashes in the cache. Where we previously stored pointers to
unzipped wheels in the `archives` directory, we now store pointers with
a set of known hashes. So every pointer to an unzipped wheel also
includes its known hashes.
- By default, we don't compute any hashes. If the user runs with
`--require-hashes`, and the cache doesn't contain those hashes, we
invalidate the cache, redownload the wheel, and compute the hashes as we
go. For users that don't run with `--require-hashes`, there will be no
change in performance. For users that _do_, the only change will be if
they don't run with `--generate-hashes` -- then they may see some
repeated work between resolution and installation, if they use `pip
compile` then `pip sync`.
- Many of the distribution types now include a `hashes` field, like
`CachedDist` and `LocalWheel`.
- Our behavior is similar to pip, in that we enforce hashes when pulling
any remote distributions, and when pulling from our own cache. Like pip,
though, we _don't_ enforce hashes if a distribution is _already_
installed.
- Hash validity is enforced in a few different places:
1. During resolution, we enforce hash validity based on the hashes
reported by the registry. If we need to access a source distribution,
though, we then enforce hash validity at that point too, prior to
running any untrusted code. (This is enforced in the distribution
database.)
2. In the install plan, we _only_ add cached distributions that have
matching hashes. If a cached distribution is missing any hashes, or the
hashes don't match, we don't return them from the install plan.
3. In the downloader, we _only_ return distributions with matching
hashes.
4. The final combination of "things we install" are: (1) the wheels from
the cache, and (2) the downloaded wheels. So this ensures that we never
install any mismatching distributions.
- Like pip, if `--require-hashes` is provided, we require that _all_
distributions are pinned with either `==` or a direct URL. We also
require that _all_ distributions have hashes.
There are a few notable TODOs:
- We don't support hash-checking mode for unnamed requirements. These
should be _somewhat_ rare, though? Since `pip compile` never outputs
unnamed requirements. I can fix this, it's just some additional work.
- We don't automatically enable `--require-hashes` with a hash exists in
the requirements file. We require `--require-hashes`.
Closes#474.
## Test Plan
I'd like to add some tests for registries that report incorrect hashes,
but otherwise: `cargo test`
## Summary
Right now, we have a `Hashes` representation that looks like:
```rust
/// A dictionary mapping a hash name to a hex encoded digest of the file.
///
/// PEP 691 says multiple hashes can be included and the interpretation is left to the client.
#[derive(Debug, Clone, Eq, PartialEq, Default, Deserialize)]
pub struct Hashes {
pub md5: Option<Box<str>>,
pub sha256: Option<Box<str>>,
pub sha384: Option<Box<str>>,
pub sha512: Option<Box<str>>,
}
```
It stems from the PyPI API, which returns a dictionary of hashes.
We tend to pass these around as a vector of `Vec<Hashes>`. But it's a
bit strange because each entry in that vector could contain multiple
hashes. And it makes it difficult to ask questions like "Is
`sha256:ab21378ca980a8` in the set of hashes"?
This PR instead treats `Hashes` as the PyPI-internal type, and uses a
new `Vec<HashDigest>` everywhere in our own APIs.
## Summary
Right now, the path-based wheel cache just looks at the symlink to the
archives directory, checks the timestamp on it, and continues with that
symlink as long as the timestamp is up-to-date.
The HTTP-based wheel meanwhile, uses an intermediary `.http` file, which
includes the HTTP caching information. The `.http` file's payload is
just a path pointing to an entry in the archives directory.
This PR modifies the path-based codepaths to use a similar cache file,
which stores a timestamp along with a path to the archives directory.
The main advantage here is that we can add other data to this cache file
(namely, hashes in the future).
## Test Plan
Beyond existing tests, I also verified that this doesn't require a
version bump:
```
git checkout main
cargo run pip install ~/Downloads/zeal-0.0.1-py3-none-any.whl --cache-dir baz --reinstall
git checkout charlie/manifest
cargo run pip install ~/Downloads/zeal-0.0.1-py3-none-any.whl --cache-dir baz --reinstall
cargo run pip install ~/Downloads/zeal-0.0.1-py3-none-any.whl --cache-dir baz --reinstall --refresh
```
It turns out that #2712 did _not_ fix#2711. After I put up #2712, I
started trying to track down the specific change that caused the
failure. I had assumed at first that it was related to one of our `rkyv`
types, but it actually ended up being one of our msgpack caches. I think
the failure mode is still fundamentally the same idea: the cached data
changed in a way that is still valid msgpack, but got interpreted
differently after deserializing.
The specific change that caused this was the [removal of a field] from
our
metadata type.
Ideally we would just undo the change and add the field back. But that
change has already been shipped out to users. So I believe the only
plausible choice at this point is to bump the `built-wheels` cache. This
will unfortunately mean that `uv` will need to re-build wheels.
Fixes#2711
[removal of a field]:
365c292525 (diff-e42586829f9c2cdbb909bedc5cf95691cc415247f2cbc2ebeb80d887020457bbL29)
It seems likely that we forgot to bump the version of the "simple" cache
in the 0.1.25 release. I'm still working on confirming it, but I figured
I'd get this bump up first.
The main problem here is that our "simple" cache is represented by
`rkyv`, and that in turn is tightly coupled to the representation of a
selection of data types in `uv`. Changing those data types without
bumping the cache version can result in cache deserialization errors
like this, or in the worst case, silent logic errors.
One possibility here is that the representation changed in a way that
permitted it to pass `rkyv` validation, but changed how the data itself
is interpreted. Our cache is robust with respect to `rkyv` validation
(if it fails, the cache will invalidate the entry and self-heal), but
being robust to higher level logical errors in interpretation of the
data is a much more significant challenge. Our best bet there is perhaps
some kind of checksum that we could do on top of `rkyv` validation (or
instead of it), and thus convert silent logical changes in how the data
is interpreted into failure modes that we're already robust to.
Fixes#2711
We put a `.gitignore` with `*` at the top of our cache. When maturin was
building a source distribution inside the cache, it would walk up the
tree to find a gitignore, see that and ignore all python files. We now
add an (empty) `.git` directory one directory below, in the root of
built-wheels cache. This prevents ignore walking further up (it marks
the top level a git repository).
Deptry (from #2490) is a mid sized rust package with additional python
packages, so instead of using it in the test i've replaced it with a
small (44KB total) reproducer that uses cffi for faster building, the
entire test taking <2s on my machine.
Fixes#2490
## Summary
This PR enables the source distribution database to be used with unnamed
requirements (i.e., URLs without a package name). The (significant)
upside here is that we can now use PEP 517 hooks to resolve unnamed
requirement metadata _and_ reuse any computation in the cache.
The changes to `crates/uv-distribution/src/source/mod.rs` are quite
extensive, but mostly mechanical. The core idea is that we introduce a
new `BuildableSource` abstraction, which can either be a distribution,
or an unnamed URL:
```rust
/// A reference to a source that can be built into a built distribution.
///
/// This can either be a distribution (e.g., a package on a registry) or a direct URL.
///
/// Distributions can _also_ point to URLs in lieu of a registry; however, the primary distinction
/// here is that a distribution will always include a package name, while a URL will not.
#[derive(Debug, Clone, Copy)]
pub enum BuildableSource<'a> {
Dist(&'a SourceDist),
Url(SourceUrl<'a>),
}
```
All the methods on the source distribution database now accept
`BuildableSource`. `BuildableSource` has a `name()` method, but it
returns `Option<&PackageName>`, and everything is required to work with
and without a package name.
The main drawback of this approach (which isn't a terrible one) is that
we can no longer include the package name in the cache. (We do continue
to use the package name for registry-based distributions, since those
always have a name.). The package name was included in the cache route
for two reasons: (1) it's nice for debugging; and (2) we use it to power
`uv cache clean flask`, to identify the entries that are relevant for
Flask.
To solve this, I changed the `uv cache clean` code to look one level
deeper. So, when we want to determine whether to remove the cache entry
for a given URL, we now look into the directory to see if there are any
wheels that match the package name. This isn't as nice, but it does work
(and we have test coverage for it -- all passing).
I also considered removing the package name from the cache routes for
non-registry _wheels_, for consistency... But, it would require a cache
bump, and it didn't feel important enough to merit that.
## Summary
Detects unused cache entries, which can come in a few forms:
1. Directories that are out-dated via our versioning scheme.
2. Old source distribution builds (i.e., we have a more recent version).
3. Old wheels (stored in `archive-v0`, but not symlinked-to from
anywhere in the cache).
Closes https://github.com/astral-sh/puffin/issues/1059.
Scott schafer got me the idea: We can avoid repeating the path for
workspaces dependencies everywhere if we declare them in the virtual
package once and treat them as workspace dependencies from there on.
## Summary
I tried out `cargo shear` to see if there are any unused dependencies
that `cargo udeps` isn't reporting. It turned out, there are a few. This
PR removes those dependencies.
## Test Plan
`cargo build`
## Summary
This PR attempts to use a similar trick to that we added in
https://github.com/astral-sh/uv/pull/1878, but for post-releases.
In https://github.com/astral-sh/uv/pull/1878, we added a fake "minimum"
version to enable us to treat `< 1.0.0` as _excluding_ pre-releases of
1.0.0.
Today, on `main`, we accept post-releases and local versions in `>
1.0.0`. But per PEP 440, that should _exclude_ post-releases and local
versions, unless the specifier is itself a pre-release, in which case,
pre-releases are allowed (e.g., `> 1.0.0.post0` should allow `>
1.0.0.post1`).
To support this, we add a fake "maximum" version that's greater than all
the post and local releases for a given version. This leverages our last
remaining free bit in the compact representation.
## Summary
This may be required elsewhere, but all the traces in that issue are
related to persisting the temporary directory to our persistent cache,
so lets start there.
See: https://github.com/astral-sh/uv/issues/1491.
## Summary
Internal-only refactor to consolidate multiple codepaths we have for
checking whether a cached or installed entry is up-to-date with a local
requirement.
## Summary
This also preserves the environment variables in the output file, e.g.:
```
Resolved 1 package in 216ms
# This file was autogenerated by uv via the following command:
# uv pip compile requirements.in --emit-index-url
--index-url https://test.pypi.org/${SUFFIX}
requests==2.5.4.1
```
I'm torn on whether that's correct or undesirable here.
Closes#2035.
Address a few pedantic lints
lints are separated into separate commits so they can be reviewed
individually.
I've not added enforcement for any of these lints, but that could be
added if desirable.
## Summary
Instead of looking at _either_ `pyproject.toml` or `setup.py`, we should
just be conservative and take the most-recent timestamp out of
`pyproject.toml`, `setup.py`, and `setup.cfg`. That will help prevent
staleness issues like those described in
https://github.com/astral-sh/uv/issues/1913#issuecomment-1961544084.
## Summary
Even when pre-releases are "allowed", per PEP 440, `pydantic<2.0.0`
should _not_ include pre-releases. This PR modifies the specifier
translation to treat `pydantic<2.0.0` as `pydantic<2.0.0.min0`, where
`min` is an internal-only version segment that's invisible to users.
Closes https://github.com/astral-sh/uv/issues/1641.
## Summary
If a `pyproject.toml` or similar is changed within an editable, we
should avoid passing our audit check (and thus re-install the package).
Closes https://github.com/astral-sh/uv/issues/1913.
A couple moons ago, I introduced an optimization for version comparisons
by devising a format where *most* versions would be represented by a
single `u64`. This in turn meant most comparisons (of which many are
done during resolution) would be extremely cheap.
Unfortunately, when I did that, I screwed up the preservation of
ordering as defined by the [Version Specifiers spec]. I think I messed
it up because I had originally devised the representation so that we
could pack things like `1.2.3.dev1.post5`, but later realized it would
be better to limit ourselves to a single suffix. However, I never
updated the binary encoding to better match "up to 4 release versions
and up to precisely 1 suffix." Because of that, there were cases where
versions weren't ordered correctly. For example, this fixes a bug where
`1.0a2 < 1.0dev2`, even though all dev releases should order before
pre-releases.
We also update a test so that it catches these kinds of bugs in the
future. (By testing all pairs of versions in a sequence instead of just
the adjacent versions.)
[Version Specifiers spec]:
https://packaging.python.org/en/latest/specifications/version-specifiers/#summary-of-permitted-suffixes-and-relative-ordering
## Summary
Some packages encode logic to embed the current commit SHA in the
version tag, when built within a Git repo. This typically results in an
invalid (non-compliant) version. Here's an example from `pylzma`:
ccb0e7cff3/version.py (L45).
This PR adds a phony, empty `.git` to the cache root, to ensure that any
`git` commands fail.
Closes https://github.com/astral-sh/uv/issues/1768.
## Test Plan
- Create a tag on the current commit, like `v0.5.0`.
- Build `pylzma`, using a cache _within_ the repo:
```
rm -rf foo
cargo run venv
cargo run pip install "pylzma @ 10ef072c3c/pylzma-0.5.0.tar.gz" --verbose --cache-dir bar
```
It's a little picky about the value, but that seems okay.
```
❯ ./target/debug/uv pip install trio
Audited 1 package in 4ms
❯ UV_NO_CACHE=true ./target/debug/uv pip install trio
Audited 1 package in 50ms
```
Closes#1382
First, replace all usages in files in-place. I used my editor for this.
If someone wants to add a one-liner that'd be fun.
Then, update directory and file names:
```
# Run twice for nested directories
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
# Update files
find . -type f -print0 | xargs -0 rename s/puffin/uv/g
```
Then add all the files again
```
# Add all the files again
git add crates
git add python/uv
# This one needs a force-add
git add -f crates/uv-trampoline
```