In #13302, there was an IO error without context. The error seems to be
caused by a failing symlink operation. Switching the symlinking calls to
`fs_err` ensures these errors will carry context in the future.
Part of #11834
Currently, all Python installations are a streaming download-and-extract.
With this PR, we add the `UV_PYTHON_CACHE_DIR` variable. When set, the
installation is split into two steps: downloading the interpreter
archive into `UV_PYTHON_CACHE_DIR`, then extracting it from there. If
the archive is already present in `UV_PYTHON_CACHE_DIR`, we skip the
download.
The feature can be used to speed up tests and CI. Locally for me, `cargo
test -p uv -- python_install` goes from 43s to 7s (1.7s in release mode)
when setting `UV_PYTHON_CACHE_DIR`. It can also be used for offline
installation of Python interpreters: copy the archives to a directory on
the offline machine, and the path rewriting is still performed on the
target machine at installation time.
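As a rough illustration of the reuse check this enables (the helper and
archive name below are hypothetical, not uv's actual API): if
`UV_PYTHON_CACHE_DIR` is set and already holds the archive, we can skip
straight to extraction.
```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical helper: return the cached archive path if `UV_PYTHON_CACHE_DIR`
/// is set and the archive has already been downloaded into it.
fn cached_archive(archive_name: &str) -> Option<PathBuf> {
    let cache_dir = env::var_os("UV_PYTHON_CACHE_DIR").map(PathBuf::from)?;
    let candidate = cache_dir.join(archive_name);
    candidate.is_file().then_some(candidate)
}

fn main() {
    // Illustrative archive name only.
    match cached_archive("cpython-3.12.6-x86_64-unknown-linux-gnu.tar.zst") {
        Some(path) => println!("reusing cached archive at {}", path.display()),
        None => println!("archive not cached, downloading"),
    }
}
```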
## Summary
I don't know if I actually want to commit this, but I did it on the
plane last time and just polished it off (got it to compile) while
waiting to board.
## Summary
This ended up being more involved than expected. The gist is that we
set up all the packages we want to reinstall upfront (they're passed in
on the command-line); but at that point, we don't have names for all the
packages that the user has specified. (Consider, e.g., `uv pip install
.` -- we don't have a name for `.`, so we can't add it to the list of
`Reinstall` packages.)
Now, `Reinstall` also accepts paths, so we can augment `Reinstall` based
on the user-provided paths.
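A simplified, hypothetical sketch of that shape (uv's real `Reinstall`
type differs): the selection carries the user-provided paths alongside
the package names, so unnamed requirements like `.` can still be matched.
```rust
use std::path::{Path, PathBuf};

/// Hypothetical, simplified reinstall selection: named packages are known
/// upfront, while unnamed requirements (e.g. `.`) are tracked by path.
enum Reinstall {
    None,
    All,
    Packages {
        names: Vec<String>,
        paths: Vec<PathBuf>,
    },
}

impl Reinstall {
    /// Whether a requirement given as a local path should be reinstalled.
    fn contains_path(&self, path: &Path) -> bool {
        match self {
            Reinstall::None => false,
            Reinstall::All => true,
            Reinstall::Packages { paths, .. } => paths.iter().any(|p| p.as_path() == path),
        }
    }
}
```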
Closes #12038.
## Summary
Closes #2410
This changes the names of files in the `wheels` bucket to use a hash
instead of the wheel name, so as not to exceed the maximum filename
length limit on various systems.
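For illustration, the renaming boils down to keying the cache entry on a
fixed-length digest of the wheel filename rather than the filename
itself (std's `DefaultHasher` is used here for brevity; uv's actual hash
and encoding may differ):
```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative only: derive a short, fixed-length cache entry name from a
/// wheel filename so the on-disk name never exceeds filesystem limits.
fn cache_entry_name(wheel_filename: &str) -> String {
    let mut hasher = DefaultHasher::new();
    wheel_filename.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

fn main() {
    let name = cache_entry_name(
        "sqlalchemy-2.0.30-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    );
    assert_eq!(name.len(), 16);
    println!("{name}");
}
```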
This only addresses the primary concern of #2410. It still does _not_
address:
- The 260-character path limit on Windows:
https://github.com/astral-sh/uv/issues/2410#issuecomment-2062020882
To solve this we need to opt in to longer path limits on Windows
([ref](https://github.com/astral-sh/uv/issues/2410#issuecomment-2150532658)),
but I think that is a separate issue and should be a separate PR.
- Exceeding the filename limit while building a wheel from a source
distribution.
As per my understanding, this is out of uv's control: the name of the
output wheel is decided by the build backend used by the project. For
wheels built from source distributions, pip also uses the wheel names in
its cache, so I have not touched the `sdists` cache.
I have added a `filename: WheelFileName` field to `Archive`, so we can
use it while indexing instead of relying on the filename on disk.
Another way to do this would have been to read `.dist-info/WHEEL` and
`.dist-info/METADATA` and build the `WheelFileName` from those, but that
seems less robust and would be slower.
## Test Plan
Tested by installing `yt-dlp`, `httpie`, and `sqlalchemy` and verifying
that the cache files in the `wheels` bucket use hashes.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
* Upgrade the Rust toolchain to 1.85.0. This does not increase the MSRV.
* Update the Windows trampoline to the 1.86 nightly beta (previously on
the 1.85 nightly beta).
## Test Plan
Existing tests
Revert #11601 for now
We run Python interpreter discovery with `-I` (#2500), which means these
environment variables are ignored when determining `sys.path`. Unless we
decide to remove the `-I` flag from the `sys.path` query, we shouldn't
release these changes to interpreter discovery caching.
We want to use `sys.path` for package discovery (#2500, #9849). For
that, we need to know the correct value of `sys.path`. `sys.path` is a
runtime-changeable value influenced by a lot of different sources:
environment variables, CLI arguments, `.pth` files with scripting,
`sys.path.append()` at runtime, a distributor patching Python, etc. We
cannot capture them all accurately, especially since it's possible to
change `sys.path` mid-execution. Instead, we make a best-effort attempt
at matching the user's expectation.
The assumption is that package installation generally happens in venv
site-packages, system/user site-packages (including PyPy, which ships
packages alongside its stdlib), and `PYTHONPATH`. Specifically, we reuse
`PYTHONPATH` as a dedicated way for users to tell uv to include specific
directories in package discovery.
A common way to influence `sys.path` without using venvs is setting
`PYTHONPATH`. To support this, we're capturing `PYTHONPATH` as part of
cache invalidation, i.e., we refresh the interpreter metadata if it
changed. For completeness, we're also capturing other environment
variables documented as influencing `sys.path` or other fields in the
interpreter info.
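A sketch of the invalidation idea, with hypothetical names (uv's
interpreter-info cache stores more than this): the tracked environment
variables are captured when the metadata is written, and the entry is
refreshed if any of them changed.
```rust
use std::env;

// Example of variables documented as influencing `sys.path`; the real
// list in uv may differ.
const TRACKED_VARS: &[&str] = &["PYTHONPATH", "PYTHONHOME", "PYTHONSAFEPATH"];

fn captured_env() -> Vec<(String, Option<String>)> {
    TRACKED_VARS
        .iter()
        .map(|name| (name.to_string(), env::var(name).ok()))
        .collect()
}

/// The cached interpreter info is reused only if the captured environment
/// still matches the current one.
fn is_fresh(cached_env: &[(String, Option<String>)]) -> bool {
    cached_env == captured_env().as_slice()
}
```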
This PR does not include reading registry values for `sys.path`
additions on Windows, as documented in
https://docs.python.org/3.11/using/windows.html#finding-modules. It
notably also does not include parsing of Python CLI arguments; we only
consider their environment variable equivalents for package installation
and listing. We could try parsing CLI flags in `uv run python`, but we'd
still miss them when Python is launched indirectly through a script, and
it's more consistent to only consider uv's own arguments and environment
variables, similar to uv's behavior in other places.
Instead of using junctions, we can just write files that contain (as the
file contents) the target path. This requires a little more finesse in
that, as readers, we need to know where to expect these files. But it
also means we get to avoid junctions, which have led to a variety of
confusing behaviors. Further, `replace_symlink` should now be atomic on
Windows.
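A minimal sketch of the pointer-file scheme (names are hypothetical, and
the real implementation also takes care of atomic replacement):
```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// The "link" is a plain file whose contents are the target path.
fn write_pointer_file(link: &Path, target: &Path) -> io::Result<()> {
    // A write-to-temp-then-rename would make this atomic; elided for brevity.
    fs::write(link, target.to_string_lossy().as_bytes())
}

/// Readers resolve the pointer explicitly instead of relying on the
/// filesystem to follow a junction or symlink.
fn read_pointer_file(link: &Path) -> io::Result<PathBuf> {
    Ok(PathBuf::from(fs::read_to_string(link)?))
}
```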
Closes #11263.
We've never bumped the version of this bucket, and we may never do so...
But it's still incorrect for us to omit it from these serialized structs
in the cache. Specifically, these structs include a pointer into the
archive bucket (namely, the ID). But we don't include the bucket
version! So, in theory, we could end up pointing to archives that don't
match the current bucket version expected in the code.
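Conceptually, the fix is just to store the bucket version next to the
archive ID, along these lines (names and version value are illustrative):
```rust
/// Illustrative value; bumped whenever the archive bucket layout changes.
const ARCHIVE_BUCKET_VERSION: u32 = 0;

/// A cached pointer into the archive bucket records the bucket version it
/// was written against, so a later bump invalidates stale pointers instead
/// of letting them be dereferenced.
struct ArchivePointer {
    id: String,
    bucket_version: u32,
}

impl ArchivePointer {
    fn is_valid(&self) -> bool {
        self.bucket_version == ARCHIVE_BUCKET_VERSION
    }
}
```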
## Summary
Today, scripts use `CachedEnvironment`, which results in a different
virtual environment path every time the interpreter changes _or_ the
project requirements change. This makes it impossible to provide users
with a stable path to the script's environment that they can use for
(e.g.) pointing their editor at it.
This PR modifies `uv run` to use a stable path for local scripts (we
continue to use `CachedEnvironment` for remote scripts and scripts from
`stdin`). The logic now looks a lot more like it does for projects: we
`get_or_init` an environment, etc.
For now, the path to the script's environment looks like
`environments-v1/4485801245a4732f`, where `4485801245a4732f` is a SHA of
the absolute path to the script. But I'm not picky on that :)
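Something along these lines, using the `sha2` and `hex` crates for the
sketch (the exact digest and truncation uv uses may differ):
```rust
use sha2::{Digest, Sha256};
use std::path::Path;

/// Derive a stable environment directory name from the script's absolute
/// path, e.g. `environments-v1/4485801245a4732f`.
fn script_environment_name(script_path: &Path) -> String {
    let digest = Sha256::digest(script_path.to_string_lossy().as_bytes());
    // Truncate the digest to keep the directory name short.
    format!("environments-v1/{}", hex::encode(&digest[..8]))
}
```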
## Summary
If we fail to deserialize cached metadata, we should just ignore it
rather than failing.
Ideally, this never happens; if it does, it means we missed a cache
version bump. But even then, it should be non-fatal.
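In other words, a deserialization failure is treated as a cache miss. A
sketch of the pattern (the struct and the msgpack codec here are
placeholders for whatever a given cache bucket actually stores):
```rust
use serde::Deserialize;

#[derive(Deserialize)]
struct CachedMetadata {
    // Placeholder fields; the real structs live in uv's cache layer.
    name: String,
    version: String,
}

fn read_cached_metadata(bytes: &[u8]) -> Option<CachedMetadata> {
    match rmp_serde::from_slice::<CachedMetadata>(bytes) {
        Ok(metadata) => Some(metadata),
        Err(err) => {
            // Most likely a missed cache version bump; fall back to refetching.
            tracing::warn!("Ignoring invalid cache entry: {err}");
            None
        }
    }
}
```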
Closes https://github.com/astral-sh/uv/issues/11043.
Closes https://github.com/astral-sh/uv/issues/11101.
## Test Plan
Prior to this PR, the following would fail:
- `uvx uv@0.5.25 venv --python 3.12 --cache-dir foo`
- `uvx uv@0.5.25 pip install ./scripts/packages/hatchling_dynamic
--no-deps --python 3.12 --cache-dir foo`
- `uvx uv@0.5.18 venv --python 3.12 --cache-dir foo`
- `uvx uv@0.5.18 pip install ./scripts/packages/hatchling_dynamic
--no-deps --python 3.12 --cache-dir foo`
We can't go back and fix 0.5.18, but this will prevent such regressions
in the future.
## Summary
On Windows, we have a lot of issues with atomic replacement and the
like. There are a bunch of different failure modes, but they generally
involve: trying to persist a file to a path at which the file already
exists, trying to replace or remove a file while someone else is reading
it, etc.
This PR adds locks to all of the relevant database paths. We already use
these advisory locks when building source distributions; now we use them
when unzipping wheels, storing metadata, etc.
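The pattern looks roughly like this, shown with the `fs2` crate for the
sketch (uv uses its own lock wrapper): take an exclusive advisory lock
on a sidecar file before mutating the cache entry.
```rust
use fs2::FileExt;
use std::fs::File;
use std::io;
use std::path::Path;

/// Run `f` while holding an exclusive advisory lock on `<entry>.lock`, so
/// concurrent uv processes don't race each other on the same cache entry.
fn with_cache_lock<T>(entry: &Path, f: impl FnOnce() -> io::Result<T>) -> io::Result<T> {
    let lock = File::create(entry.with_extension("lock"))?;
    lock.lock_exclusive()?;
    let result = f();
    lock.unlock()?;
    result
}
```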
Closes #11002.
## Test Plan
I ran the following script:
```shell
# Define the cache directory path
$cacheDir = "C:\Users\crmar\workspace\uv\cache"
# Clear the cache directory if it exists
if (Test-Path $cacheDir) {
    Remove-Item -Recurse -Force $cacheDir
}
# Create the cache directory again
New-Item -ItemType Directory -Force -Path $cacheDir
# Define the command to run with --cache-dir flag
$command = {
    param ($venvPath)
    # Create a virtual environment at the specified path
    uv venv $venvPath
    # Run the pip install command with the --cache-dir flag
    C:\Users\crmar\workspace\uv\target\profiling\uv.exe pip install flask==1.0.4 --no-binary flask --cache-dir C:\Users\crmar\workspace\uv\cache -v --python $venvPath
}
# Define the paths for the different virtual environments
$venv1 = "C:\Users\crmar\workspace\uv\venv1"
$venv2 = "C:\Users\crmar\workspace\uv\venv2"
$venv3 = "C:\Users\crmar\workspace\uv\venv3"
$venv4 = "C:\Users\crmar\workspace\uv\venv4"
$venv5 = "C:\Users\crmar\workspace\uv\venv5"
# Start the command in parallel five times using Start-Job, each with a different venv
$job1 = Start-Job -ScriptBlock $command -ArgumentList $venv1
$job2 = Start-Job -ScriptBlock $command -ArgumentList $venv2
$job3 = Start-Job -ScriptBlock $command -ArgumentList $venv3
$job4 = Start-Job -ScriptBlock $command -ArgumentList $venv4
$job5 = Start-Job -ScriptBlock $command -ArgumentList $venv5
# Wait for all jobs to complete
$jobs = @($job1, $job2, $job3, $job4, $job5)
$jobs | ForEach-Object { Wait-Job $_ }
# Retrieve the results (optional)
$jobs | ForEach-Object { Receive-Job -Job $_ }
# Clean up the jobs
$jobs | ForEach-Object { Remove-Job -Job $_ }
```
And ensured it succeeded in five straight invocations (whereas on
`main`, it consistently fails with a variety of different traces).
## Summary
This PR extends the thinking in #10525 to platform tags, and then uses
the structured tag enums everywhere, rather than passing around strings.
I think this is a big improvement! It means we're no longer doing ad hoc
tag parsing all over the place.
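For a sense of what that buys us (purely illustrative, not uv's actual
definitions): a platform tag becomes an enum with a fallible parser, so
malformed tags are rejected once, at parse time.
```rust
use std::str::FromStr;

#[derive(Debug, PartialEq, Eq)]
enum PlatformTag {
    Any,
    Win32,
    WinAmd64,
    // ... manylinux, musllinux, and macosx variants elided
}

impl FromStr for PlatformTag {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "any" => Ok(Self::Any),
            "win32" => Ok(Self::Win32),
            "win_amd64" => Ok(Self::WinAmd64),
            other => Err(format!("unknown platform tag: {other}")),
        }
    }
}
```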
## Summary
Going forward, we're going to provide better versioning guarantees
around using the same cache across multiple uv versions, so this PR
updates the docs to reflect that. It also bumps the `sdists-` version to
fix the inconvenience demonstrated in
https://github.com/astral-sh/uv/issues/8367.
Closes https://github.com/astral-sh/uv/issues/8367.
When building a source distribution into a wheel, we perform the build
inside a temporary directory inside the output directory. By default,
the output directory is `dist/` in the repository root. This temp dir
placement allows us to move the final wheel to the output directory
instead of copying it (a temp dir elsewhere might be on another device,
which would force a copy instead of a move).
Some build backends such as hatchling traverse upwards from the current
directory (the source dist build location) looking for gitignore files
to consider. By adding a gitignore in `dist/` with `*`, we caused
hatchling to ignore all files in our temporary build directory below it,
causing empty wheels. To prevent this, we add a `.git` file as a phony
git root. We are already using this trick successfully in the cache.
Hatchling sees this `.git` file, considers it a boundary and does not
traverse up to `dist/.gitignore`.
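The trick itself is tiny; a sketch (with a hypothetical helper name):
```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write an empty `.git` file into the temporary build directory so that
/// hatchling treats it as a repository boundary and never reaches the `*`
/// gitignore in `dist/` above it.
fn add_phony_git_root(build_dir: &Path) -> io::Result<()> {
    fs::write(build_dir.join(".git"), b"")
}
```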
Fixes #8200
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Random, but I noticed that we can remove a ton of serialize and
deserialize derives by using `rkyv` for the flat-index caches. (We
already use `rkyv` for these same structs in the registry cache.)
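That is, the cached flat-index structs only need the rkyv derives (field
names below are illustrative, not the actual definitions):
```rust
use rkyv::{Archive, Deserialize, Serialize};

/// With rkyv handling serialization and zero-copy access, the parallel set
/// of serde derives on these cache types can be dropped.
#[derive(Archive, Serialize, Deserialize)]
struct FlatIndexEntry {
    filename: String,
    url: String,
}
```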
## Summary
`uv cache prune --ci` will remove the source distribution directory. If
we then need to build a _different_ wheel (e.g., you're building a
package that has Python minor version-specific wheels), we fail, because
we expect the source to be there.
Now, if the source is missing, we re-download it. It would be slightly
easier to just _ignore_ that revision, but that would mean we'd also
lose the already-built wheels -- so if you ran against many Python
versions, we'd continuously lose the cached data.
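Schematically (with hypothetical helpers), the change amounts to:
```rust
use std::io;
use std::path::{Path, PathBuf};

/// If the unpacked source was pruned but the cached revision (and its built
/// wheels) survives, fetch the source again instead of discarding the
/// revision.
fn ensure_source(
    source_dir: &Path,
    refetch: impl FnOnce() -> io::Result<PathBuf>,
) -> io::Result<PathBuf> {
    if source_dir.is_dir() {
        Ok(source_dir.to_path_buf())
    } else {
        refetch()
    }
}
```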
Closes https://github.com/astral-sh/uv/issues/7543.
## Test Plan
We can add tests, but they _need_ to build non-pure Python wheels, which
tends to be expensive...
For reference:
```console
$ cargo run venv --python 3.12
$ cargo run pip install mercurial==6.8.1 --verbose
$ cargo run cache prune --ci
$ cargo run venv --python 3.11
$ cargo run pip install mercurial==6.8.1 --verbose
```
I also did this with a local `.tar.gz` that I downloaded from PyPI.
## Summary
Both of these can contain rkyv data in their HTTP cache envelopes. As
such, the entries aren't readable by earlier versions of uv, and `uv
cache prune` can break. I should make `uv cache prune` robust to this,
but this feels safest.
Recently, rkyv 0.8 was released. Its API is a fair bit simpler now for
higher-level uses (like ours in uv) and lets us delete a fair bit of
code. This also removes our last dependency on `syn 1.0`, dropping it
from the dependency tree.
Performance (via testing on the `transformers` example) seems to remain
about the same, which is what was expected:
```
$ hyperfine -w5 -r100 'uv lock' 'uv-ag-rkyv-update lock'
Benchmark 1: uv lock
Time (mean ± σ): 55.6 ms ± 6.4 ms [User: 30.4 ms, System: 35.1 ms]
Range (min … max): 43.0 ms … 73.1 ms 100 runs
Benchmark 2: uv-ag-rkyv-update lock
Time (mean ± σ): 56.5 ms ± 7.2 ms [User: 30.5 ms, System: 36.3 ms]
Range (min … max): 39.1 ms … 71.5 ms 100 runs
Summary
uv lock ran
1.02 ± 0.18 times faster than uv-ag-rkyv-update lock
```
Closes #7415
## Summary
It's very unlikely that retaining these is beneficial, since you tend to
partition the cache by platform anyway.
Closes https://github.com/astral-sh/uv/issues/7394.
## Summary
This PR adds a more flexible cache invalidation abstraction for uv, and
uses that new abstraction to improve support for dynamic metadata.
Specifically, instead of relying solely on a timestamp, we now pass
around a `CacheInfo` struct which (as of now) contains
`Option<Timestamp>` and `Option<Commit>`. The `CacheInfo` is saved in
`dist-info` as `uv_cache.json`, so we can test already-installed
distributions for cache validity (along with testing _cached_
distributions for cache validity).
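The struct described above looks roughly like the following (simplified
sketch assuming serde's derive feature; the real `Timestamp` and
`Commit` types are uv-internal):
```rust
use serde::{Deserialize, Serialize};

/// Simplified sketch of the cache-invalidation data saved as `uv_cache.json`.
#[derive(Debug, Default, PartialEq, Eq, Serialize, Deserialize)]
struct CacheInfo {
    /// Modification time of the relevant source files, if tracked.
    timestamp: Option<u128>,
    /// Git commit the metadata was computed at, if tracked.
    commit: Option<String>,
}
```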
Beyond the defaults (`pyproject.toml`, `setup.py`, and `setup.cfg`
changes), users can also specify additional cache keys, and it's easy
for us to extend support in the future. Right now, cache keys can either
be instructions to include the current commit (for `setuptools_scm` and
similar) or file paths (for `hatch-requirements-txt` and similar):
```toml
[tool.uv]
cache-keys = [{ file = "requirements.txt" }, { git = true }]
```
This change should be fully backwards compatible.
Closes https://github.com/astral-sh/uv/issues/6964.
Closes https://github.com/astral-sh/uv/issues/6255.
Closes https://github.com/astral-sh/uv/issues/6860.
## Summary
This has bothered me for a while and should be fairly impactful for
users. It requires a somewhat weird implementation: the
distribution-building crate depends on the cache crate, so the prune
operation can't live in the cache crate; it needs access to internals of
the distribution-building crate.
Closes https://github.com/astral-sh/uv/issues/7096.
As suggested by @samypr100 on #6680:
https://github.com/astral-sh/uv/pull/6680#issuecomment-2313607984
## Summary
Instead of using `UV_INTERNAL__TEST_DIR`, it simply exports `TEMP` when
running Windows jobs.
## Test Plan
I'm going to run this manually under ProcMon on my Windows machine and
see where uv writes temp files, hopefully to the dev drive and not
`%(LOCAL)APPDATA%` or something.
I'm going to commit a dummy code change and look at build time changes
in CI.
## Summary
This PR makes `cargo test | windows` faster in CI.
### Before

### After

## Also
This PR disables the `brotli` feature of `async-compression` since it's
not strictly needed, but this has little to do with the improvements
(it's still less code to build).
This PR introduces additional code in `uv tool uninstall` to ignore
errors (that only seem to happen on ReFS, i.e., on Dev Drives) akin to
"the thing we're trying to delete cannot be deleted because it's already
being deleted".
If `raw_os_error` were stable, we could do u32 matching instead of that
`.to_string().contains()` abomination.
## Summary
Passing `--upgrade` to `tool run` is confusing, because it doesn't
upgrade the installed tool. It just causes us to use an isolated tool
environment, which seems wrong.
## Summary
This PR adds a `DistExtension` field to some of our distribution types,
which requires that we validate that the file type is known and
supported when parsing (rather than when attempting to unzip). It
removes a bunch of extension parsing from the code too, in favor of
doing it once upfront.
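Illustratively (not uv's actual definition), the extension becomes an
enum whose parser rejects unsupported file types up front:
```rust
use std::path::Path;

#[derive(Debug, Clone, Copy)]
enum DistExtension {
    Whl,
    TarGz,
    Zip,
    // ... other supported source-distribution extensions elided
}

impl DistExtension {
    /// Validate the file type as part of parsing, not when unpacking later.
    fn from_path(path: &Path) -> Result<Self, String> {
        let name = path.file_name().and_then(|name| name.to_str()).unwrap_or("");
        if name.ends_with(".whl") {
            Ok(Self::Whl)
        } else if name.ends_with(".tar.gz") {
            Ok(Self::TarGz)
        } else if name.ends_with(".zip") {
            Ok(Self::Zip)
        } else {
            Err(format!("unsupported distribution extension: {name}"))
        }
    }
}
```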
Closes https://github.com/astral-sh/uv/issues/5858.
## Summary
I think this seems reasonable... Otherwise, we might not go back to PyPI
to revalidate the list of available versions despite the user passing
`--upgrade`.
The comment in the code explains the bulk of this:
```rust
// We previously computed this heuristic freshness lifetime by
// looking at the difference between the last modified header and
// the response's date header. We then asserted that the cached
// response ought to be "fresh" for 10% of that interval.
//
// It turns out that this can result in very long freshness
// lifetimes[1] that lead to uv caching too aggressively.
//
// Since PyPI sets a max-age of 600 seconds and since we're
// principally just interacting with Python package indices here,
// we just assume a freshness lifetime equal to what PyPI has.
//
// Note though that a better solution here is for the index to
// support proper HTTP caching headers (ideally Cache-Control, but
// Expires also works too, as above).
```
We also remove the `heuristic_percent` field on `CacheConfig`. Since
that's actually part of the cache itself, we bump the simple cache
version.
Finally, we add some more `trace!` calls that should hopefully make
diagnosing issues related to the freshness lifetime a bit easier in the
future.
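The effective policy change boils down to a single assumed lifetime
(sketch; the real code wires this into the HTTP cache policy):
```rust
use std::time::Duration;

// Absent explicit Cache-Control/Expires headers, assume the same freshness
// lifetime PyPI advertises (max-age=600).
const DEFAULT_HEURISTIC_FRESHNESS: Duration = Duration::from_secs(600);
```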
Fixes #5351
## Summary
Users can now run `uv cache prune --ci` (open to feedback on the name of
that flag) to remove all pre-built wheels from the cache, leaving behind
zipped, built wheels (which tend to be the most expensive assets to
re-create). This should greatly increase cache performance in CI
environments, since uploading unzipped wheels can actually hurt
performance if you're persisting the uv cache.
Closes https://github.com/astral-sh/uv/issues/5282.
This name should lead to less confusion. Unfortunately, this is a
"breaking cache change", so everyone's cache will be invalidated. I'm
not sure if we should support a rename-on-upgrade.
edit: We can make the breaking change the next time we bump the version.