## Summary
So the error here is:
```rust
ExtractError("cpython-3.11.11%2B20241206-aarch64-apple-darwin-install_only_stripped.tar.gz", Io(Custom { kind: UnexpectedEof, error: TarError { desc: "failed to unpack `/Users/crmarsh/.local/share/uv/python/.cache/.tmpkqFzqE/python/lib/libpython3.11.dylib`", io: Custom { kind: UnexpectedEof, error: TarError { desc: "failed to unpack `python/lib/libpython3.11.dylib` into `/Users/crmarsh/.local/share/uv/python/.cache/.tmpkqFzqE/python/lib/libpython3.11.dylib`", io: Custom { kind: UnexpectedEof, error: "unexpected end of file" } } } } }))
```
This isn't a Reqwest error, so we miss it in
`is_extended_transient_error`.
We could add `TarError` or `ExtractError` here, but... should we? This
PR just extends it to any error that has an IO source. I don't see much
of a downside.
Closes https://github.com/astral-sh/uv/issues/9747.
## Test Plan
First, ran: `uv run ./scripts/create-python-mirror.py --name cpython
--arch aarch64 --os darwin`.
Then, dropped this into `./scripts/mirror/server.py`:
```python
import os
import random
from http.server import SimpleHTTPRequestHandler, HTTPServer
class GlitchyStaticServer(SimpleHTTPRequestHandler):
def do_GET(self):
"""Handle GET request."""
file_path = self.translate_path(self.path)
if not os.path.exists(file_path):
self.send_error(404, "File not found")
return
try:
with open(file_path, 'rb') as f:
file_content = f.read()
# Introduce an "unexpected end of file" glitch randomly
if random.random() < 0.75: # 75% chance of glitch
glitch_point = random.randint(1, len(file_content) - 1)
file_content = file_content[:glitch_point]
self.send_response(200)
self.send_header("Content-type", self.guess_type(file_path))
self.send_header("Content-Length", len(file_content))
self.end_headers()
self.wfile.write(file_content)
except Exception as e:
self.send_error(500, f"Internal Server Error: {e}")
def run(server_class=HTTPServer, handler_class=GlitchyStaticServer, port=8080):
"""Run the server."""
server_address = ('', port)
httpd = server_class(server_address, handler_class)
print(f"Serving on port {port} with glitchy behavior")
httpd.serve_forever()
if __name__ == "__main__":
run()
```
Then ran `python server.py` from that directory.
From there, ran `UV_PYTHON_INSTALL_MIRROR="http://localhost:8080" cargo
run python install 3.11 --reinstall --verbose` to reliably test retries.
## Summary
A lot of good new lints, and most importantly, error stabilizations. I
tried to find a few usages of the new stabilizations, but I'm sure there
are more.
IIUC, this _does_ require bumping our MSRV.
## Summary
It turns out that `WrappedReqwestError` skips the `reqwest::Error`
itself in order to hack the display. This PR adds it to the list of
variants we check when retrying transient errors.
Closes https://github.com/astral-sh/uv/issues/9246.
## Test Plan
Patched `reqwest` locally to return an error in `bytes()`. Verified that
it was _not_ caught prior to this PR, but was caught afterwards.
## Summary
The reqwest middleware doesn't retry errors that occur "after" the
request completes -- but in some cases, these do include spurious errors
that we want to retry. See https://github.com/astral-sh/uv/issues/8144
for examples. This PR adds a second retry layer during the response
_handler_, which should help with some of the spurious failures we see
in the linked issue.
Closes https://github.com/astral-sh/uv/issues/8144.
## Summary
These were moved as part of a broader refactor to create a single
integration test module. That "single integration test module" did
indeed have a big impact on compile times, which is great! But we aren't
seeing any benefit from moving these tests into their own files (despite
the claim in [this blog
post](https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html),
I see the same compilation pattern regardless of where the tests are
located). Plus, we don't have many of these, and same-file tests is such
a strong Rust convention.
This PR is a revival of #3502, albeit in a much simpler form.
This would allow for different middlewares like authentication and such,
useful for if you want to deviate from the keychain authentication
methods when using uv as a library.
@zanieb I hope I made the changes as you noted you wanted to see them :)
Happy to add/change anything you need.
This still utilizes the RFC 2822 datetime formatter, but utilizes new
methods [added in jiff 0.1.14] to emit timestamps in a format strictly
compatible with RFC 9110.
It seems like most HTTP servers were pretty flexible and supported RFC
2822 datetime formats, but #8747 shows at least one case where that
isn't true. Given that the [MDN docs prescribe RFC 9110], we defer to
them.
Fixes#8747
[added in jiff 0.1.14]: https://github.com/BurntSushi/jiff/pull/154
[MDN docs prescribe RFC 9110]:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Modified-Since
## Summary
This PR adds a first-class API for defining registry indexes, beyond our
existing `--index-url` and `--extra-index-url` setup.
Specifically, you now define indexes like so in a `uv.toml` or
`pyproject.toml` file:
```toml
[[tool.uv.index]]
name = "pytorch"
url = "https://download.pytorch.org/whl/cu121"
```
You can also provide indexes via `--index` and `UV_INDEX`, and override
the default index with `--default-index` and `UV_DEFAULT_INDEX`.
### Index priority
Indexes are prioritized in the order in which they're defined, such that
the first-defined index has highest priority.
Indexes are also inherited from parent configuration (e.g., the
user-level `uv.toml`), but are placed after any indexes in the current
project, matching our semantics for other array-based configuration
values.
You can mix `--index` and `--default-index` with the legacy
`--index-url` and `--extra-index-url` settings; the latter two are
merely treated as unnamed `[[tool.uv.index]]` entries.
### Index pinning
If an index includes a name (which is optional), it can then be
referenced via `tool.uv.sources`:
```toml
[[tool.uv.index]]
name = "pytorch"
url = "https://download.pytorch.org/whl/cu121"
[tool.uv.sources]
torch = { index = "pytorch" }
```
If an index is marked as `explicit = true`, it can _only_ be used via
such references, and will never be searched implicitly:
```toml
[[tool.uv.index]]
name = "pytorch"
url = "https://download.pytorch.org/whl/cu121"
explicit = true
[tool.uv.sources]
torch = { index = "pytorch" }
```
Indexes defined outside of the current project (e.g., in the user-level
`uv.toml`) can _not_ be explicitly selected.
(As of now, we only support using a single index for a given
`tool.uv.sources` definition.)
### Default index
By default, we include PyPI as the default index. This remains true even
if the user defines a `[[tool.uv.index]]` -- PyPI is still used as a
fallback. You can mark an index as `default = true` to (1) disable the
use of PyPI, and (2) bump it to the bottom of the prioritized list, such
that it's used only if a package does not exist on a prior index:
```toml
[[tool.uv.index]]
name = "pytorch"
url = "https://download.pytorch.org/whl/cu121"
default = true
```
### Name reuse
If a name is reused, the higher-priority index with that name is used,
while the lower-priority indexes are ignored entirely.
For example, given:
```toml
[[tool.uv.index]]
name = "pytorch"
url = "https://download.pytorch.org/whl/cu121"
[[tool.uv.index]]
name = "pytorch"
url = "https://test.pypi.org/simple"
```
The `https://test.pypi.org/simple` index would be ignored entirely,
since it's lower-priority than `https://download.pytorch.org/whl/cu121`
but shares the same name.
Closes#171.
## Future work
- Users should be able to provide authentication for named indexes via
environment variables.
- `uv add` should automatically write `--index` entries to the
`pyproject.toml` file.
- Users should be able to provide multiple indexes for a given package,
stratified by platform:
```toml
[tool.uv.sources]
torch = [
{ index = "cpu", markers = "sys_platform == 'darwin'" },
{ index = "gpu", markers = "sys_platform != 'darwin'" },
]
```
- Users should be able to specify a proxy URL for a given index, to
avoid writing user-specific URLs to a lockfile:
```toml
[[tool.uv.index]]
name = "test"
url = "https://private.org/simple"
proxy = "http://<omitted>/pypi/simple"
```
## Summary
This PR declares and documents all environment variables that are used
in one way or another in `uv`, either internally, or externally, or
transitively under a common struct.
I think over time as uv has grown there's been many environment
variables introduced. Its harder to know which ones exists, which ones
are missing, what they're used for, or where are they used across the
code. The docs only documents a handful of them, for others you'd have
to dive into the code and inspect across crates to know which crates
they're used on or where they're relevant.
This PR is a starting attempt to unify them, make it easier to discover
which ones we have, and maybe unlock future posibilities in automating
generating documentation for them.
I think we can split out into multiple structs later to better organize,
but given the high influx of PR's and possibly new environment variables
introduced/re-used, it would be hard to try to organize them all now
into their proper namespaced struct while this is all happening given
merge conflicts and/or keeping up to date.
I don't think this has any impact on performance as they all should
still be inlined, although it may affect local build times on changes to
the environment vars as more crates would likely need a rebuild. Lastly,
some of them are declared but not used in the code, for example those in
`build.rs`. I left them declared because I still think it's useful to at
least have a reference.
Did I miss any? Are their initial docs cohesive?
Note, `uv-static` is a terrible name for a new crate, thoughts? Others
considered `uv-vars`, `uv-consts`.
## Test Plan
Existing tests
As per
https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html
Before that, there were 91 separate integration tests binary.
(As discussed on Discord — I've done the `uv` crate, there's still a few
more commits coming before this is mergeable, and I want to see how it
performs in CI and locally).
## Summary
Random, but I noticed that we can remove a ton of serialize and
deserialize derives by using `rkyv` for the flat-index caches. (We
already use `rkyv` for these same structs in the registry cache.)
Recently, rkyv 0.8 was released. Its API is a fair bit simpler now for
higher level uses (like for us in `uv`) and results in us being able to
delete a fair bit of code. This also removes our last dependency on `syn
1.0`, and thus drops that dependency.
Performance (via testing on the `transformers` example) seems to remain
about the same, which is what was expected:
```
$ hyperfine -w5 -r100 'uv lock' 'uv-ag-rkyv-update lock'
Benchmark 1: uv lock
Time (mean ± σ): 55.6 ms ± 6.4 ms [User: 30.4 ms, System: 35.1 ms]
Range (min … max): 43.0 ms … 73.1 ms 100 runs
Benchmark 2: uv-ag-rkyv-update lock
Time (mean ± σ): 56.5 ms ± 7.2 ms [User: 30.5 ms, System: 36.3 ms]
Range (min … max): 39.1 ms … 71.5 ms 100 runs
Summary
uv lock ran
1.02 ± 0.18 times faster than uv-ag-rkyv-update lock
```
Closes#7415
This is preparatory work for the upload functionality, which needs to
read the METADATA file and attach its parsed contents to the POST
request: We move finding the `.dist-info` from `install-wheel-rs` and
`uv-client` to a new `uv-metadata` crate, so it can be shared with the
publish crate.
I don't properly know if its the right place since the upload code isn't
ready, but i'm PR-ing it now because it already had merge conflicts.
## Summary
We now track the discovered `IndexCapabilities` for each `IndexUrl`. If
we learn that an index doesn't support range requests, we avoid doing
any batch prefetching.
Closes https://github.com/astral-sh/uv/issues/7221.
## Summary
Http headers are supposed to be case-insensitive (RFC 2616), but there
are some implementations that don't normalize them.
I noticed it while migrating to `uv`, calls to an internal registry
failed. A man in the middle server helped me to find that `pip` uses
Title-Case while `uv pip` uses lowercase.
## Test Plan
I tested `uv` with the same server and now it works fine.
Forward an error for missing temp directories:
```
$ env TMPDIR=.tmp uv-debug pip install httpx
error: No such file or directory (os error 2) at path "/home/konsti/projects/uv/.tmp/.tmpgIBhhh"
```
Fixes#6878
## Summary
This PR revives https://github.com/astral-sh/uv/pull/4944, which I think
was a good start towards adding `--trusted-host`. Last night, I tried to
add `--trusted-host` with a custom verifier, but we had to vendor a lot
of `reqwest` code and I eventually hit some private APIs. I'm not
confident that I can implement it correctly with that mechanism, and
since this is security, correctness is the priority.
So, instead, we now use two clients and multiplex between them.
Closes https://github.com/astral-sh/uv/issues/1339.
## Test Plan
Created self-signed certificate, and ran `python3 -m http.server --bind
127.0.0.1 4443 --directory . --certfile cert.pem --keyfile key.pem` from
the packse index directory.
Verified that `cargo run pip install
transitive-yanked-and-unyanked-dependency-a-0abad3b6 --index-url
https://127.0.0.1:8443/simple-html` failed with:
```
error: Request failed after 3 retries
Caused by: error sending request for url (https://127.0.0.1:8443/simple-html/transitive-yanked-and-unyanked-dependency-a-0abad3b6/)
Caused by: client error (Connect)
Caused by: invalid peer certificate: Other(OtherError(CaUsedAsEndEntity))
```
Verified that `cargo run pip install
transitive-yanked-and-unyanked-dependency-a-0abad3b6 --index-url
'https://127.0.0.1:8443/simple-html' --trusted-host '127.0.0.1:8443'`
failed with the expected error (invalid resolution) and made valid
requests.
Verified that `cargo run pip install
transitive-yanked-and-unyanked-dependency-a-0abad3b6 --index-url
'https://127.0.0.1:8443/simple-html' --trusted-host '127.0.0.2' -n` also
failed.
## Summary
This reverts commit 7d92915f3d.
I thought this would be a net performance improvement, but we've now had
multiple reports that this made locking _extremely_ slow. I also tested
this today with a very large codebase against a registry that does not
support range requests, and the number of downloads was sort of wild to
watch. Reverting the reduced resolution time by over 50%.
Closes#6104.
This PR migrates uv's use of `chrono` to `jiff`.
I did most of this work a while back as one of my tests to ensure Jiff
could actually be used in a real world project. I decided to revive
this because I noticed that `reqwest-retry` dropped its Chrono
dependency,
which is I believe the only other thing requiring Chrono in uv.
(Although, we use a fork of `reqwest-middleware` at present, and that
hasn't been updated to latest upstream yet. I wasn't quite sure of the
process we have for that.)
In course of doing this, I actually made two changes to uv:
First is that the lock file now writes an RFC 3339 timestamp for
`exclude-newer`. Previously, we were using Chrono's `Display`
implementation for this which is a non-standard but "human readable"
format. I think the right thing to do here is an RFC 3339 timestamp.
Second is that, in addition to an RFC 3339 timestamp, `--exclude-newer`
used to accept a "UTC date." But this PR changes it to a "local date."
That is, a date in the user's system configured time zone. I think
this makes more sense than a UTC date, but one alternative is to drop
support for a date and just rely on an RFC 3339 timestamp. The main
motivation here is that automatically assuming UTC is often somewhat
confusing, since just writing an unqualified date like `2024-08-19` is
often assumed to be interpreted relative to the writer's "local" time.
## Summary
The strategy here is: if the user provides supported environments, we
use those as the initial forks when resolving. As a result, we never add
or explore branches that are disjoint with the supported environments.
(If the supported environments change, we ignore the lockfile entirely,
so we don't have to worry about any interactions between supported
environments and the preference forks.)
Closes https://github.com/astral-sh/uv/issues/6184.
Surprisingly, this is a lockfile schema change: We can't store relative
paths in urls, so we have to store a `filename` entry instead of the
whole url.
Fixes#4355
## Summary
This PR adds a `DistExtension` field to some of our distribution types,
which requires that we validate that the file type is known and
supported when parsing (rather than when attempting to unzip). It
removes a bunch of extension parsing from the code too, in favor of
doing it once upfront.
Closes https://github.com/astral-sh/uv/issues/5858.
The comment in the code explains the bulk of this:
```rust
// We previously computed this heuristic freshness lifetime by
// looking at the difference between the last modified header and
// the response's date header. We then asserted that the cached
// response ought to be "fresh" for 10% of that interval.
//
// It turns out that this can result in very long freshness
// lifetimes[1] that lead to uv caching too aggressively.
//
// Since PyPI sets a max-age of 600 seconds and since we're
// principally just interacting with Python package indices here,
// we just assume a freshness lifetime equal to what PyPI has.
//
// Note though that a better solution here is for the index to
// support proper HTTP caching headers (ideally Cache-Control, but
// Expires also works too, as above).
```
We also remove the `heuristic_percent` field on `CacheConfig`. Since
that's actually part of the cache itself, we bump the simple cache
version.
Finally, we add some more `trace!` calls that should hopefully make
diagnosing issues related to the freshness lifetime a bit easier in the
future.
Fixes#5351