Since we're now using read timeouts and not total timeouts, we can use a
lower threshold, a single read shouldn't take 5 min (and not even 10s).
The 10s value is somewhat arbitrary.
Like #3144, this is a breaking change in some sense.
## Summary
This leverages the new `read_timeout` property, which ensures that (like
pip) our timeout is not applied to the _entire_ request, but rather, to
each individual read operation.
Closes: #1921.
See: #1912.
## Summary
This PR adds support for hash-checking mode in `pip install` and `pip
sync`. It's a large change, both in terms of the size of the diff and
the modifications in behavior, but it's also one that's hard to merge in
pieces (at least, with any test coverage) since it needs to work
end-to-end to be useful and testable.
Here are some of the most important highlights:
- We store hashes in the cache. Where we previously stored pointers to
unzipped wheels in the `archives` directory, we now store pointers with
a set of known hashes. So every pointer to an unzipped wheel also
includes its known hashes.
- By default, we don't compute any hashes. If the user runs with
`--require-hashes`, and the cache doesn't contain those hashes, we
invalidate the cache, redownload the wheel, and compute the hashes as we
go. For users that don't run with `--require-hashes`, there will be no
change in performance. For users that _do_, the only change will be if
they don't run with `--generate-hashes` -- then they may see some
repeated work between resolution and installation, if they use `pip
compile` then `pip sync`.
- Many of the distribution types now include a `hashes` field, like
`CachedDist` and `LocalWheel`.
- Our behavior is similar to pip, in that we enforce hashes when pulling
any remote distributions, and when pulling from our own cache. Like pip,
though, we _don't_ enforce hashes if a distribution is _already_
installed.
- Hash validity is enforced in a few different places:
1. During resolution, we enforce hash validity based on the hashes
reported by the registry. If we need to access a source distribution,
though, we then enforce hash validity at that point too, prior to
running any untrusted code. (This is enforced in the distribution
database.)
2. In the install plan, we _only_ add cached distributions that have
matching hashes. If a cached distribution is missing any hashes, or the
hashes don't match, we don't return them from the install plan.
3. In the downloader, we _only_ return distributions with matching
hashes.
4. The final combination of "things we install" are: (1) the wheels from
the cache, and (2) the downloaded wheels. So this ensures that we never
install any mismatching distributions.
- Like pip, if `--require-hashes` is provided, we require that _all_
distributions are pinned with either `==` or a direct URL. We also
require that _all_ distributions have hashes.
There are a few notable TODOs:
- We don't support hash-checking mode for unnamed requirements. These
should be _somewhat_ rare, though? Since `pip compile` never outputs
unnamed requirements. I can fix this, it's just some additional work.
- We don't automatically enable `--require-hashes` with a hash exists in
the requirements file. We require `--require-hashes`.
Closes#474.
## Test Plan
I'd like to add some tests for registries that report incorrect hashes,
but otherwise: `cargo test`
## Summary
This lets us remove circular dependencies (in the future, e.g., #2945)
that arise from `FlatIndex` needing a bunch of resolver-specific
abstractions (like incompatibilities, required hashes, etc.) that aren't
necessary to _fetch_ the flat index entries.
## Summary
Right now, we have a `Hashes` representation that looks like:
```rust
/// A dictionary mapping a hash name to a hex encoded digest of the file.
///
/// PEP 691 says multiple hashes can be included and the interpretation is left to the client.
#[derive(Debug, Clone, Eq, PartialEq, Default, Deserialize)]
pub struct Hashes {
pub md5: Option<Box<str>>,
pub sha256: Option<Box<str>>,
pub sha384: Option<Box<str>>,
pub sha512: Option<Box<str>>,
}
```
It stems from the PyPI API, which returns a dictionary of hashes.
We tend to pass these around as a vector of `Vec<Hashes>`. But it's a
bit strange because each entry in that vector could contain multiple
hashes. And it makes it difficult to ask questions like "Is
`sha256:ab21378ca980a8` in the set of hashes"?
This PR instead treats `Hashes` as the PyPI-internal type, and uses a
new `Vec<HashDigest>` everywhere in our own APIs.
Needed to prevent circular dependencies in my toolchain work (#2931). I
think this is probably a reasonable change as we move towards persistent
configuration too?
Unfortunately `BuildIsolation` needs to be in `uv-types` to avoid
circular dependencies still. We might be able to resolve that in the
future.
## Summary
In working on `--require-hashes`, I noticed that we're missing some
incompatibility tracking for `--find-links` distributions. Specifically,
we don't respect `--no-build` or `--no-binary`, so if we select a wheel
due to `--find-links`, we then throw a hard error when trying to build
it later (if `--no-binary` is provided), rather than selecting the
source distribution instead.
Closes https://github.com/astral-sh/uv/issues/2827.
## Summary
Upgrading `rs-async-zip` enables us to support data descriptors in
streaming. This both greatly improves performance for indexes that use
data descriptors _and_ ensures that we support them in a few other
places (e.g., zipped source distributions created in Finder).
Closes#2808.
## Summary
This partially revives https://github.com/astral-sh/uv/pull/2135 (with
some modifications) to enable users to opt-in to looking for packages
across multiple indexes.
The behavior is such that, in version selection, we take _any_
compatible version from a "higher-priority" index over the compatible
versions of a "lower-priority" index, even if that means we might accept
an "older" version.
Closes https://github.com/astral-sh/uv/issues/2775.
## Summary
In `pip sync`, we weren't properly handling cases in which a package
_only_ existed in `--find-links` (e.g., the user passed `--offline` or
`--no-index`).
I plan to explore removing `Finder` entirely to avoid these mismatch
bugs between `pip sync` and other commands, but this is fine for now.
Closes https://github.com/astral-sh/uv/issues/2688.
## Test Plan
`cargo test`
## Summary
This PR enables the source distribution database to be used with unnamed
requirements (i.e., URLs without a package name). The (significant)
upside here is that we can now use PEP 517 hooks to resolve unnamed
requirement metadata _and_ reuse any computation in the cache.
The changes to `crates/uv-distribution/src/source/mod.rs` are quite
extensive, but mostly mechanical. The core idea is that we introduce a
new `BuildableSource` abstraction, which can either be a distribution,
or an unnamed URL:
```rust
/// A reference to a source that can be built into a built distribution.
///
/// This can either be a distribution (e.g., a package on a registry) or a direct URL.
///
/// Distributions can _also_ point to URLs in lieu of a registry; however, the primary distinction
/// here is that a distribution will always include a package name, while a URL will not.
#[derive(Debug, Clone, Copy)]
pub enum BuildableSource<'a> {
Dist(&'a SourceDist),
Url(SourceUrl<'a>),
}
```
All the methods on the source distribution database now accept
`BuildableSource`. `BuildableSource` has a `name()` method, but it
returns `Option<&PackageName>`, and everything is required to work with
and without a package name.
The main drawback of this approach (which isn't a terrible one) is that
we can no longer include the package name in the cache. (We do continue
to use the package name for registry-based distributions, since those
always have a name.). The package name was included in the cache route
for two reasons: (1) it's nice for debugging; and (2) we use it to power
`uv cache clean flask`, to identify the entries that are relevant for
Flask.
To solve this, I changed the `uv cache clean` code to look one level
deeper. So, when we want to determine whether to remove the cache entry
for a given URL, we now look into the directory to see if there are any
wheels that match the package name. This isn't as nice, but it does work
(and we have test coverage for it -- all passing).
I also considered removing the package name from the cache routes for
non-registry _wheels_, for consistency... But, it would require a cache
bump, and it didn't feel important enough to merit that.
## Summary
When a user runs with `--output-file` and `--generate-hashes`, we should
_only_ update the hashes if the pinned version itself changes.
Closes https://github.com/astral-sh/uv/issues/1530.
## Summary
Closes#1958
This adds linehaul metadata to uv's user-agent when pep 508 markers are
provided to the RegistryClientBuilder. Thanks to #2381, we were able to
leverage most information from markers and avoid inconsistency.
Linehaul is meant to be accompanying metadata pip sends in it's user
agent when talking to registries. You can see this output by running
something like `python -c 'from pip._internal.network.session import
user_agent; print(user_agent())'`.
In PyPI, this metadata processed by the
[linehaul-cloud-function](https://github.com/pypi/linehaul-cloud-function).
More info about linehaul can be found in #1958.
Below are some examples from pip:
* Linux GHA: `pip/24.0
{"ci":true,"cpu":"x86_64","distro":{"id":"jammy","libc":{"lib":"glibc","version":"2.35"},"name":"Ubuntu","version":"22.04"},"implementation":{"name":"CPython","version":"3.12.2"},"installer":{"name":"pip","version":"24.0"},"openssl_version":"OpenSSL
3.0.2 15 Mar
2022","python":"3.12.2","rustc_version":"1.76.0","system":{"name":"Linux","release":"6.5.0-1016-azure"}}`
* Windows GHA: `pip/24.0
{"ci":true,"cpu":"AMD64","implementation":{"name":"CPython","version":"3.12.2"},"installer":{"name":"pip","version":"24.0"},"openssl_version":"OpenSSL
3.0.13 30 Jan
2024","python":"3.12.2","rustc_version":"1.76.0","system":{"name":"Windows","release":"2022Server"}}`
* OSX GHA: `pip/24.0
{"ci":true,"cpu":"arm64","distro":{"name":"macOS","version":"14.2.1"},"implementation":{"name":"CPython","version":"3.12.2"},"installer":{"name":"pip","version":"24.0"},"openssl_version":"OpenSSL
3.0.13 30 Jan
2024","python":"3.12.2","rustc_version":"1.76.0","system":{"name":"Darwin","release":"23.2.0"}}`
Here's how uv results look like (sorry for the keys not having the same
order):
* Linux GHA: `uv/0.1.21
{"installer":{"name":"uv","version":"0.1.21"},"python":"3.12.2","implementation":{"name":"CPython","version":"3.12.2"},"distro":{"name":"Ubuntu","version":"22.04","id":"jammy","libc":null},"system":{"name":"Linux","release":"6.5.0-1016-azure"},"cpu":"x86_64","openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}`
* Windows GHA: `uv/0.1.21
{"installer":{"name":"uv","version":"0.1.21"},"python":"3.12.2","implementation":{"name":"CPython","version":"3.12.2"},"distro":null,"system":{"name":"Windows","release":"2022Server"},"cpu":"AMD64","openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}`
* OSX GHA: `uv/0.1.21
{"installer":{"name":"uv","version":"0.1.21"},"python":"3.12.2","implementation":{"name":"CPython","version":"3.12.2"},"distro":{"name":"macOS","version":"14.2.1","id":null,"libc":null},"system":{"name":"Darwin","release":"23.2.0"},"cpu":"arm64","openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}`
Distro information (such as the one pip uses `from pip._vendor import
distro` to retrieve instead of `platform` module) was not retrieved from
markers. Instead, the linux release codename/name/version uses
`sys-info` crate, adding about 50us of extra overhead on linux. The
distro osx version re-used the [mac_os version
implementation](99c992e38b/crates/platform-host/src/mac_os.rs)
from #2381 which adds about 20us of overhead on osx. I tried to use
other crates to avoid re-introducing `mac_os.rs` but most of them didn't
yield satisfactory performance (40ms-60ms~) or had the wrong values
needed (e.g. darwin version vs osx version).
I also didn't add libc retrieval or rustc retrieval as those seem to add
substantial overhead due to querying `ldd` or `rustc`. PyPy version
detection was also not added to avoid adding extra overhead to [support
PyPy for
linehaul](https://github.com/pypa/pip/blob/24.0/src/pip/_internal/network/session.py#L123).
All other behavior was kept 1-1 to match what pip's linehaul
implementation does (as of 24.0). This also aligns with what was
discussed in #1958.
## Test Plan
Added new integration test to uv-client.
---------
Co-authored-by: konstin <konstin@mailbox.org>
## Summary
Right now, the middleware doesn't apply credentials that were
_originally_ sourced from a URL. This requires that we call
`with_url_encoded_auth` whenever we create a request to ensure that any
credentials that were passed in as part of an index URL (for example)
are respected.
This PR modifies `uv-auth` to instead apply those credentials in the
middleware itself. This seems preferable to me. As far as I can tell, we
can _only_ add in-URL credentials to the store ourselves (since in-URL
credentials are converted to headers by the time they reach the
middleware). And if we ever _didn't_ apply those credentials to new
URLs, it'd be a bug in the logic that precedes the middleware (i.e., us
forgetting to call `with_url_encoded_auth`).
## Test Plan
`cargo run pip install` with an authenticated index.
<!--
Thank you for contributing to uv! To help us out with reviewing, please
consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Adds basic keyring auth support for `uv` commands. Adds clone of `pip`'s
`--keyring-provider subprocess` argument (using CLI `keyring` tool).
See issue: https://github.com/astral-sh/uv/issues/1520
## Test Plan
<!-- How was it tested? -->
Hard to write full-suite unit tests due to reliance on
`process::Command` for `keyring` cli
Manually tested end-to-end in a project with GCP artifact registry using
keyring password:
```bash
➜ uv pip uninstall watchdog
Uninstalled 1 package in 46ms
- watchdog==4.0.0
➜ cargo run -- pip install --index-url https://<redacted>/python/simple/ --extra-index-url https://<redacted>/pypi-mirror/simple/ watchdog
Finished dev [unoptimized + debuginfo] target(s) in 0.18s
Running `target/debug/uv pip install --index-url 'https://<redacted>/python/simple/' --extra-index-url 'https://<redacted>/pypi-mirror/simple/' watchdog`
error: HTTP status client error (401 Unauthorized) for url (https://<redacted>/pypi-mirror/simple/watchdog/)
➜ cargo run -- pip install --keyring-provider subprocess --index-url https://<redacted>/python/simple/ --extra-index-url https://<redacted>/pypi-mirror/simple/ watchdog
Finished dev [unoptimized + debuginfo] target(s) in 0.17s
Running `target/debug/uv pip install --keyring-provider subprocess --index-url 'https://<redacted>/python/simple/' --extra-index-url 'https://<redacted>/pypi-mirror/simple/' watchdog`
Resolved 1 package in 2.34s
Installed 1 package in 27ms
+ watchdog==4.0.0
```
`requirements.txt`
```
#
# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
# .bin/generate-requirements
#
--index-url https://<redacted>/python/simple/
--extra-index-url https://<redacted>/pypi-mirror/simple/
...
```
```bash
➜ cargo run -- pip install --keyring-provider subprocess -r requirements.txt
Finished dev [unoptimized + debuginfo] target(s) in 0.19s
Running `target/debug/uv pip install --keyring-provider subprocess -r requirements.txt`
Resolved 205 packages in 23.52s
Built <redacted>
...
Downloaded 47 packages in 19.32s
Installed 195 packages in 276ms
+ <redacted>
...
```
---------
Co-authored-by: Thomas Gilgenast <thomas@vant.ai>
Co-authored-by: Zanie Blue <contact@zanie.dev>
## Summary
Small follow up to https://github.com/astral-sh/uv/pull/2362 to check if
`SSL_CERT_FILE` is set to enable `--native-tls` functionality. This
maintains backwards compatibility with `0.1.17` and below users
leveraging only `SSL_CERT_FILE`.
Closes https://github.com/astral-sh/uv/issues/2400
## Test Plan
<!-- How was it tested? -->
Assuming `SSL_CERT_FILE` is already working via `--native-tls`, this is
simply a shortcut to enable `--native-tls` functionality implicitly
while still being able to let `rustls-native-certs` handle the loading
of `SSL_CERT_FILE` instead of ourselves.
Edit: Manually tested by setting up own self-signed CA certificate
bundle and set `SSL_CERT_FILE` to this and confirmed the loading happens
without having to specify `--native-tls`.
## Summary
It turns out that on macOS, reading the native certificates can add
hundreds of milliseconds to client initialization. This PR makes
`--native-tls` a command-line flag, to toggle (at runtime) the choice of
the `webpki` roots or the native system roots.
You can't accomplish this kind of configuration with the `reqwest`
builder API, so instead, I pulled out the heart of that logic from the
crate
(e319263851/src/async_impl/client.rs (L498)),
and modified it to allow toggling a choice of root.
Note that there's an open PR for this in reqwest
(https://github.com/seanmonstar/reqwest/pull/1848), along with an issue
(https://github.com/seanmonstar/reqwest/issues/1843), which I may ping,
but it's been around for a while and I believe reqwest is focused on its
next major release.
Closes https://github.com/astral-sh/uv/issues/2346.
## Summary
Some zip files can't be streamed; in particular, `rs-async-zip` doesn't
support data descriptors right now (though it may in the future). This
PR adds a fallback path for such zips that downloads the entire zip file
to disk, then unzips it from disk (which gives us `Seek`).
Closes https://github.com/astral-sh/uv/issues/2216.
## Test Plan
`cargo run pip install --extra-index-url https://buf.build/gen/python
hashb_foxglove_protocolbuffers_python==25.3.0.1.20240226043130+465630478360
--force-reinstall -n`
## Summary
The netrc middleware we added in
https://github.com/astral-sh/uv/pull/2241 has a slight problem. If you
include credentials in your index URL, _and_ in the netrc file, the
crate blindly adds the netrc credentials as a header. And given the
`ReqwestBuilder` API, this means you end up with _two_ `Authorization`
headers, which always leads to an invalid request, though the exact
failure can take different forms.
This PR removes the middleware crate in favor of our own middleware.
Instead of using the `RequestInitialiser` API, we have to use the
`Middleware` API, so that we can remove the header on the request
itself.
Closes https://github.com/astral-sh/uv/issues/2323.
## Test Plan
- Verified that running against a private index with credentials in the
URL (but no netrc file) worked without error.
- Verified that running against a private index with credentials in the
netrc file (but not the URL) worked without error.
- Verified that running against a private index with a mix of
credentials in both _also_ worked without error.
Extends the "compatibility" types introduced in #1293 to apply to source
distributions as well as wheels.
- We now track the most-relevant incompatible source distribution
- Exclude newer, Python requirements, and yanked versions are all
tracked as incompatibilities in the new model (this lets us remove
`DistMetadata`!)
## Summary
PyPI now supports Metadata 2.2, which means distributions with Metadata
2.2-compliant metadata will start to appear. The upside is that if a
source distribution includes a `PKG-INFO` file with (1) a metadata
version of 2.2 or greater, and (2) no dynamic fields (at least, of the
fields we rely on), we can read the metadata from the `PKG-INFO` file
directly rather than running _any_ of the PEP 517 build hooks.
Closes https://github.com/astral-sh/uv/issues/2009.
## Summary
If we fallback to streaming the wheel (because the registry doesn't
support range requests), we currently don't cache the metadata at all.
This PR fixes that, ensuring that we cache based on the same HTTP
policies, etc.
## Summary
We're seeing reports that Sonatype Nexus isn't working with cached data.
Users are reporting 304 responses that show "Found modified response..."
path in the logs. I can't reproduce this on latest Sonatype Nexus, but
my best guess is that there's a 304 response that is failing our
validators, and we try to use that as if it's a "complete" response?
Closes https://github.com/astral-sh/uv/issues/1754.
## Summary
Add netrc support to the uv-client.
closes#1405
## Test Plan
I've added a corresponding test case to validate the correct header.
Furthermore a tested it against a real world private repository.
## Summary
This is no longer necessary as `AsyncHttpRangeReader` now accepts
`ClientWithMiddleware` -- which is good, because it means all relevant
middleware will be enforced (like offline, or `.netrc` in the future).
## Summary
Allow using http(s) urls for constraints and requirements files handed
to the CLI, by handling paths starting with `http://` or `https://`
differently. This allows commands for such as: `uv pip install -c
https://raw.githubusercontent.com/apache/airflow/constraints-2.8.1/constraints-3.8.txt
requests`.
closes#1332
## Test Plan
Testing install using a `constraints.txt` file hosted on github in the
airflow repository:
fbdc2eba8e/crates/uv/tests/pip_install.rs (L1440-L1484)
## Advice Needed
- filesystem/http dispatch is implemented at a relatively low level (at
`crates/uv-fs/src/lib.rs#read_to_string`). Should I change some naming
here so it is obvious that the function is able to dispatch?
- I kept the CLI argument for -c and -r as a PathBuf, even though now it
is technically either a path or a url. We could either keep this as is
for now, or implement a new enum for this case? The enum could then
handle dispatch to files/http.
- Using another abstraction layer like
https://docs.rs/object_store/latest/object_store/ for the
files/urls/[s3] could work as well, though I ran into a bug during
testing which I couldn't debug
## Summary
We have at least one reported case of this happening. It's preferable
IMO to move on rather than fail hard despite sub-pbar registry behavior.
Closes https://github.com/astral-sh/uv/issues/2099.
## Summary
Closes#1977
This allows us to send uv's version in the `uv-client` User Agent
header.
Here's how request headers look like to a server now:
```
...
Accept: application/vnd.pypi.simple.v1+json, application/vnd.pypi.simple.v1+html;q=0.2, text/html;q=0.01
User-Agent: uv/0.1.13
...
```
~~I went for a mix of Option 1 and 2 from #1977.~~ Open to alternative
naming as well, not tied too strongly here to the names picked.
~~Another possibility for this new crate is that we can use it to
consolidate metadata that exists across crates to ultimately be able to
create linehaul information described in #1958, but I haven't looked
into what those changes might look like.~~
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
<!-- How was it tested? -->
Added initial tests in the new crate to exercise its public API and
added a new test to uv-client to validate the headers using a 1-time
disposable server.
Error for `uv pip compile scripts/requirements/jupyter.in` without
internet:
**Before**
```
error: error sending request for url (https://pypi.org/simple/jupyter/): error trying to connect: dns error: failed to lookup address information: No such host is known. (os error 11001)
Caused by: error trying to connect: dns error: failed to lookup address information: No such host is known. (os error 11001)
Caused by: dns error: failed to lookup address information: No such host is known. (os error 11001)
Caused by: failed to lookup address information: No such host is known. (os error 11001)
```
**After**
```
error: Could not connect, are you offline?
Caused by: error sending request for url (https://pypi.org/simple/django/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
Caused by: error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
Caused by: dns error: failed to lookup address information: Temporary failure in name resolution
Caused by: failed to lookup address information: Temporary failure in name resolution
```
On linux, it would be "Temporary failure in name resolution" instead of
"No such host is known. (os error 11001)".
The implementation checks for "dne error" stringly as hyper errors are
opaque. The danger is that this breaks with a hyper update. We still get
the complete error trace since reqwest eagerly inlines errors
(https://github.com/seanmonstar/reqwest/issues/2147).
No test since i wouldn't know how to simulate this in cargo test.
Fixes#1971