Commit graph

51 commits

Author SHA1 Message Date
Charlie Marsh
2ff3b380b1
Cache downloaded wheel when range requests aren't supported (#5089)
## Summary

When range requests aren't supported, we fall back to streaming the
wheel, stopping as soon as we hit a `METADATA` file. This is a small
optimization, but the downside is that we don't get to cache the
resulting wheel...

We don't know whether `METADATA` will be at the beginning or end of the
wheel, but it _seems_ like a better tradeoff to download and cache the
entire wheel?

Closes: https://github.com/astral-sh/uv/issues/5088.

Sort of a revert of: https://github.com/astral-sh/uv/pull/1792.
2024-07-16 09:21:47 -04:00
messense
38504dcaee
Download wheel to disk when streaming unzip failed with HTTP streaming error (#5094)
## Summary

Workaround the `stream_wheel` not retry issue
[found](https://github.com/astral-sh/uv/issues/3514#issuecomment-2229820667)
in #3514, it's not a perfect solution but I think it's acceptable
because the error should not occur frequently.

## Test Plan

Manually using `iptables -A OUTPUT -p tcp -dport 3128 -j REJECT
--reject-with tcp-reset` to inject connection reset error to the HTTP
proxy that proxies PyPI requests.

```
error: Failed to prepare distributions
  Caused by: Failed to fetch wheel: piqp==0.4.1
  Caused by: Request failed after 3 retries
  Caused by: error sending request for url (09ade94dfd/piqp-0.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl)
  Caused by: client error (Connect)
  Caused by: tcp connect error: Connection refused (os error 111)
  Caused by: Connection refused (os error 111)
```
2024-07-16 09:00:46 -04:00
Ibraheem Ahmed
d833910a5d
Avoid reparsing wheel URLs (#4947)
## Summary

We currently store wheel URLs in an unparsed state because we don't have
a stable parsed representation to use with rykv. Unfortunately this
means we end up reparsing unnecessarily in a lot of places, especially
when constructing a `Lock`. This PR adds a `UrlString` type that lets us
avoid reparsing without losing the validity of the `Url`.

## Test Plan

Shaves off another ~10 ms from
https://github.com/astral-sh/uv/issues/4860.

```
➜  transformers hyperfine "../../uv/target/profiling/uv lock" "../../uv/target/profiling/baseline lock" --warmup 3
Benchmark 1: ../../uv/target/profiling/uv lock
  Time (mean ± σ):     120.9 ms ±   2.5 ms    [User: 126.0 ms, System: 80.6 ms]
  Range (min … max):   116.8 ms … 125.7 ms    23 runs
 
Benchmark 2: ../../uv/target/profiling/baseline lock
  Time (mean ± σ):     129.9 ms ±   4.2 ms    [User: 127.1 ms, System: 86.1 ms]
  Range (min … max):   123.4 ms … 141.2 ms    23 runs

Summary
  ../../uv/target/profiling/uv lock ran
    1.07 ± 0.04 times faster than ../../uv/target/profiling/baseline lock
```
2024-07-10 05:16:30 -04:00
Charlie Marsh
ff72bb9bcc
Read content length from response rather than request (#4488)
## Summary

I might be mistaken, but I think we need to read the header from the
response, not the request. The request would only contain headers that
we set.

I verified (with extra logging) that the request header is `None` while
PyPI returns a valid length in the response header.
2024-06-24 15:58:21 -04:00
Charlie Marsh
f07308823e
Add --emit-build-options flag to uv pip compile interface (#4463)
## Summary

Closes https://github.com/astral-sh/uv/issues/4420.
2024-06-24 12:25:01 +00:00
Zanie Blue
5a007b6b9f
Add BuildOptions for centralized combination of NoBuild and NoBinary (#4284)
As requested in review of https://github.com/astral-sh/uv/pull/4067
2024-06-12 21:33:33 +00:00
Zanie Blue
1ab4041baa
Allow specific --only-binary and --no-binary packages to override :all: (#4067)
Updates `--no-binary <package>` to take precedence over `--only-binary
:all:` and `--only-binary <package>` to take precedence over
`--no-binary :all:`.

I'm not entirely sure about this behavior, e.g. maybe I provided
`--only-binary :all:` later on the command line and really want it to
override those earlier arguments of `--no-binary <package>` for safety.
Right now we just fail to solve though since we can't satisfy the
overlapping requests.

Closes https://github.com/astral-sh/uv/issues/4063
2024-06-12 15:47:45 -05:00
Charlie Marsh
656fc427b9
Add support for local directories with --index-url (#4226)
## Summary

Closes #4078.
2024-06-10 22:27:04 -04:00
Charlie Marsh
191f9556b7
Avoid building packages with dynamic versions (#4058)
## Summary

This PR separates "gathering the requirements" from the rest of the
metadata (e.g., version), which isn't required when installing a
package's _dependencies_ (as opposed to installing the package itself).
It thus ensures that we don't need to build a package when a static
`pyproject.toml` is provided in `pip compile`.

Closes https://github.com/astral-sh/uv/issues/4040.
2024-06-05 18:11:58 +00:00
konsti
72b1642232
Move metadata into its own file (#3939)
Move `Metadata`, `MetadataLoweringError` and `ArchiveMetadata` into
their own file `metadata.rs` in `uv-distribution`, moving it out from
`lib.rs`. No functional changes.
2024-05-31 15:24:10 +02:00
konsti
081f20c53e
Add support for tool.uv into distribution building (#3904)
With the change, we remove the special casing of workspace dependencies
and resolve `tool.uv` for all git and directory distributions. This
gives us support for non-editable workspace dependencies and path
dependencies in other workspaces. It removes a lot of special casing
around workspaces. These changes are the groundwork for supporting
`tool.uv` with dynamic metadata.

The basis for this change is moving `Requirement` from
`distribution-types` to `pypi-types` and the lowering logic from
`uv-requirements` to `uv-distribution`. This changes should be split out
in separate PRs.

I've included an example workspace `albatross-root-workspace2` where
`bird-feeder` depends on `a` from another workspace `ab`. There's a
bunch of failing tests and regressed error messages that still need
fixing. It does fix the audited package count for the workspace tests.
2024-05-31 02:42:03 +00:00
Charlie Marsh
09f55482a0
Remove some unused pub use exports (#3930) 2024-05-30 22:26:52 -04:00
Charlie Marsh
1fc6a59707
Remove special-casing for editable requirements (#3869)
## Summary

There are a few behavior changes in here:

- We now enforce `--require-hashes` for editables, like pip. So if you
use `--require-hashes` with an editable requirement, we'll reject it. I
could change this if it seems off.
- We now treat source tree requirements, editable or not (e.g., both `-e
./black` and `./black`) as if `--refresh` is always enabled. This
doesn't mean that we _always_ rebuild them; but if you pass
`--reinstall`, then yes, we always rebuild them. I think this is an
improvement and is close to how editables work today.

Closes #3844.

Closes #2695.
2024-05-28 15:49:34 +00:00
Ibraheem Ahmed
7dc322665c
Concurrent progress bars (#3252)
## Summary

Implements concurrent progress bars. Resolves
https://github.com/astral-sh/uv/issues/1209.

## Test Plan

b21bdfbb-8817-4873-a65c-16c9e8c7c460
2024-05-27 01:21:07 +00:00
Andrew Gallant
018a7150d6
uv-distribution: include all wheels in distribution types (#3595)
Our current flow of data from "simple registry package" to "final
resolved distribution" goes through a number of types:

* `SimpleMetadata` is the API response from a registry that includes all
published versions for a package. Each version has an assortment of
metadata
associated with it.
* `VersionFiles` is the aforementioned metadata. It is split in two: a
group of files for source distributions and a group of files for wheels.
* `PrioritizedDist` collects a subset of the files from `VersionFiles`
to form a selection of the "best" sdist and the "best" wheel for the
current environment.
* `CompatibleDist` is created from a borrowed `PrioritizedDist` that,
perhaps among other things, encapsulates the decision of whether to pick
an sdist or a wheel. (This decision depends both on compatibility and
the action being performed. e.g., When doing installation, a
`CompatibleDist` will sometimes select an sdist over a wheel.)
* `ResolvedDistRef` is like a `ResolvedDist`, but borrows a `Dist`.
* `ResolvedDist` is the almost-final-form of a distribution in a
resolution and is created from a `ResolvedDistRef`.
* `AnnotatedResolvedDist` is a new data type that is the actual final
form of a distribution that a universal lock file cares about. It
bundles a `ResolvedDist` with some metadata needed to generate a lock
file.

One of the requirements of a universal lock file is that we include all
wheels (and maybe all source distributions? but at least one if it's
present) associated with a distribution. But the above flow of data (in
the step from `VersionFiles` to `PrioritizedDist`) drops all wheels
except for the best one.

To remedy this, in this PR, we rejigger `PrioritizedDist`,
`CompatibleDist` and `ResolvedDistRef` so that all wheel data is
preserved. And when a `ResolvedDistRef` is finally turned into a
`ResolvedDist`, we copy all of the wheel data. And finally, we adjust
the `Lock` constructor to read this new data and include it in the lock
file. To make this work, we also modify `RegistryBuiltDist` so that it
can contain one or more wheels instead of just one.

One shortcoming here (called out in the code as a FIXME) is that if a
source distribution is selected as the "best" thing to use (perhaps
there are no compatible wheels), then the wheels won't end up in the
lock file. I plan to fix this in a follow-up PR.

We also aren't totally consistent on source distribution naming.
Sometimes we use `sdist`. Sometimes `source`. Sometimes `source_dist`.
I think it'd be nice to just use `sdist` everywhere, but I do prefer
the type names to be `SourceDist`. And sometimes you want function
names to match the type names (i.e., `from_source_dist`), which in turn
leads to an appearance of inconsistency. I'm open to ideas.

Closes #3351
2024-05-15 15:07:28 -04:00
Ibraheem Ahmed
783df8f657
Consolidate concurrency limits (#3493)
## Summary

This PR consolidates the concurrency limits used throughout `uv` and
exposes two limits, `UV_CONCURRENT_DOWNLOADS` and
`UV_CONCURRENT_BUILDS`, as environment variables.

Currently, `uv` has a number of concurrent streams that it buffers using
relatively arbitrary limits for backpressure. However, many of these
limits are conflated. We run a relatively small number of tasks overall
and should start most things as soon as possible. What we really want to
limit are three separate operations:
- File I/O. This is managed by tokio's blocking pool and we should not
really have to worry about it.
- Network I/O.
- Python build processes.

Because the current limits span a broad range of tasks, it's possible
that a limit meant for network I/O is occupied by tasks performing
builds, reading from the file system, or even waiting on a `OnceMap`. We
also don't limit build processes that end up being required to perform a
download. While this may not pose a performance problem because our
limits are relatively high, it does mean that the limits do not do what
we want, making it tricky to expose them to users
(https://github.com/astral-sh/uv/issues/1205,
https://github.com/astral-sh/uv/issues/3311).

After this change, the limits on network I/O and build processes are
centralized and managed by semaphores. All other tasks are unbuffered
(note that these tasks are still bounded, so backpressure should not be
a problem).
2024-05-10 12:43:08 -04:00
Ibraheem Ahmed
94cf604574
Remove unnecessary uses of DashMap and Arc (#3413)
## Summary

All of the resolver code is run on the main thread, so a lot of the
`Send` bounds and uses of `DashMap` and `Arc` are unnecessary. We could
also switch to using single-threaded versions of `Mutex` and `Notify` in
some places, but there isn't really a crate that provides those I would
be comfortable with using.

The `Arc` in `OnceMap` can't easily be removed because of the uv-auth
code which uses the
[reqwest-middleware](https://docs.rs/reqwest-middleware/latest/reqwest_middleware/trait.Middleware.html)
crate, that seems to adds unnecessary `Send` bounds because of
`async-trait`. We could duplicate the code and create a `OnceMapLocal`
variant, but I don't feel that's worth it.
2024-05-06 22:30:43 -04:00
Charlie Marsh
32f129c245
Store IDs rather than paths in the cache (#2985)
## Summary

Similar to `Revision`, we now store IDs in the `Archive` entires rather
than absolute paths. This makes the cache robust to moves, etc.

Closes https://github.com/astral-sh/uv/issues/2908.
2024-04-10 21:07:51 -04:00
Charlie Marsh
5583b90c30
Create dedicated abstractions for .rev and .http pointers (#2977)
## Summary

This PR formalizes some of the concepts we use in the cache for
"pointers to things".

In the wheel cache, we have files like
`annotated_types-0.6.0-py3-none-any.http`. This represents an unzipped
wheel, cached alongside an HTTP caching policy. We now have a struct for
this to encapsulate the logic: `HttpArchivePointer`.

Similarly, we have files like `annotated_types-0.6.0-py3-none-any.rev`.
This represents an unzipped local wheel, alongside with a timestamp. We
now have a struct for this to encapsulate the logic:
`LocalArchivePointer`.

We have similar structs for source distributions too.
2024-04-10 17:30:27 -04:00
Charlie Marsh
006379c50c
Add support for URL requirements in --generate-hashes (#2952)
## Summary

This PR enables hash generation for URL requirements when the user
provides `--generate-hashes` to `pip compile`. While we include the
hashes from the registry already, today, we omit hashes for URLs.

To power hash generation, we introduce a `HashPolicy` abstraction:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HashPolicy<'a> {
    /// No hash policy is specified.
    None,
    /// Hashes should be generated (specifically, a SHA-256 hash), but not validated.
    Generate,
    /// Hashes should be validated against a pre-defined list of hashes. If necessary, hashes should
    /// be generated so as to ensure that the archive is valid.
    Validate(&'a [HashDigest]),
}
```

All of the methods on the distribution database now accept this policy,
instead of accepting `&'a [HashDigest]`.

Closes #2378.
2024-04-10 20:02:45 +00:00
Charlie Marsh
8513d603b4
Return computed hashes from metadata requests (#2951)
## Summary

This PR modifies the distribution database to return both the
`Metadata23` and the computed hashes when clients request metadata.

No behavior changes, but this will be necessary to power
`--generate-hashes`.
2024-04-10 19:31:41 +00:00
Charlie Marsh
1f3b5bb093
Add hash-checking support to install and sync (#2945)
## Summary

This PR adds support for hash-checking mode in `pip install` and `pip
sync`. It's a large change, both in terms of the size of the diff and
the modifications in behavior, but it's also one that's hard to merge in
pieces (at least, with any test coverage) since it needs to work
end-to-end to be useful and testable.

Here are some of the most important highlights:

- We store hashes in the cache. Where we previously stored pointers to
unzipped wheels in the `archives` directory, we now store pointers with
a set of known hashes. So every pointer to an unzipped wheel also
includes its known hashes.
- By default, we don't compute any hashes. If the user runs with
`--require-hashes`, and the cache doesn't contain those hashes, we
invalidate the cache, redownload the wheel, and compute the hashes as we
go. For users that don't run with `--require-hashes`, there will be no
change in performance. For users that _do_, the only change will be if
they don't run with `--generate-hashes` -- then they may see some
repeated work between resolution and installation, if they use `pip
compile` then `pip sync`.
- Many of the distribution types now include a `hashes` field, like
`CachedDist` and `LocalWheel`.
- Our behavior is similar to pip, in that we enforce hashes when pulling
any remote distributions, and when pulling from our own cache. Like pip,
though, we _don't_ enforce hashes if a distribution is _already_
installed.
- Hash validity is enforced in a few different places:
1. During resolution, we enforce hash validity based on the hashes
reported by the registry. If we need to access a source distribution,
though, we then enforce hash validity at that point too, prior to
running any untrusted code. (This is enforced in the distribution
database.)
2. In the install plan, we _only_ add cached distributions that have
matching hashes. If a cached distribution is missing any hashes, or the
hashes don't match, we don't return them from the install plan.
3. In the downloader, we _only_ return distributions with matching
hashes.
4. The final combination of "things we install" are: (1) the wheels from
the cache, and (2) the downloaded wheels. So this ensures that we never
install any mismatching distributions.
- Like pip, if `--require-hashes` is provided, we require that _all_
distributions are pinned with either `==` or a direct URL. We also
require that _all_ distributions have hashes.

There are a few notable TODOs:

- We don't support hash-checking mode for unnamed requirements. These
should be _somewhat_ rare, though? Since `pip compile` never outputs
unnamed requirements. I can fix this, it's just some additional work.
- We don't automatically enable `--require-hashes` with a hash exists in
the requirements file. We require `--require-hashes`.

Closes #474.

## Test Plan

I'd like to add some tests for registries that report incorrect hashes,
but otherwise: `cargo test`
2024-04-10 19:09:03 +00:00
Zanie Blue
1512e07a2e
Split configuration options out of uv-types (#2924)
Needed to prevent circular dependencies in my toolchain work (#2931). I
think this is probably a reasonable change as we move towards persistent
configuration too?

Unfortunately `BuildIsolation` needs to be in `uv-types` to avoid
circular dependencies still. We might be able to resolve that in the
future.
2024-04-09 11:35:53 -05:00
Charlie Marsh
c46772eec5
Add a layer of indirection to the local path-based wheel cache (#2909)
## Summary

Right now, the path-based wheel cache just looks at the symlink to the
archives directory, checks the timestamp on it, and continues with that
symlink as long as the timestamp is up-to-date.

The HTTP-based wheel meanwhile, uses an intermediary `.http` file, which
includes the HTTP caching information. The `.http` file's payload is
just a path pointing to an entry in the archives directory.

This PR modifies the path-based codepaths to use a similar cache file,
which stores a timestamp along with a path to the archives directory.
The main advantage here is that we can add other data to this cache file
(namely, hashes in the future).

## Test Plan

Beyond existing tests, I also verified that this doesn't require a
version bump:

```
git checkout main 
cargo run pip install ~/Downloads/zeal-0.0.1-py3-none-any.whl --cache-dir baz --reinstall
git checkout charlie/manifest
cargo run pip install ~/Downloads/zeal-0.0.1-py3-none-any.whl --cache-dir baz --reinstall
cargo run pip install ~/Downloads/zeal-0.0.1-py3-none-any.whl --cache-dir baz --reinstall --refresh
```
2024-04-08 19:32:59 +00:00
Charlie Marsh
134810c547
Respect cached local --find-links in install plan (#2907)
## Summary

I think this is kind of just an oversight. If a wheel is available via
`--find-links`, and the index is "local", we never find it in the cache.

## Test Plan

`cargo test`
2024-04-08 18:58:33 +00:00
Charlie Marsh
31a67f539f
Remove unused local wheel types (#2906)
## Summary

No behavior changes. Just removing unused code.
2024-04-08 18:15:20 +00:00
Charlie Marsh
1daa35176f
Always return unzipped wheels from the distribution database (#2905)
## Summary

In all cases, we unzip these immediately after returning. By moving the
unzipping into the database, we can remove a bunch of code (coming in a
separate PR), and pave the way for hash-checking, since hash generation
will _also_ happen in the database, and splitting the caching layers
across the database and the unzipper creates complications.

Closes #2863.
2024-04-08 14:07:17 -04:00
Charlie Marsh
10dfd43af9
DRY up HTTP request builder in source database (#2902) 2024-04-08 14:45:26 +00:00
Charlie Marsh
f11a5e2208
DRY up local wheel path in distribution database (#2901) 2024-04-08 10:40:17 -04:00
Charlie Marsh
d4a258422b
DRY up request builer in wheel download (#2857) 2024-04-07 02:12:03 +00:00
Charlie Marsh
dc2c289dff
Upgrade rs-async-zip to support data descriptors (#2809)
## Summary

Upgrading `rs-async-zip` enables us to support data descriptors in
streaming. This both greatly improves performance for indexes that use
data descriptors _and_ ensures that we support them in a few other
places (e.g., zipped source distributions created in Finder).

Closes #2808.
2024-04-04 01:31:40 +00:00
Charlie Marsh
189d0d41d0
Remove redirects from the resolver (#2792)
## Summary

Rather than storing the `redirects` on the resolver, this PR just
re-uses the "convert this URL to precise" logic when we convert to a
`Resolution` after-the-fact. I think this is a lot simpler: it removes
state from the resolver, and simplifies a lot of the hooks around
distribution fetching (e.g., `get_or_build_wheel_metadata` no longer
returns `(Metadata23, Option<Url>)`).
2024-04-03 02:43:57 +00:00
Charlie Marsh
dfdcce68fd
Split get_or_build_wheel_metadata into distinct methods (#2765)
## Summary

Internal refactor which makes `DistributionDatabase` for unnamed
requirements (or, at least, source distributions).
2024-04-01 23:52:21 +00:00
Charlie Marsh
e68cdb1049
Rename to SourceDistributionBuilder (#2750)
## Summary

This is more consistent with `DistributionDatabase`. The order of the
arguments is also now consistent between the two structs.
2024-04-01 02:37:43 +00:00
Charlie Marsh
8596ff3470
Remove Cache argument from DistributionDatabase (#2749)
## Summary

We can access cache from `BuildContext`. This mirrors
`SourceDistCachedBuilder`, which doesn't accept `Cache` as an argument
and always accesses it through `BuildContext`.
2024-03-31 22:22:25 -04:00
Charlie Marsh
f8f7f848f5
Remove Tags from tracing (#2704)
Oops, these show in `tracing` now, and they're huge.
2024-03-28 02:18:54 +00:00
Charlie Marsh
b6ab919945
Make tags non-required for fetching wheel metadata (#2700)
## Summary

This looks like a big change but it really isn't. Rather, I just split
`get_or_build_wheel` into separate `get_wheel` and `build_wheel` methods
internally, which made `get_or_build_wheel_metadata` capable of _not_
relying on `Tags`, which in turn makes it easier for us to use the
`DistributionDatabase` in various places without having it coupled to an
interpreter or environment (something we already did for
`SourceDistributionBuilder`).
2024-03-28 00:06:25 +00:00
Charlie Marsh
ffd78d0821
Add an in-memory cache for Git references (#2682)
## Summary

Ensures that, even if we try to resolve the same Git reference twice
within an invocation, it always returns a (cached) consistent result.

Closes https://github.com/astral-sh/uv/issues/2673.

## Test Plan

```
❯ cargo run pip install git+https://github.com/pallets/flask.git --reinstall --no-cache
   Compiling uv-distribution v0.0.1 (/Users/crmarsh/workspace/uv/crates/uv-distribution)
   Compiling uv-resolver v0.0.1 (/Users/crmarsh/workspace/uv/crates/uv-resolver)
   Compiling uv-installer v0.0.1 (/Users/crmarsh/workspace/uv/crates/uv-installer)
   Compiling uv-dispatch v0.0.1 (/Users/crmarsh/workspace/uv/crates/uv-dispatch)
   Compiling uv-requirements v0.1.0 (/Users/crmarsh/workspace/uv/crates/uv-requirements)
   Compiling uv v0.1.24 (/Users/crmarsh/workspace/uv/crates/uv)
    Finished dev [unoptimized + debuginfo] target(s) in 3.95s
     Running `target/debug/uv pip install 'git+https://github.com/pallets/flask.git' --reinstall --no-cache`
 Updated https://github.com/pallets/flask.git (b90a4f1)
Resolved 7 packages in 280ms
   Built flask @ git+https://github.com/pallets/flask.git@b90a4f1f4a370e92054b9cc9db0efcb864f87ebe                                                                                                                                            Downloaded 7 packages in 212ms
Installed 7 packages in 9ms
```
2024-03-27 01:39:01 +00:00
Zanie Blue
0b08ba1e67
Rename uv-traits and split into separate modules (#2674)
This is driving me a little crazy and is becoming a larger problem in
#2596 where I need to move more types (like `Upgrade` and `Reinstall`)
into this crate. Anything that's shared across our core resolver,
install, and build crates needs to be defined in this crate to avoid
cyclic dependencies. We've outgrown it being a single file with some
shared traits.

There are no behavioral changes here.
2024-03-26 15:39:43 -05:00
Charlie Marsh
31743f2bd8
Use PEP 517 build hooks to resolve unnamed requirements (#2604) 2024-03-22 03:20:40 +00:00
Charlie Marsh
5d7d7dce24
Enable PEP 517 builds for unnamed requirements (#2600)
## Summary

This PR enables the source distribution database to be used with unnamed
requirements (i.e., URLs without a package name). The (significant)
upside here is that we can now use PEP 517 hooks to resolve unnamed
requirement metadata _and_ reuse any computation in the cache.

The changes to `crates/uv-distribution/src/source/mod.rs` are quite
extensive, but mostly mechanical. The core idea is that we introduce a
new `BuildableSource` abstraction, which can either be a distribution,
or an unnamed URL:

```rust
/// A reference to a source that can be built into a built distribution.
///
/// This can either be a distribution (e.g., a package on a registry) or a direct URL.
///
/// Distributions can _also_ point to URLs in lieu of a registry; however, the primary distinction
/// here is that a distribution will always include a package name, while a URL will not.
#[derive(Debug, Clone, Copy)]
pub enum BuildableSource<'a> {
    Dist(&'a SourceDist),
    Url(SourceUrl<'a>),
}
```

All the methods on the source distribution database now accept
`BuildableSource`. `BuildableSource` has a `name()` method, but it
returns `Option<&PackageName>`, and everything is required to work with
and without a package name.

The main drawback of this approach (which isn't a terrible one) is that
we can no longer include the package name in the cache. (We do continue
to use the package name for registry-based distributions, since those
always have a name.). The package name was included in the cache route
for two reasons: (1) it's nice for debugging; and (2) we use it to power
`uv cache clean flask`, to identify the entries that are relevant for
Flask.

To solve this, I changed the `uv cache clean` code to look one level
deeper. So, when we want to determine whether to remove the cache entry
for a given URL, we now look into the directory to see if there are any
wheels that match the package name. This isn't as nice, but it does work
(and we have test coverage for it -- all passing).

I also considered removing the package name from the cache routes for
non-registry _wheels_, for consistency... But, it would require a cache
bump, and it didn't feel important enough to merit that.
2024-03-21 22:46:39 -04:00
Zanie Blue
9c27f92203
Introduce a BaseClient for construction of canonical configured client (#2431)
In preparation for support of
https://github.com/astral-sh/uv/issues/2357 (see
https://github.com/astral-sh/uv/pull/2434)
2024-03-15 12:07:38 -05:00
Charlie Marsh
d9b160b405
Add backoff for transient Windows failures (#2419)
## Summary

This may be required elsewhere, but all the traces in that issue are
related to persisting the temporary directory to our persistent cache,
so lets start there.

See: https://github.com/astral-sh/uv/issues/1491.
2024-03-13 13:16:26 -04:00
Charlie Marsh
a267a501b6
Add Seek fallback for zip files (#2320)
## Summary

Some zip files can't be streamed; in particular, `rs-async-zip` doesn't
support data descriptors right now (though it may in the future). This
PR adds a fallback path for such zips that downloads the entire zip file
to disk, then unzips it from disk (which gives us `Seek`).

Closes https://github.com/astral-sh/uv/issues/2216.

## Test Plan

`cargo run pip install --extra-index-url https://buf.build/gen/python
hashb_foxglove_protocolbuffers_python==25.3.0.1.20240226043130+465630478360
--force-reinstall -n`
2024-03-10 11:39:28 -04:00
Charlie Marsh
6dcd00e031
Move wheel download into a shared method (#2324)
## Summary

No behavioral changes; just taking code that's duplicated between two
branches in `distribution_database.rs` and pulling it into its own
method.
2024-03-09 22:40:00 -05:00
Charlie Marsh
6866a55f20
Add Accept-Encoding: identity to remaining stream paths (#2321)
## Summary

Like #2319, there are a few other places where we attempt to stream a
file.
2024-03-10 02:42:53 +00:00
Jonathon Belotti
e16140a849
Address #2220 (slow download perf against PyPi mirror) (#2319)
## Summary

Addressing the extremely slow performance detailed in
https://github.com/astral-sh/uv/issues/2220. There are two changes to
increase download performance:

1. setting `accept-encoding: identity`, in the spirit of
https://github.com/pypa/pip/pull/1688
2. increasing buffer from 8KiB to 128KiB. 

### 1. accept-encoding: identity

I think this related `pip` PR has a good explanation of what's going on:
https://github.com/pypa/pip/pull/1688

```
  # We use Accept-Encoding: identity here because requests
  # defaults to accepting compressed responses. This breaks in
  # a variety of ways depending on how the server is configured.
  # - Some servers will notice that the file isn't a compressible
  #   file and will leave the file alone and with an empty
  #   Content-Encoding
  # - Some servers will notice that the file is already
  #   compressed and will leave the file alone and will add a
  #   Content-Encoding: gzip header
  # - Some servers won't notice anything at all and will take
  #   a file that's already been compressed and compress it again
  #   and set the Content-Encoding: gzip header
```

The `files.pythonhosted.org` server is the 1st kind. Example debug log I
added in `uv` when installing against PyPI:

<img width="1459" alt="image"
src="ef10d758-46aa-4c8e-9dba-47f33437401b">

(there is no `content-encoding` header in this response, the `whl`
hasn't been compressed, and there is a content-length header)

Our internal mirror is the third case. It does seem sensible that our
mirror should be modified to act like the 1st kind. But `uv` should
handle all three cases like `pip` does.

### 2. buffer increase

In https://github.com/astral-sh/uv/issues/2220 I observed that `pip`'s
downloading was causing up-to 128KiB flushes in our mirror.

After fix 1, `uv` was still only causing up-to 8KiB flushes, and was
slower to download than `pip`. Increasing this buffer from the default
8KiB led to a download performance improvement against our mirror and
the expected observed 128KiB flushes.

## Test Plan

Ran benchmarking as instructed by @charliermarsh 

<img width="1447" alt="image"
src="840d9c8d-4b98-4bfa-89f3-073a2dec1f23">

No performance improvement or regression.
2024-03-09 19:49:29 -05:00
Charlie Marsh
2e9678e5d3
Add support for Metadata 2.2 (#2293)
## Summary

PyPI now supports Metadata 2.2, which means distributions with Metadata
2.2-compliant metadata will start to appear. The upside is that if a
source distribution includes a `PKG-INFO` file with (1) a metadata
version of 2.2 or greater, and (2) no dynamic fields (at least, of the
fields we rely on), we can read the metadata from the `PKG-INFO` file
directly rather than running _any_ of the PEP 517 build hooks.

Closes https://github.com/astral-sh/uv/issues/2009.
2024-03-08 16:02:32 +00:00
Charlie Marsh
6f94dc3c77
Centralize up-to-date checking for path installations (#2168)
## Summary

Internal-only refactor to consolidate multiple codepaths we have for
checking whether a cached or installed entry is up-to-date with a local
requirement.
2024-03-04 14:10:51 -05:00
samypr100
757f8e2f60
feat: improved msg for network timeouts (#1961)
## Summary

Closes #1922

When a timeout occurs, it hints to the user to configure the
`UV_HTTP_TIMEOUT` env var.

Before
```
error: Failed to download distributions
  Caused by: Failed to fetch wheel: torch==2.2.0 
  Caused by: Failed to extract source distribution
  Caused by: request or response body error: operation timed out
  Caused by: operation timed out
```

After
```
error: Failed to download distributions
  Caused by: Failed to fetch wheel: torch==2.2.0 
  Caused by: Failed to extract source distribution
  Caused by: Failed to download distribution due to network timeout. Try increasing UV_HTTP_TIMEOUT.
```

## Test Plan

<!-- How was it tested? -->
Wasn't sure if we'd want a test. If we do, is there a existing mechanism
or preferred approach to force a timeout to occur in tests? Maybe set
the timeout to 1 and add torch as an install check (although it's
possible that could become flaky)?
2024-02-25 21:13:28 +00:00