Commit graph

632 commits

Author SHA1 Message Date
Ibraheem Ahmed
39af09f09b
Parallelize resolver (#3627)
## Summary

This PR introduces parallelism to the resolver. Specifically, we can
perform PubGrub resolution on a separate thread, while keeping all I/O
on the tokio thread. We already have the infrastructure set up for this
with the channel and `OnceMap`, which makes this change relatively
simple. The big change needed to make this possible is removing the
lifetimes on some of the types that need to be shared between the
resolver and pubgrub thread.

A related PR, https://github.com/astral-sh/uv/pull/1163, found that
adding `yield_now` calls improved throughput. With optimal scheduling we
might be able to get away with everything on the same thread here.
However, in the ideal pipeline with perfect prefetching, the resolution
and prefetching can run completely in parallel without depending on one
another. While this would be very difficult to achieve, even with our
current prefetching pattern we see a consistent performance improvement
from parallelism.

This does also require reverting a few of the changes from
https://github.com/astral-sh/uv/pull/3413, but not all of them. The
sharing is isolated to the resolver task.

## Test Plan

On smaller tasks performance is mixed with ~2% improvements/regressions
on both sides. However, on medium-large resolution tasks we see the
benefits of parallelism, with improvements anywhere from 10-50%.

```
./scripts/requirements/jupyter.in
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):      29.2 ms ±   1.8 ms    [User: 20.3 ms, System: 29.8 ms]
  Range (min … max):    26.4 ms …  36.0 ms    91 runs
 
Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):      25.5 ms ±   1.0 ms    [User: 19.5 ms, System: 25.5 ms]
  Range (min … max):    23.6 ms …  27.8 ms    99 runs
 
Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.15 ± 0.08 times faster than ./target/profiling/baseline (resolve-warm)
```
```
./scripts/requirements/boto3.in   
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):     487.1 ms ±   6.2 ms    [User: 464.6 ms, System: 61.6 ms]
  Range (min … max):   480.0 ms … 497.3 ms    10 runs
 
Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):     430.8 ms ±   9.3 ms    [User: 529.0 ms, System: 77.2 ms]
  Range (min … max):   417.1 ms … 442.5 ms    10 runs
 
Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.13 ± 0.03 times faster than ./target/profiling/baseline (resolve-warm)
```
```
./scripts/requirements/airflow.in 
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):     478.1 ms ±  18.8 ms    [User: 482.6 ms, System: 205.0 ms]
  Range (min … max):   454.7 ms … 508.9 ms    10 runs
 
Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):     308.7 ms ±  11.7 ms    [User: 428.5 ms, System: 209.5 ms]
  Range (min … max):   287.8 ms … 323.1 ms    10 runs
 
Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.55 ± 0.08 times faster than ./target/profiling/baseline (resolve-warm)
```
2024-05-17 11:47:30 -04:00
Charlie Marsh
70a1782745
Run cargo update to pull in h2 (#3638) 2024-05-17 11:09:54 -04:00
Charlie Marsh
d76b023cd6
Remove resolve_cli.rs and resolve_many.rs (#3606)
## Summary

These depend on some APIs that I want to be internal-only for the
resolver crate, and we're no longer using them.
2024-05-15 16:04:47 +00:00
Charlie Marsh
913fc91afe
Remove binary from uv-virtualenv crate (#3605)
## Summary

I'm doing some refactoring and it requires updating this binary, but I
doubt we really use it?
2024-05-15 16:02:44 +00:00
Zanie Blue
d417daad7e
Bump version to v0.1.44 (#3577) 2024-05-14 09:05:31 -05:00
konsti
c22c7cad4c
Add parsed URL fields to Dist variants (#3429)
Avoid reparsing urls by storing the parsed parts across resolution on
`Dist`.

Part 2 of https://github.com/astral-sh/uv/issues/3408 and part of #3409

Closes #3408
2024-05-14 01:23:27 +00:00
konsti
0010954ca7
Add parsed URL to PubGrubPackage (#3426)
Avoid reparsing urls by storing the parsed parts across resolution on
`PubGrubPackage`.

Part 1 of #3408
2024-05-14 00:55:21 +00:00
Charlie Marsh
5132c6a6e2
Bump version to v0.1.43 (#3564) 2024-05-14 00:38:11 +00:00
Charlie Marsh
e0da977fc9
Get cargo shear passing (#3533)
## Summary

Remove a few unused deps, and ignore `flate2` (which we include for
feature control).
2024-05-13 01:43:45 +00:00
renovate[bot]
84602c467b
Update Rust crate async-channel to v2.3.0 (#3532) 2024-05-13 01:30:55 +00:00
renovate[bot]
a81752a0cf
Update Rust crate zip to v1.2.3 (#3530) 2024-05-12 20:17:34 -04:00
renovate[bot]
3422bea1a7
Update Rust crate axoupdater to v0.6.2 (#3529) 2024-05-12 20:17:18 -04:00
Charlie Marsh
c2452957f9
Remove unused dependencies (#3527)
Surfaced with `cargo shear`.
2024-05-11 13:33:49 -04:00
Ibraheem Ahmed
783df8f657
Consolidate concurrency limits (#3493)
## Summary

This PR consolidates the concurrency limits used throughout `uv` and
exposes two limits, `UV_CONCURRENT_DOWNLOADS` and
`UV_CONCURRENT_BUILDS`, as environment variables.

Currently, `uv` has a number of concurrent streams that it buffers using
relatively arbitrary limits for backpressure. However, many of these
limits are conflated. We run a relatively small number of tasks overall
and should start most things as soon as possible. What we really want to
limit are three separate operations:
- File I/O. This is managed by tokio's blocking pool and we should not
really have to worry about it.
- Network I/O.
- Python build processes.

Because the current limits span a broad range of tasks, it's possible
that a limit meant for network I/O is occupied by tasks performing
builds, reading from the file system, or even waiting on a `OnceMap`. We
also don't limit build processes that end up being required to perform a
download. While this may not pose a performance problem because our
limits are relatively high, it does mean that the limits do not do what
we want, making it tricky to expose them to users
(https://github.com/astral-sh/uv/issues/1205,
https://github.com/astral-sh/uv/issues/3311).

After this change, the limits on network I/O and build processes are
centralized and managed by semaphores. All other tasks are unbuffered
(note that these tasks are still bounded, so backpressure should not be
a problem).
2024-05-10 12:43:08 -04:00
Andrew Gallant
eab2b832a6
uv-resolver: make hashes optional (#3505)
This only makes hashes optional for wheels/sdists that come from
registires or direct URLs. For wheels/sdists that come from other
sources, a hash should not be present.

For path dependencies, a hash should not be present because the state of
the path dependency is not intended to be tracked in the lock file. This
is consistent with how other tools deal with path dependencies, and if
it were otherwise, the hash would I believe need to be updated for every
change to the path dependency.

For git dependencies (source dists only), a hash should not be present
because the lock will contain the specific commit revision hash. This is
functionally equivalent to a hash, and so a hash is redundant.

As part of this change, we validate the presence or absence of a hash
based on the dependency source. We also add our first regression tests.
2024-05-10 10:32:30 -04:00
Charlie Marsh
5e3260933c
Run cargo update (#3488) 2024-05-09 15:09:13 -04:00
Charlie Marsh
367958e6b2
Bump version to v0.1.42 (#3472) 2024-05-08 17:47:16 -04:00
Charlie Marsh
7d41e7d260
Respect MACOSX_DEPLOYMENT_TARGET in --python-platform (#3470)
## Summary

This is universal environment variable used to determine the mac OS
deployment target. We now respect it in `--python-platform` -- so we
default to 12.0, but users can override it as needed.
2024-05-08 20:03:41 +00:00
Charlie Marsh
18d229e2bb
Upgrade async_http_range_reader to v0.8.0 (#3460)
## Summary

Closes #2025.
Closes https://github.com/astral-sh/uv/issues/3255.
Closes https://github.com/astral-sh/uv/pull/2843.
2024-05-08 10:54:08 -04:00
konsti
1ad6aa8a23
Use generic pubgrub incompatibility reason (#3335)
Pubgrub got a new feature where all unavailability is a custom, instead
of the reasonless `UnavailableDependencies` and our custom `String` type
previously (https://github.com/pubgrub-rs/pubgrub/pull/208). This PR
introduces a `UnavailableReason` that tracks either an entire version
being unusable, or a specific version. The error messages now also track
this difference properly.

The pubgrub commit is our main rebased onto the merged
https://github.com/pubgrub-rs/pubgrub/pull/208, i'll push
`konsti/main-rebase-generic-reason` to `main` after checking for rebase
problems.
2024-05-08 08:40:15 +00:00
Charlie Marsh
b0d7a26431
Bump version to v0.1.41 (#3446) 2024-05-08 01:05:47 +00:00
Charlie Marsh
7fe9f0a3dd
Bump version to v0.1.40 (#3433) 2024-05-07 17:29:29 +00:00
renovate[bot]
1efaf3fd31
Update Rust crate junction to v1.1.0 (#3393) 2024-05-06 02:41:15 +00:00
renovate[bot]
aec9573d42
Update Rust crate mailparse to 0.15.0 (#3394) 2024-05-06 02:29:55 +00:00
renovate[bot]
c125ce394a
Update Rust crate rmp-serde to v1.3.0 (#3395) 2024-05-06 02:29:24 +00:00
renovate[bot]
5a8d13a05a
Update Rust crate axoupdater to 0.6.0 (#3392) 2024-05-06 02:12:42 +00:00
renovate[bot]
1cf07ebf08
Update Rust crate zip to v1.1.4 (#3390) 2024-05-06 02:12:21 +00:00
renovate[bot]
409f38c7f5
Update Rust crate tokio-util to v0.7.11 (#3389) 2024-05-06 02:09:04 +00:00
renovate[bot]
cce8e0dd7d
Update Rust crate test-log to v0.2.16 (#3388) 2024-05-06 02:06:27 +00:00
renovate[bot]
c06266874e
Update Rust crate serde to v1.0.200 (#3387) 2024-05-06 02:04:30 +00:00
renovate[bot]
e6efe48d52
Update Rust crate cargo-util to v0.2.11 (#3386) 2024-05-06 02:02:11 +00:00
renovate[bot]
0042533dc3
Update Rust crate base64 to v0.22.1 (#3385) 2024-05-06 02:01:09 +00:00
renovate[bot]
a9756872e6
Update Rust crate anstream to v0.6.14 (#3384) 2024-05-06 02:00:17 +00:00
Charlie Marsh
37635fda56
Update activation scripts from virtualenv (#3376)
## Summary

Refreshes some of the activation scripts, and fixes some bugs in
`activate_this.py` that were likely the rest of some erroneous
copy-pasting.

Closes https://github.com/astral-sh/uv/issues/3346.

## Test Plan

```
❯ python
Python 3.12.0 (main, Feb 28 2024, 09:44:16) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import httpx
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'httpx'
>>> activator = '.venv/bin/activate_this.py'
>>> with open(activator) as f:
...     exec(f.read(), {'__file__': activator})
...
>>> import httpx
```
2024-05-04 23:30:00 +00:00
Charlie Marsh
69e99b3502
Use canonical URLs in satisfaction check (#3373)
## Summary

Closes https://github.com/astral-sh/uv/issues/3367.
2024-05-04 12:44:25 +00:00
Charlie Marsh
100935f4f1
Respect editable = true setting in sources map (#3363)
## Summary

We need to partition the editable and non-editable requirements. As-is,
`editable = true` requirements were still being installed as
non-editable.
2024-05-04 02:17:55 +00:00
konsti
4f87edbe66
Add basic tool.uv.sources support (#3263)
## Introduction

PEP 621 is limited. Specifically, it lacks
* Relative path support
* Editable support
* Workspace support
* Index pinning or any sort of index specification

The semantics of urls are a custom extension, PEP 440 does not specify
how to use git references or subdirectories, instead pip has a custom
stringly format. We need to somehow support these while still stying
compatible with PEP 621.

## `tool.uv.source`

Drawing inspiration from cargo, poetry and rye, we add `tool.uv.sources`
or (for now stub only) `tool.uv.workspace`:

```toml
[project]
name = "albatross"
version = "0.1.0"
dependencies = [
  "tqdm >=4.66.2,<5",
  "torch ==2.2.2",
  "transformers[torch] >=4.39.3,<5",
  "importlib_metadata >=7.1.0,<8; python_version < '3.10'",
  "mollymawk ==0.1.0"
]

[tool.uv.sources]
tqdm = { git = "https://github.com/tqdm/tqdm", rev = "cc372d09dcd5a5eabdc6ed4cf365bdb0be004d44" }
importlib_metadata = { url = "https://github.com/python/importlib_metadata/archive/refs/tags/v7.1.0.zip" }
torch = { index = "torch-cu118" }
mollymawk = { workspace = true }

[tool.uv.workspace]
include = [
  "packages/mollymawk"
]

[tool.uv.indexes]
torch-cu118 = "https://download.pytorch.org/whl/cu118"
```

See `docs/specifying_dependencies.md` for a detailed explanation of the
format. The basic gist is that `project.dependencies` is what ends up on
pypi, while `tool.uv.sources` are your non-published additions. We do
support the full range or PEP 508, we just hide it in the docs and
prefer the exploded table for easier readability and less confusing with
actual url parts.

This format should eventually be able to subsume requirements.txt's
current use cases. While we will continue to support the legacy `uv pip`
interface, this is a piece of the uv's own top level interface. Together
with `uv run` and a lockfile format, you should only need to write
`pyproject.toml` and do `uv run`, which generates/uses/updates your
lockfile behind the scenes, no more pip-style requirements involved. It
also lays the groundwork for implementing index pinning.

## Changes

This PR implements:
* Reading and lowering `project.dependencies`,
`project.optional-dependencies` and `tool.uv.sources` into a new
requirements format, including:
  * Git dependencies
  * Url dependencies
  * Path dependencies, including relative and editable
* `pip install` integration
* Error reporting for invalid `tool.uv.sources`
* Json schema integration (works in pycharm, see below)
* Draft user-level docs (see `docs/specifying_dependencies.md`)

It does not implement:
* No `pip compile` testing, deprioritizing towards our own lockfile
* Index pinning (stub definitions only)
* Development dependencies
* Workspace support (stub definitions only)
* Overrides in pyproject.toml
* Patching/replacing dependencies

One technically breaking change is that we now require user provided
pyproject.toml to be valid wrt to PEP 621. Included files still fall
back to PEP 517. That means `pip install -r requirements.txt` requires
it to be valid while `pip install -r requirements.txt` with `-e .` as
content falls back to PEP 517 as before.

## Implementation

The `pep508` requirement is replaced by a new `UvRequirement` (name up
for bikeshedding, not particularly attached to the uv prefix). The still
existing `pep508_rs::Requirement` type is a url format copied from pip's
requirements.txt and doesn't appropriately capture all features we
want/need to support. The bulk of the diff is changing the requirement
type throughout the codebase.

We still use `VerbatimUrl` in many places, where we would expect a
parsed/decomposed url type, specifically:
* Reading core metadata except top level pyproject.toml files, we fail a
step later instead if the url isn't supported.
* Allowed `Urls`.
* `PackageId` with a custom `CanonicalUrl` comparison, instead of
canonicalizing urls eagerly.
* `PubGrubPackage`: We eventually convert the `VerbatimUrl` back to a
`Dist` (`Dist::from_url`), instead of remembering the url.
* Source dist types: We use verbatim url even though we know and require
that these are supported urls we can and have parsed.

I tried to make improve the situation be replacing `VerbatimUrl`, but
these changes would require massive invasive changes (see e.g.
https://github.com/astral-sh/uv/pull/3253). A main problem is the ref
`VersionOrUrl` and applying overrides, which assume the same
requirement/url type everywhere. In its current form, this PR increases
this tech debt.

I've tried to split off PRs and commits, but the main refactoring is
still a single monolith commit to make it compile and the tests pass.

## Demo

Adding
d1ae3b85d5/pyproject.json
as json schema (v7) to pycharm for `pyproject.toml`, you can try the IDE
support already:


![pycharm](599082c7-6be5-41c1-a3cd-516092382f8d)


[dove.webm](c293c272-c80b-459d-8c95-8c46a8d198a1)
2024-05-03 21:10:50 +00:00
Zanie Blue
630d3fde5c
Merge uv-toolchain and uv-interpreter (#3265)
Moves all of `uv-toolchain` into `uv-interpreter`. We may split these
out in the future, but the refactoring I want to do for interpreter
discovery is easier if I don't have to deal with entanglement. Includes
some restructuring of `uv-interpreter`.

Part of #2386
2024-04-30 17:49:46 +00:00
Ibraheem Ahmed
1d2c57a259
Run resolve/install benchmarks in ci (#3281)
## Summary

Runs resolver benchmarks in CI with CodSpeed.
2024-04-30 13:39:42 -04:00
Andrew Gallant
d2e7c0554b
uv-resolver: add initial version of universal lock file format (#3314)
This is meant to be a base on which to build. There are some parts
which are implicitly incomplete and others which are explicitly
incomplete. The latter are indicated by TODO comments.

Here is a non-exhaustive list of incomplete things. In many cases, these
are incomplete simply because the data isn't present in a
`ResolutionGraph`. Future work will need to refactor our resolver so
that this data is correctly passed down.

* Not all wheels are included. Only the "selected" wheel for the current
  distribution is included.
* Marker expressions are always absent.
* We don't emit hashes for certainly kinds of distributions (direct
  URLs, git, and path).
* We don't capture git information from a dependency specification.
  Right now, we just always emit "default branch."

There are perhaps also other changes we might want to make to the format
of a more cosmetic nature. Right now, all arrays are encoded using
whatever the `toml` crate decides to do. But we might want to exert more
control over this. For example, by using inline tables or squashing more
things into strings (like I did for `Source` and `Hash`). I think the
main trade-off here is that table arrays are somewhat difficult to read
(especially without indentation), where as squashing things down into a
more condensed format potentially makes future compatible additions
harder.

I also went pretty light on the documentation here than what I would
normally do. That's primarily because I think this code is going to
go through some evolution and I didn't want to spend too much time
documenting something that is likely to change.

Finally, here's an example of the lock file format in TOML for the
`anyio` dependency. I generated it with the following command:

```
cargo run -p uv -- pip compile -p3.10 ~/astral/tmp/reqs/anyio.in --unstable-uv-lock-file
```

And that writes out a `uv.lock` file:

```toml
version = 1

[[distribution]]
name = "anyio"
version = "4.3.0"
source = "registry+https://pypi.org/simple"

[[distribution.wheel]]
url = "2f20c40b45/anyio-4.3.0-py3-none-any.whl"
hash = "sha256:048e05d0f6caeed70d731f3db756d35dcc1f35747c8c403364a8332c630441b8"

[[distribution.dependencies]]
name = "exceptiongroup"
version = "1.2.1"
source = "registry+https://pypi.org/simple"

[[distribution.dependencies]]
name = "idna"
version = "3.7"
source = "registry+https://pypi.org/simple"

[[distribution.dependencies]]
name = "sniffio"
version = "1.3.1"
source = "registry+https://pypi.org/simple"

[[distribution.dependencies]]
name = "typing-extensions"
version = "4.11.0"
source = "registry+https://pypi.org/simple"

[[distribution]]
name = "exceptiongroup"
version = "1.2.1"
source = "registry+https://pypi.org/simple"

[[distribution.wheel]]
url = "79fe92dd41/exceptiongroup-1.2.1-py3-none-any.whl"
hash = "sha256:5258b9ed329c5bbdd31a309f53cbfb0b155341807f6ff7606a1e801a891b29ad"

[[distribution]]
name = "idna"
version = "3.7"
source = "registry+https://pypi.org/simple"

[[distribution.wheel]]
url = "741d8c8280/idna-3.7-py3-none-any.whl"
hash = "sha256:82fee1fc78add43492d3a1898bfa6d8a904cc97d8427f683ed8e798d07761aa0"

[[distribution]]
name = "sniffio"
version = "1.3.1"
source = "registry+https://pypi.org/simple"

[[distribution.wheel]]
url = "75a9c94214/sniffio-1.3.1-py3-none-any.whl"
hash = "sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2"

[[distribution]]
name = "typing-extensions"
version = "4.11.0"
source = "registry+https://pypi.org/simple"

[[distribution.wheel]]
url = "936e209267/typing_extensions-4.11.0-py3-none-any.whl"
hash = "sha256:c1f94d72897edaf4ce775bb7558d5b79d8126906a14ea5ed1635921406c0387a"
```
2024-04-29 14:03:17 -04:00
konsti
1344cfae4b
Use fs_err for cachedir errors (#3304)
When running

```
set UV_CACHE_DIR=%LOCALAPPDATA%\uv\cache-foo && uv venv venv
```

in windows CMD, the error would be just

```
error: The system cannot find the path specified. (os error 3)
```

The problem is that the first action in the cache dir is adding the tag,
and the `cachedir` crate is using `std::fs` instead of `fs_err`. I've
copied the two functions we use from the crate and changed the import
from `std::fs` to `fs_err`.

The new error is

```
error: failed to open file `C:\Users\Konstantin\AppData\Local\uv\cache-foo \CACHEDIR.TAG`
  Caused by: The system cannot find the path specified. (os error 3)
```

which correctly explains the problem.

Closes #3280
2024-04-29 16:33:10 +02:00
renovate[bot]
22d8619c37
Update Rust crate data-encoding to v2.6.0 (#3302) 2024-04-29 09:31:59 -05:00
renovate[bot]
f014bb60ce
Update Rust crate zip to v1.1.2 (#3300) 2024-04-29 09:31:06 -05:00
renovate[bot]
f5214a2f73
Update Rust crate unicode-width to v0.1.12 (#3299) 2024-04-29 09:30:35 -05:00
renovate[bot]
d66f5590f6
Update Rust crate serde to v1.0.199 (#3298) 2024-04-29 09:30:01 -05:00
renovate[bot]
e25dbd9cfb
Update Rust crate schemars to v0.8.17 (#3297) 2024-04-29 09:29:45 -05:00
renovate[bot]
4f8017a0c6
Update Rust crate mimalloc to v0.1.41 (#3296) 2024-04-29 09:29:32 -05:00
renovate[bot]
8f3bc97146
Update Rust crate async-compression to v0.4.9 (#3294) 2024-04-29 09:29:15 -05:00
renovate[bot]
af3be68be3
Update Rust crate flate2 to v1.0.30 (#3295) 2024-04-29 09:26:23 -05:00
Charlie Marsh
028b407411
Bump version to v0.1.39 (#3286) 2024-04-27 07:18:16 -04:00