Commit graph

371 commits

Author SHA1 Message Date
konsti
d0c3146ef6
Restore verbatim in error message (#3402)
Fixup for
https://github.com/astral-sh/uv/pull/3263#discussion_r1589718035
2024-05-06 11:17:06 +00:00
konsti
9de49c8a60
Make pubgrub an allowed ident (#3399)
Followup to #3361, fix some backtick-quoting.
2024-05-06 09:10:37 +00:00
konsti
4f87edbe66
Add basic tool.uv.sources support (#3263)
## Introduction

PEP 621 is limited. Specifically, it lacks
* Relative path support
* Editable support
* Workspace support
* Index pinning or any sort of index specification

The semantics of urls are a custom extension, PEP 440 does not specify
how to use git references or subdirectories, instead pip has a custom
stringly format. We need to somehow support these while still stying
compatible with PEP 621.

## `tool.uv.source`

Drawing inspiration from cargo, poetry and rye, we add `tool.uv.sources`
or (for now stub only) `tool.uv.workspace`:

```toml
[project]
name = "albatross"
version = "0.1.0"
dependencies = [
  "tqdm >=4.66.2,<5",
  "torch ==2.2.2",
  "transformers[torch] >=4.39.3,<5",
  "importlib_metadata >=7.1.0,<8; python_version < '3.10'",
  "mollymawk ==0.1.0"
]

[tool.uv.sources]
tqdm = { git = "https://github.com/tqdm/tqdm", rev = "cc372d09dcd5a5eabdc6ed4cf365bdb0be004d44" }
importlib_metadata = { url = "https://github.com/python/importlib_metadata/archive/refs/tags/v7.1.0.zip" }
torch = { index = "torch-cu118" }
mollymawk = { workspace = true }

[tool.uv.workspace]
include = [
  "packages/mollymawk"
]

[tool.uv.indexes]
torch-cu118 = "https://download.pytorch.org/whl/cu118"
```

See `docs/specifying_dependencies.md` for a detailed explanation of the
format. The basic gist is that `project.dependencies` is what ends up on
pypi, while `tool.uv.sources` are your non-published additions. We do
support the full range or PEP 508, we just hide it in the docs and
prefer the exploded table for easier readability and less confusing with
actual url parts.

This format should eventually be able to subsume requirements.txt's
current use cases. While we will continue to support the legacy `uv pip`
interface, this is a piece of the uv's own top level interface. Together
with `uv run` and a lockfile format, you should only need to write
`pyproject.toml` and do `uv run`, which generates/uses/updates your
lockfile behind the scenes, no more pip-style requirements involved. It
also lays the groundwork for implementing index pinning.

## Changes

This PR implements:
* Reading and lowering `project.dependencies`,
`project.optional-dependencies` and `tool.uv.sources` into a new
requirements format, including:
  * Git dependencies
  * Url dependencies
  * Path dependencies, including relative and editable
* `pip install` integration
* Error reporting for invalid `tool.uv.sources`
* Json schema integration (works in pycharm, see below)
* Draft user-level docs (see `docs/specifying_dependencies.md`)

It does not implement:
* No `pip compile` testing, deprioritizing towards our own lockfile
* Index pinning (stub definitions only)
* Development dependencies
* Workspace support (stub definitions only)
* Overrides in pyproject.toml
* Patching/replacing dependencies

One technically breaking change is that we now require user provided
pyproject.toml to be valid wrt to PEP 621. Included files still fall
back to PEP 517. That means `pip install -r requirements.txt` requires
it to be valid while `pip install -r requirements.txt` with `-e .` as
content falls back to PEP 517 as before.

## Implementation

The `pep508` requirement is replaced by a new `UvRequirement` (name up
for bikeshedding, not particularly attached to the uv prefix). The still
existing `pep508_rs::Requirement` type is a url format copied from pip's
requirements.txt and doesn't appropriately capture all features we
want/need to support. The bulk of the diff is changing the requirement
type throughout the codebase.

We still use `VerbatimUrl` in many places, where we would expect a
parsed/decomposed url type, specifically:
* Reading core metadata except top level pyproject.toml files, we fail a
step later instead if the url isn't supported.
* Allowed `Urls`.
* `PackageId` with a custom `CanonicalUrl` comparison, instead of
canonicalizing urls eagerly.
* `PubGrubPackage`: We eventually convert the `VerbatimUrl` back to a
`Dist` (`Dist::from_url`), instead of remembering the url.
* Source dist types: We use verbatim url even though we know and require
that these are supported urls we can and have parsed.

I tried to make improve the situation be replacing `VerbatimUrl`, but
these changes would require massive invasive changes (see e.g.
https://github.com/astral-sh/uv/pull/3253). A main problem is the ref
`VersionOrUrl` and applying overrides, which assume the same
requirement/url type everywhere. In its current form, this PR increases
this tech debt.

I've tried to split off PRs and commits, but the main refactoring is
still a single monolith commit to make it compile and the tests pass.

## Demo

Adding
d1ae3b85d5/pyproject.json
as json schema (v7) to pycharm for `pyproject.toml`, you can try the IDE
support already:


![pycharm](599082c7-6be5-41c1-a3cd-516092382f8d)


[dove.webm](c293c272-c80b-459d-8c95-8c46a8d198a1)
2024-05-03 21:10:50 +00:00
samypr100
2ffb252498
Update Rust to v1.78 (#3361)
## Summary

Updates rust to 1.78 in `rust-toolchain.toml`

See: https://blog.rust-lang.org/2024/05/02/Rust-1.78.0.html

### Potential blockers

* homebre still on 1.77 -
https://github.com/Homebrew/homebrew-core/pull/170649
* conda-forge still on 1.77 - https://anaconda.org/conda-forge/rust
2024-05-03 20:07:13 +00:00
Andrew Gallant
1089abda3f
require serde and rkyv everywhere; remove optional serde and rkyv features (#3345)
In *some* places in our crates, `serde` (and `rkyv`) are optional
dependencies. I believe this was done out of reasons of "good sense,"
that is, it follows a Rust ecosystem pattern where serde integration
tends to be an opt-in crate feature. (And similarly for `rkyv`.)

However, ultimately, `uv` itself requires `serde` and `rkyv` to
function. Since our crates are strictly internal, there are limited
consumers for our crates without `serde` (and `rkyv`) enabled. I think
one possibility is that optional `serde` (and `rkyv`) integration means
that someone can do this:

    cargo test -p pep440_rs

And this will run tests _without_ `serde` or `rkyv` enabled. That in
turn could lead to faster iteration time by reducing compile times. But,
I'm not sure this is worth supporting. The iterative compilation times
of
individual crates are probably fast enough in debug mode, even with
`serde` and `rkyv` enabled. Namely, `serde` and `rkyv` themselves
shouldn't need to be re-compiled in most cases. On `main`:

```
from-scratch: `cargo test -p pep440_rs --lib` 0.685
incremental: `cargo test -p pep440_rs --lib` 0.278s
from-scratch: `cargo test -p pep440_rs --features serde,rkyv --lib` 3.948s
incremental: `cargo test -p pep440_rs --features serde,rkyv --lib` 0.321s
```

So while a from-scratch build does take significantly longer, an
incremental build is about the same.

The benefit of doing this change is two-fold:

1. It brings out crates into alignment with "reality." In particular,
   some crates were _implicitly_ relying on `serde` being enabled
   without explicitly declaring it. This technically means that our
   `Cargo.toml`s were wrong in some cases, but it is hard to observe it
   because of feature unification in a Cargo workspace.
2. We no longer need to deal with the cognitive burden of writing
   `#[cfg_attr(feature = "serde", ...)]` everywhere.
2024-05-03 10:21:03 -04:00
Andrew Gallant
7772e6249f
add basic "install from lock file" operation (#3340)
This PR principally adds a routine for converting a `Lock` to a
`Resolution`, where a `Resolution` is a map of package names pinned to
a specific version.

I'm not sure that a `Resolution` is ultimately what we want here (we
might need more stuff), but this was the quickest route I could find to
plug a `Lock` into our existing `uv pip install` infrastructure.

This commit also does a little refactoring of the `Lock` types. The
main thing is to permit extra state on some of the types (like a
`by_id` map on `Lock` for quick lookups of distributions) that aren't
included in the serialization format of a `Lock`. We achieve this
by defining separate `Wire` types that are automatically converted
to-and-from via `serde`.

Note that like with the lock file format types themselves, we leave a
few `todo!()` expressions around. The main idea is to get something
minimally working without spending too much effort here. (A fair bit
of refactoring will be required to generate a lock file, and it's
not clear how much this code will wind up needing to change anyway.)
In particular, we only handle the case of installing wheels from a
registry.

A demonstration of the full flow:

```
$ requirements.in
anyio
$ cargo run -p uv -- pip compile -p3.10 requirements.in --unstable-uv-lock-file
$ uv venv
$ cargo run -p uv -- pip install --unstable-uv-lock-file anyio -r requirements.in
Installed 5 packages in 7ms
 + anyio==4.3.0
 + exceptiongroup==1.2.1
 + idna==3.7
 + sniffio==1.3.1
 + typing-extensions==4.11.0
```

In order to install from a lock file, we start from the root and do a
breadth first traversal over its dependencies. We aren't yet filtering
on marker expressions (since they aren't in the lock file yet), but we
should be able to add that in the future. In so doing, the traversal
should select only the subset of distributions relevant for the current
platform.
2024-05-03 08:18:36 -04:00
Ibraheem Ahmed
c5cd808876
Remove uncondtional serde usage in uv-resolver (#3317)
## Summary

Makes the `serde` implementations added in https://github.com/astral-sh/uv/pull/3314 conditional on uv-resolver's `serde` feature.
2024-04-29 16:31:37 -04:00
Andrew Gallant
d2e7c0554b
uv-resolver: add initial version of universal lock file format (#3314)
This is meant to be a base on which to build. There are some parts
which are implicitly incomplete and others which are explicitly
incomplete. The latter are indicated by TODO comments.

Here is a non-exhaustive list of incomplete things. In many cases, these
are incomplete simply because the data isn't present in a
`ResolutionGraph`. Future work will need to refactor our resolver so
that this data is correctly passed down.

* Not all wheels are included. Only the "selected" wheel for the current
  distribution is included.
* Marker expressions are always absent.
* We don't emit hashes for certainly kinds of distributions (direct
  URLs, git, and path).
* We don't capture git information from a dependency specification.
  Right now, we just always emit "default branch."

There are perhaps also other changes we might want to make to the format
of a more cosmetic nature. Right now, all arrays are encoded using
whatever the `toml` crate decides to do. But we might want to exert more
control over this. For example, by using inline tables or squashing more
things into strings (like I did for `Source` and `Hash`). I think the
main trade-off here is that table arrays are somewhat difficult to read
(especially without indentation), where as squashing things down into a
more condensed format potentially makes future compatible additions
harder.

I also went pretty light on the documentation here than what I would
normally do. That's primarily because I think this code is going to
go through some evolution and I didn't want to spend too much time
documenting something that is likely to change.

Finally, here's an example of the lock file format in TOML for the
`anyio` dependency. I generated it with the following command:

```
cargo run -p uv -- pip compile -p3.10 ~/astral/tmp/reqs/anyio.in --unstable-uv-lock-file
```

And that writes out a `uv.lock` file:

```toml
version = 1

[[distribution]]
name = "anyio"
version = "4.3.0"
source = "registry+https://pypi.org/simple"

[[distribution.wheel]]
url = "2f20c40b45/anyio-4.3.0-py3-none-any.whl"
hash = "sha256:048e05d0f6caeed70d731f3db756d35dcc1f35747c8c403364a8332c630441b8"

[[distribution.dependencies]]
name = "exceptiongroup"
version = "1.2.1"
source = "registry+https://pypi.org/simple"

[[distribution.dependencies]]
name = "idna"
version = "3.7"
source = "registry+https://pypi.org/simple"

[[distribution.dependencies]]
name = "sniffio"
version = "1.3.1"
source = "registry+https://pypi.org/simple"

[[distribution.dependencies]]
name = "typing-extensions"
version = "4.11.0"
source = "registry+https://pypi.org/simple"

[[distribution]]
name = "exceptiongroup"
version = "1.2.1"
source = "registry+https://pypi.org/simple"

[[distribution.wheel]]
url = "79fe92dd41/exceptiongroup-1.2.1-py3-none-any.whl"
hash = "sha256:5258b9ed329c5bbdd31a309f53cbfb0b155341807f6ff7606a1e801a891b29ad"

[[distribution]]
name = "idna"
version = "3.7"
source = "registry+https://pypi.org/simple"

[[distribution.wheel]]
url = "741d8c8280/idna-3.7-py3-none-any.whl"
hash = "sha256:82fee1fc78add43492d3a1898bfa6d8a904cc97d8427f683ed8e798d07761aa0"

[[distribution]]
name = "sniffio"
version = "1.3.1"
source = "registry+https://pypi.org/simple"

[[distribution.wheel]]
url = "75a9c94214/sniffio-1.3.1-py3-none-any.whl"
hash = "sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2"

[[distribution]]
name = "typing-extensions"
version = "4.11.0"
source = "registry+https://pypi.org/simple"

[[distribution.wheel]]
url = "936e209267/typing_extensions-4.11.0-py3-none-any.whl"
hash = "sha256:c1f94d72897edaf4ce775bb7558d5b79d8126906a14ea5ed1635921406c0387a"
```
2024-04-29 14:03:17 -04:00
Yorick
43181f1933
Implement --index-strategy unsafe-best-match (#3138)
## Summary

This index strategy resolves every package to the latest possible
version across indexes. If a version is in multiple indexes, the first
available index is selected.

Implements #3137 

This closely matches pip.

## Test Plan

Good question. I'm hesitant to use my certifi example here, since that
would inevitably break when torch removes this package. Please comment!
2024-04-27 01:24:54 +00:00
Ibraheem Ahmed
20e9589662
Combine dependency clauses with the same root (#3225)
## Summary

Simplifies dependency errors of the form `you require package-a and you
require package-b` to `you require package-a and package-b`. Resolves
https://github.com/astral-sh/uv/issues/1009.
2024-04-24 12:34:32 -04:00
Andrew Gallant
0b84eb0140
once-map: avoid hard-coding Arc (#3242)
The only thing a `OnceMap` really needs to be able to do with the value
is to clone it. All extant uses benefited from having this done for them
by automatically wrapping values in an `Arc`. But this isn't necessarily
true for all things. For example, a value might have an `Arc` internally
to making cloning cheap in other contexts, and it doesn't make sense to
re-wrap it in an `Arc` just to use it with a `OnceMap`. Or
alternatively, cloning might just be cheap enough on its own that an
`Arc` isn't worth it.
2024-04-24 11:11:46 -04:00
Charlie Marsh
84989a3f49
Unroll self-dependencies via extras (#3230)
## Summary

We now recursively expand any self-dependencies via extras, which lets
us detect conflicts sooner and avoid building unnecessary versions of
packages that are excluded via the extra.

Closes https://github.com/astral-sh/uv/issues/3135.
2024-04-24 07:51:56 -04:00
Charlie Marsh
8b711d2e4d
Avoid adding extras when expanding constraints (#3232)
## Summary

See the diff in the tests. If you have a constraint with an extra, we
should respect it, but we shouldn't _add_ the extra to the requirements.
2024-04-24 02:00:27 +00:00
Charlie Marsh
14f05f27b3
Add ticks around error messages more consistently (#3004)
## Summary

I found some of these too bare (e.g., when they _just_ show a package
name with no other information). For me, this makes it easier to
differentiate error message copy from data. But open to other opinions.
Take a look at the fixture changes and LMK!
2024-04-22 23:58:36 +00:00
Charlie Marsh
792a917a97
Restrict observed requirements to direct when --no-deps is specified (#3191)
## Summary

This PR avoids: (1) using the lookahead resolver when `--no-deps` is
specified (we'll never use those requirements), and (2) including any
transitive requirements when searching for allowed URLs, etc., when
`--no-deps` is specified.

Closes https://github.com/astral-sh/uv/issues/3183.
2024-04-22 17:17:58 +00:00
Charlie Marsh
a4f125ca34
Avoid waiting for metadata for --no-deps editables (#3188)
## Summary

We don't emit a request for this, so we shouldn't wait for it either --
we already have the metadata!

Closes https://github.com/astral-sh/uv/issues/3184.
2024-04-22 16:29:19 +00:00
konsti
82c4772e89
Move unnamed requirements to their own pep508_rs module and requirements-txt (#3186)
Another refactoring in preparation of using a richer requirements type.
No functional changes, only moves code around
2024-04-22 14:02:39 +00:00
Charlie Marsh
fda378fd29
Avoid preferring constrained over unconstrained packages (#3148)
## Summary

pip prefers somewhat-constrained over unconstrained packages... but only
if they're at equal depths in the tree. We don't have a way to track the
latter property yet (I've added a TODO), so for now, we should remove
this constraint -- it seems to be counter-productive.

I've filed https://github.com/astral-sh/uv/issues/3149 as a follow-up.

Closes https://github.com/astral-sh/uv/issues/3143

## Test Plan

- `git clone https://github.com/drivendataorg/zamba.git`
- `cat "-e .[tests]" > req.in`
- `cargo run venv && cargo run pip compile req.in --refresh -n
--python-platform linux --python-version 3.8`
2024-04-19 23:30:08 +00:00
Charlie Marsh
a241bc79b1
Add priorities for editables (#3133)
## Summary

We weren't setting a priority for editables, so they were being visited
last.

I think there's still a problem whereby we're not aggressive enough in
visiting recursive extras (and, in fact, that's making it really hard to
write a test -- I wrote a test, but the most-reduced case still fails,
and I'd need to add a layer of indirection to make it
fail-on-main-but-pass-on-this-branch), but that problem likely already
existed on main prior to #3087, so I just want to get this quick fix out
now.

Closes https://github.com/astral-sh/uv/issues/3127.

## Test Plan

- `git clone https://github.com/cda-tum/mqt-core.git`
- `cargo run venv`
- `cargo run pip install 'scikit-build-core[pyproject]>=0.8.1'
'setuptools_scm>=7' 'pybind11>=2.12' --resolution=lowest-direct`
- `cargo run pip install --no-build-isolation
'-ve.[test,qiskit,evaluation,coverage]' --resolution=lowest-direct`
2024-04-19 02:04:58 +00:00
Charlie Marsh
2e88bb6f1b
Add a proxy layer for extras (#3100)
Given requirements like:

```
black==23.1.0
black[colorama]
```

The resolver will (on `main`) add a dependency on Black, and then try to
use the most recent version of Black to satisfy `black[colorama]`. For
sake of example, assume `black==24.0.0` is the most recent version. Once
the selects this most recent version, it'll fetch the metadata, then
return the dependencies for `black==24.0.0` with the `colorama` extra
enabled. Finally, it will tack on `black==24.0.0` (a dependency on the
base package). The resolver will then detect a conflict between
`black==23.1.0` and `black==24.0.0`, and throw out
`black[colorama]==24.0.0`, trying to next most-recent version.

This is both wasteful and can cause problems, since we're fetching
metadata for versions that will _never_ satisfy the resolver. In the
`apache-airflow[all]` case, I also ran into an issue whereby we were
attempting to build very old versions of `apache-airflow` due to
`apache-airflow[pandas]`, which in turn led to resolution failures.

The solution proposed here is that we create a new proxy package with
exactly two dependencies: one on `black` and one of `black[colorama]`.
Both of these packages must be at the same version as the proxy package,
so the resolver knows much _earlier_ that (in the above example) the
extra variant _must_ match `23.1.0`.
2024-04-19 01:04:59 +00:00
Chan Kang
8c7d0a31e6
Hide password in the index printed via --emit-index-annotation (#3112)
## Summary

resolves https://github.com/astral-sh/uv/issues/3106

## Test Plan

added a simple test where the password provided in `UV_INDEX_URL` is
hidden in the output as expected.
2024-04-18 03:59:44 +00:00
Charlie Marsh
7fb2bf816f
Add JSON Schema support (#3046)
## Summary

This PR adds JSON Schema support. The setup mirrors Ruff's own.
2024-04-17 17:24:41 +00:00
Charlie Marsh
b456fa2939
Incorporate heuristics to improve package prioritization (#3087)
See: https://github.com/astral-sh/uv/issues/3078
2024-04-17 14:21:42 +00:00
konsti
d1b07a3f49
Log versions tried from batch prefetch (#3090)
This is required for evaluating #3087.

This also removes tracking of virtual packages from extras from the
batch prefetcher (we only track real packages).

Let's look at some stats:
* jupyter: Tried 100 versions: anyio 1, argon2-cffi 1,
argon2-cffi-bindings 1, arrow 1, asttokens 1, async-lru 1, attrs 1,
babel 1, beautifulsoup4 1, bleach 1, certifi 1, cffi 1,
charset-normalizer 1, comm 1, debugpy 1, decorator 1, defusedxml 1,
exceptiongroup 1, executing 1, fastjsonschema 1, fqdn 1, h11 1, httpcore
1, httpx 1, idna 1, ipykernel 1, ipython 1, ipywidgets 1, isoduration 1,
jedi 1, jinja2 1, json5 1, jsonpointer 1, jsonschema 1,
jsonschema-specifications 1, jupyter 1, jupyter-client 1,
jupyter-console 1, jupyter-core 1, jupyter-events 1, jupyter-lsp 1,
jupyter-server 1, jupyter-server-terminals 1, jupyterlab 1,
jupyterlab-pygments 1, jupyterlab-server 1, jupyterlab-widgets 1,
markupsafe 1, matplotlib-inline 1, mistune 1, nbclient 1, nbconvert 1,
nbformat 1, nest-asyncio 1, notebook 1, notebook-shim 1, overrides 1,
packaging 1, pandocfilters 1, parso 1, pexpect 1, platformdirs 1,
prometheus-client 1, prompt-toolkit 1, psutil 1, ptyprocess 1, pure-eval
1, pycparser 1, pygments 1, python-dateutil 1, python-json-logger 1,
pyyaml 1, pyzmq 1, qtconsole 1, qtpy 1, referencing 1, requests 1,
rfc3339-validator 1, rfc3986-validator 1, root 1, rpds-py 1, send2trash
1, six 1, sniffio 1, soupsieve 1, stack-data 1, terminado 1, tinycss2 1,
tomli 1, tornado 1, traitlets 1, types-python-dateutil 1,
typing-extensions 1, uri-template 1, urllib3 1, wcwidth 1, webcolors 1,
webencodings 1, websocket-client 1, widgetsnbextension 1
* boto3: botocore 1697, boto3 849, urllib3 2, jmespath 1,
python-dateutil 1, root 1, s3transfer 1, six 1
* transformers-extras: Tried 1191 versions: sagemaker 152, hypothesis
67, tensorflow 21, jsonschema 19, tensorflow-cpu 18, multiprocess 10,
pathos 10, tensorflow-text 10, chex 8, tf-keras 8, tf2onnx 8, aiohttp 6,
aiosignal 6, alembic 6, annotated-types 6, apscheduler 6, attrs 6,
backoff 6, binaryornot 6, black 6, boto3 6, click 6, coloredlogs 6,
colorlog 6, dash 6, dash-bootstrap-components 6, dlinfo 6,
exceptiongroup 6, execnet 6, fire 6, frozenlist 6, gitdb 6, google-auth
6, google-auth-oauthlib 6, hjson 6, iniconfig 6, jinja2-time 6, markdown
6, markdown-it-py 6, markupsafe 6, mpmath 6, namex 6, nbformat 6, ninja
6, nvidia-nvjitlink-cu12 6, onnxconverter-common 6, pandas 6, plac 6,
platformdirs 6, pluggy 6, portalocker 6, poyo 6, protobuf3-to-dict 6,
py-cpuinfo 6, py3nvml 6, pyarrow 6, pyarrow-hotfix 6, pydantic-core 6,
pygments 6, pynvml 6, pypng 6, python-slugify 6, responses 6,
smdebug-rulesconfig 6, soupsieve 6, sqlalchemy 6,
tensorboard-data-server 6, tensorboard-plugin-wit 6, tensorboardx 6,
threadpoolctl 6, tomli 6, wasabi 6, wcwidth 6, werkzeug 6, wheel 6,
xxhash 6, zipp 6, etils 5, tensorboard 5, beautifulsoup4 4, cffi 4,
clldutils 4, codecarbon 4, datasets 4, dill 4, evaluate 4, gitpython 4,
hf-doc-builder 4, kenlm 4, librosa 4, llvmlite 4, nest-asyncio 4, nltk
4, optuna 4, parameterized 4, phonemizer 4, psutil 4, pyctcdecode 4,
pytest 4, pytest-timeout 4, pytest-xdist 4, ray 4, rjieba 4, rouge-score
4, ruff 4, sacrebleu 4, sacremoses 4, sigopt 4, sortedcontainers 4,
tensorstore 4, timeout-decorator 4, toolz 4, torchaudio 4, accelerate 3,
audioread 3, cookiecutter 3, decorator 3, deepspeed 3, faiss-cpu 3, flax
3, fugashi 3, ipadic 3, isort 3, jax 3, jaxlib 3, joblib 3, keras-nlp 3,
lazy-loader 3, numba 3, optax 3, pooch 3, pydantic 3, pygtrie 3, rhoknp
3, scikit-learn 3, segments 3, soundfile 3, soxr 3, sudachidict-core 3,
sudachipy 3, torch 3, unidic 3, unidic-lite 3, urllib3 3, absl-py 2,
arrow 2, astunparse 2, async-timeout 2, botocore 2, cachetools 2,
certifi 2, chardet 2, charset-normalizer 2, csvw 2, dash-core-components
2, dash-html-components 2, dash-table 2, diffusers 2, dm-tree 2,
fastjsonschema 2, flask 2, flatbuffers 2, fsspec 2, ftfy 2, gast 2,
google-pasta 2, greenlet 2, grpcio 2, h5py 2, humanfriendly 2, idna 2,
importlib-metadata 2, importlib-resources 2, jinja2 2, jmespath 2,
jupyter-core 2, kagglehub 2, keras 2, keras-core 2, keras-preprocessing
2, libclang 2, mako 2, mdurl 2, ml-dtypes 2, msgpack 2, multidict 2,
mypy-extensions 2, networkx 2, nvidia-cublas-cu12 2,
nvidia-cuda-cupti-cu12 2, nvidia-cuda-nvrtc-cu12 2,
nvidia-cuda-runtime-cu12 2, nvidia-cudnn-cu12 2, nvidia-cufft-cu12 2,
nvidia-curand-cu12 2, nvidia-cusolver-cu12 2, nvidia-cusparse-cu12 2,
nvidia-nccl-cu12 2, nvidia-nvtx-cu12 2, onnx 2, onnxruntime 2,
onnxruntime-tools 2, opencv-python 2, opt-einsum 2, orbax-checkpoint 2,
pathspec 2, plotly 2, pox 2, ppft 2, pyasn1-modules 2, pycparser 2,
pyrsistent 2, python-dateutil 2, pytz 2, requests-oauthlib 2, retrying
2, rich 2, rsa 2, s3transfer 2, scipy 2, setuptools 2, six 2, smmap 2,
sympy 2, tabulate 2, tensorflow-estimator 2, tensorflow-hub 2,
tensorflow-io-gcs-filesystem 2, termcolor 2, text-unidecode 2, traitlets
2, triton 2, typing-extensions 2, tzdata 2, tzlocal 2, wrapt 2,
xmltodict 2, yarl 2, Python 1, av 1, babel 1, bibtexparser 1, blinker 1,
colorama 1, decord 1, filelock 1, huggingface-hub 1, isodate 1,
itsdangerous 1, language-tags 1, lxml 1, numpy 1, oauthlib 1, packaging
1, pillow 1, protobuf 1, pyasn1 1, pylatexenc 1, pyparsing 1, pyyaml 1,
rdflib 1, regex 1, requests 1, rfc3986 1, root 1, safetensors 1,
sentencepiece 1, tenacity 1, timm 1, tokenizers 1, torchvision 1, tqdm
1, transformers 1, types-python-dateutil 1, uritemplate 1


You can reproduce them with python 3.10 and:
```
RUST_LOG=uv_resolver=debug cargo run pip compile -o /dev/null -q scripts/requirements/<input>.in 2>&1 | tail -n 1
```

Closes #2270 - This is less invasive compared to the other PR, we can
revisit number of network/build request tracking later.
2024-04-17 09:08:21 +00:00
Charlie Marsh
b3f98d5e05
Use kebab-case for serde enums (#3080)
By default, these serialize as (e.g.) `LowestDirect`. This now matches
the format we use in Ruff.
2024-04-17 01:13:39 +00:00
Charlie Marsh
295b58ad37
Add uv-workspace crate with settings discovery and deserialization (#3007)
## Summary

This PR adds basic struct definitions along with a "workspace" concept
for discovering settings. (The "workspace" terminology is used to match
Ruff; I did not invent it.)

A few notes:

- We discover any `pyproject.toml` or `uv.toml` file in any parent
directory of the current working directory. (We could adjust this to
look at the directories of the input files.)
- We don't actually do anything with the configuration yet; but those
PRs are large and I want this to be reviewed in isolation.
2024-04-16 13:56:47 -04:00
konsti
88d6a55dbf
Show package name in no version for direct dependency error (#3056)
Fixes #3053
2024-04-16 07:57:13 +00:00
Charlie Marsh
1f626bfc73
Move ExcludeNewer into its own type (#3041)
## Summary

This makes it easier to add (e.g.) JSON Schema derivations to the type.

If we have support for other dates in the future, we can generalize it
to a `UserDate` or similar.
2024-04-15 20:24:08 +00:00
Charlie Marsh
96c3c2e774
Support unnamed requirements in --require-hashes (#2993)
## Summary

This PR enables `--require-hashes` with unnamed requirements. The key
change is that `PackageId` becomes `VersionId` (since it refers to a
package at a specific version), and the new `PackageId` consists of
_either_ a package name _or_ a URL. The hashes are keyed by `PackageId`,
so we can generate the `RequiredHashes` before we have names for all
packages, and enforce them throughout.

Closes #2979.
2024-04-11 11:26:50 -04:00
Charlie Marsh
006379c50c
Add support for URL requirements in --generate-hashes (#2952)
## Summary

This PR enables hash generation for URL requirements when the user
provides `--generate-hashes` to `pip compile`. While we include the
hashes from the registry already, today, we omit hashes for URLs.

To power hash generation, we introduce a `HashPolicy` abstraction:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HashPolicy<'a> {
    /// No hash policy is specified.
    None,
    /// Hashes should be generated (specifically, a SHA-256 hash), but not validated.
    Generate,
    /// Hashes should be validated against a pre-defined list of hashes. If necessary, hashes should
    /// be generated so as to ensure that the archive is valid.
    Validate(&'a [HashDigest]),
}
```

All of the methods on the distribution database now accept this policy,
instead of accepting `&'a [HashDigest]`.

Closes #2378.
2024-04-10 20:02:45 +00:00
Charlie Marsh
8513d603b4
Return computed hashes from metadata requests (#2951)
## Summary

This PR modifies the distribution database to return both the
`Metadata23` and the computed hashes when clients request metadata.

No behavior changes, but this will be necessary to power
`--generate-hashes`.
2024-04-10 19:31:41 +00:00
Charlie Marsh
c18551fd3c
Fall back to distributions without hashes in resolver (#2949)
## Summary

This represents a change to `--require-hashes` in the event that we
don't find a matching hash from the registry. The behavior in this PR is
closer to pip's.

Prior to this PR, if a distribution had no reported hash, or only
mismatched hashes, we would mark it as incompatible. Now, we mark it as
compatible, but we use the hash-agreement as part of the ordering, such
that we prefer any distribution with a matching hash, then any
distribution with no hash, then any distribution with a mismatched hash.

As a result, if an index reports incorrect hashes, but the user provides
the correct one, resolution now succeeds, where it would've failed.

Similarly, if an index omits hashes altogether, but the user provides
the correct one, resolution now succeeds, where it would've failed.

If we end up picking a distribution whose hash ultimately doesn't match,
we'll reject it later, after resolution.
2024-04-10 19:19:47 +00:00
Charlie Marsh
1f3b5bb093
Add hash-checking support to install and sync (#2945)
## Summary

This PR adds support for hash-checking mode in `pip install` and `pip
sync`. It's a large change, both in terms of the size of the diff and
the modifications in behavior, but it's also one that's hard to merge in
pieces (at least, with any test coverage) since it needs to work
end-to-end to be useful and testable.

Here are some of the most important highlights:

- We store hashes in the cache. Where we previously stored pointers to
unzipped wheels in the `archives` directory, we now store pointers with
a set of known hashes. So every pointer to an unzipped wheel also
includes its known hashes.
- By default, we don't compute any hashes. If the user runs with
`--require-hashes`, and the cache doesn't contain those hashes, we
invalidate the cache, redownload the wheel, and compute the hashes as we
go. For users that don't run with `--require-hashes`, there will be no
change in performance. For users that _do_, the only change will be if
they don't run with `--generate-hashes` -- then they may see some
repeated work between resolution and installation, if they use `pip
compile` then `pip sync`.
- Many of the distribution types now include a `hashes` field, like
`CachedDist` and `LocalWheel`.
- Our behavior is similar to pip, in that we enforce hashes when pulling
any remote distributions, and when pulling from our own cache. Like pip,
though, we _don't_ enforce hashes if a distribution is _already_
installed.
- Hash validity is enforced in a few different places:
1. During resolution, we enforce hash validity based on the hashes
reported by the registry. If we need to access a source distribution,
though, we then enforce hash validity at that point too, prior to
running any untrusted code. (This is enforced in the distribution
database.)
2. In the install plan, we _only_ add cached distributions that have
matching hashes. If a cached distribution is missing any hashes, or the
hashes don't match, we don't return them from the install plan.
3. In the downloader, we _only_ return distributions with matching
hashes.
4. The final combination of "things we install" are: (1) the wheels from
the cache, and (2) the downloaded wheels. So this ensures that we never
install any mismatching distributions.
- Like pip, if `--require-hashes` is provided, we require that _all_
distributions are pinned with either `==` or a direct URL. We also
require that _all_ distributions have hashes.

There are a few notable TODOs:

- We don't support hash-checking mode for unnamed requirements. These
should be _somewhat_ rare, though? Since `pip compile` never outputs
unnamed requirements. I can fix this, it's just some additional work.
- We don't automatically enable `--require-hashes` with a hash exists in
the requirements file. We require `--require-hashes`.

Closes #474.

## Test Plan

I'd like to add some tests for registries that report incorrect hashes,
but otherwise: `cargo test`
2024-04-10 19:09:03 +00:00
Charlie Marsh
48ba7df98a
Move FlatIndex into the uv-resolver crate (#2972)
## Summary

This lets us remove circular dependencies (in the future, e.g., #2945)
that arise from `FlatIndex` needing a bunch of resolver-specific
abstractions (like incompatibilities, required hashes, etc.) that aren't
necessary to _fetch_ the flat index entries.
2024-04-10 14:38:42 -04:00
Charlie Marsh
a9d554fa90
Add a --require-hashes command-line setting (#2824)
## Summary

I'll likely only merge this once the PR chain is further along, but this
PR wires up the setting fro the CLI.
2024-04-10 14:07:03 -04:00
Chan Kang
7cd98d2499
Implement --emit-index-annotation to annotate source index for each package (#2926)
## Summary
resolves https://github.com/astral-sh/uv/issues/2852

## Test Plan
add a couple of tests:
- one covering the simplest case with all packages pulled from a single
index.
- another where packages are pull from two distinct indices.

tested manually as well:
```
$ (echo 'pandas'; echo 'torch') | UV_EXTRA_INDEX_URL='https://download.pytorch.org/whl/cpu' cargo run pip compile - --include-indices 
    Finished dev [unoptimized + debuginfo] target(s) in 0.60s
     Running `target/debug/uv pip compile - --include-indices`
Resolved 15 packages in 686ms
# This file was autogenerated by uv via the following command:
#    uv pip compile - --include-indices
filelock==3.9.0
    # via torch
    # from https://download.pytorch.org/whl/cpu
fsspec==2023.4.0
    # via torch
    # from https://download.pytorch.org/whl/cpu
jinja2==3.1.2
    # via torch
    # from https://download.pytorch.org/whl/cpu
markupsafe==2.1.3
    # via jinja2
    # from https://download.pytorch.org/whl/cpu
mpmath==1.3.0
    # via sympy
    # from https://download.pytorch.org/whl/cpu
networkx==3.2.1
    # via torch
    # from https://download.pytorch.org/whl/cpu
numpy==1.26.3
    # via pandas
    # from https://download.pytorch.org/whl/cpu
pandas==2.2.1
    # from https://pypi.org/simple
python-dateutil==2.9.0.post0
    # via pandas
    # from https://pypi.org/simple
pytz==2024.1
    # via pandas
    # from https://pypi.org/simple
six==1.16.0
    # via python-dateutil
    # from https://pypi.org/simple
sympy==1.12
    # via torch
    # from https://download.pytorch.org/whl/cpu
torch==2.2.2
    # from https://download.pytorch.org/whl/cpu
typing-extensions==4.8.0
    # via torch
    # from https://download.pytorch.org/whl/cpu
tzdata==2024.1
    # via pandas
    # from https://pypi.org/simple
```
2024-04-10 16:05:58 +00:00
Charlie Marsh
d8323551f8
Update hashes without --upgrade if not present (#2966)
## Summary

If the user runs with `--generate-hashes`, and the lockfile doesn't
contain _any_ hashes for a package (despite being pinned), we should add
new hashes. This mirrors running `uv pip compile --generate-hashes` for
the first time with an existing lockfile.

Closes #2962.
2024-04-10 14:56:34 +00:00
Charlie Marsh
c4472ebbb9
Enforce and backtrack on invalid versions in source metadata (#2954)
## Summary

If we build a source distribution from the registry, and the version
doesn't match that of the filename, we should error, just as we do for
mismatched package names. However, we should also backtrack here, which
we didn't previously.

Closes https://github.com/astral-sh/uv/issues/2953.

## Test Plan

Verified that `cargo run pip install docutils --verbose --no-cache
--reinstall` installs `docutils==0.21` instead of the invalid
`docutils==0.21.post1`.

In the logs, I see:

```
WARN Unable to extract metadata for docutils: Package metadata version `0.21` does not match given version `0.21.post1`
```
2024-04-10 05:13:33 +00:00
Charlie Marsh
7ae06b3b46
Surface invalid metadata as hints in error reports (#2850)
## Summary

Closes #2847.
2024-04-09 23:12:10 -04:00
Charlie Marsh
f9c0632953
Ignore direct URL distributions in prefetcher (#2943)
## Summary

The prefetcher tallies the number of times we tried a given package, and
then once we hit a threshold, grabs the version map, assuming it's
already been fetched. For direct URL distributions, though, we don't
have a version map! And there's no need to prefetch.

Closes https://github.com/astral-sh/uv/issues/2941.
2024-04-09 14:09:41 -05:00
Charlie Marsh
13ae5ac8dc
Replace PyPI-internal Hashes representation with flat vector (#2925)
## Summary

Right now, we have a `Hashes` representation that looks like:

```rust
/// A dictionary mapping a hash name to a hex encoded digest of the file.
///
/// PEP 691 says multiple hashes can be included and the interpretation is left to the client.
#[derive(Debug, Clone, Eq, PartialEq, Default, Deserialize)]
pub struct Hashes {
    pub md5: Option<Box<str>>,
    pub sha256: Option<Box<str>>,
    pub sha384: Option<Box<str>>,
    pub sha512: Option<Box<str>>,
}
```

It stems from the PyPI API, which returns a dictionary of hashes.

We tend to pass these around as a vector of `Vec<Hashes>`. But it's a
bit strange because each entry in that vector could contain multiple
hashes. And it makes it difficult to ask questions like "Is
`sha256:ab21378ca980a8` in the set of hashes"?

This PR instead treats `Hashes` as the PyPI-internal type, and uses a
new `Vec<HashDigest>` everywhere in our own APIs.
2024-04-09 16:56:16 +00:00
Zanie Blue
1512e07a2e
Split configuration options out of uv-types (#2924)
Needed to prevent circular dependencies in my toolchain work (#2931). I
think this is probably a reasonable change as we move towards persistent
configuration too?

Unfortunately `BuildIsolation` needs to be in `uv-types` to avoid
circular dependencies still. We might be able to resolve that in the
future.
2024-04-09 11:35:53 -05:00
konsti
fb4ba2bbc2
Speed up cold cache urllib3/boto3/botocore with batched prefetching (#2452)
With pubgrub being fast for complex ranges, we can now compute the next
n candidates without taking a performance hit. This speeds up cold cache
`urllib3<1.25.4` `boto3` from maybe 40s - 50s to ~2s. See docstrings for
details on the heuristics.

**Before**


![image](b7b06950-e45b-4c49-b65e-ae19fe9888cc)

**After**


![image](1c749248-850e-49c1-9d57-a7d78f87b3aa)

---

We need two parts of the prefetching, first looking for compatible
version and then falling back to flat next versions. After we selected a
boto3 version, there is only one compatible botocore version remaining,
so when won't find other compatible candidates for prefetching. We see
this as a pattern where we only prefetch boto3 (stack bars), but not
botocore (sequential requests between the stacked bars).


![image](e5186800-23ac-4ed1-99b9-4d1046fbd03a)

The risk is that we're completely wrong with the guess and cause a lot
of useless network requests. I think this is acceptable since this
mechanism only triggers when we're already on the bad path and we should
simply have fetched all versions after some seconds (assuming a fast
index like pypi).

---

It would be even better if the pubgrub state was copy-on-write so we
could simulate more progress than we actually have; currently we're
guessing what the next version is which could be completely wrong, but i
think this is still a valuable heuristic.

Fixes #170.
2024-04-08 14:28:56 +00:00
Charlie Marsh
35940cb885
Remove additional 'because' (#2849)
## Summary

Is this, perhaps, not totally necessary? It doesn't show up in any
fixtures beyond those that I added recently.

Closes https://github.com/astral-sh/uv/issues/2846.
2024-04-06 10:58:23 -04:00
Charlie Marsh
00934044aa
Backtrack on distributions with invalid metadata (#2834)
## Summary

Closes https://github.com/astral-sh/uv/issues/2821.
2024-04-05 18:00:48 -04:00
konsti
5474c61fed
Refactor candidate selector for batch prefetching (#2832)
Batch prefetching needs more information from the candidate selector, so
i've split `select` into methods. Split out from #2452. No functional
changes.
2024-04-05 16:51:01 +02:00
Charlie Marsh
34341bd6e9
Allow package lookups across multiple indexes via explicit opt-in (#2815)
## Summary

This partially revives https://github.com/astral-sh/uv/pull/2135 (with
some modifications) to enable users to opt-in to looking for packages
across multiple indexes.

The behavior is such that, in version selection, we take _any_
compatible version from a "higher-priority" index over the compatible
versions of a "lower-priority" index, even if that means we might accept
an "older" version.

Closes https://github.com/astral-sh/uv/issues/2775.
2024-04-03 23:23:37 +00:00
Charlie Marsh
189d0d41d0
Remove redirects from the resolver (#2792)
## Summary

Rather than storing the `redirects` on the resolver, this PR just
re-uses the "convert this URL to precise" logic when we convert to a
`Resolution` after-the-fact. I think this is a lot simpler: it removes
state from the resolver, and simplifies a lot of the hooks around
distribution fetching (e.g., `get_or_build_wheel_metadata` no longer
returns `(Metadata23, Option<Url>)`).
2024-04-03 02:43:57 +00:00
Charlie Marsh
c30a65ee0c
Allow conflicting Git URLs that refer to the same commit SHA (#2769)
## Summary

This PR leverages our lookahead direct URL resolution to significantly
improve the range of Git URLs that we can accept (e.g., if a user
provides the same requirement, once as a direct dependency, and once as
a tag). We did some of this in #2285, but the solution here is more
general and works for arbitrary transitive URLs.

Closes https://github.com/astral-sh/uv/issues/2614.
2024-04-02 23:36:35 +00:00
Zanie Blue
1ac9672b95
Resolve non-determistic behavior in preferences due to site-packages ordering (#2780)
Originally a regression test for #2779 but we found out that there's
some weird behavior where different `anyio` versions were preferred
based on the platform.
2024-04-02 13:48:33 -05:00