Commit graph

138 commits

Author SHA1 Message Date
Charlie Marsh
1a60368ce4
Use PubGrubPython type in Python incompatibility reporting (#3992)
## Summary

Rather than re-testing compatibility, I think we can just rely on the
types directly.
2024-06-03 14:32:22 -04:00
Charlie Marsh
c500b78936
Avoid re-adding solutions to forked state (#3967)
## Summary

Running a resolution that required forking was failing because it broke
an invariant in PubGrub: it looks like we were adding the same
incompatibility multiple times. The issue appears to be that when
forking, we modify the current state, then clone it as the "next state",
then push to the "forked states" -- but that means we're cloning the
_modified_ state.

This PR changes the order of operations such that we clone, then modify.
It shouldn't introduce any additional clones though.
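
As a rough sketch of the reordering (using a stand-in `SolveState`, not
uv's actual types):

```rust
// Stand-in type for illustration; uv's real state holds much more.
#[derive(Clone, Default)]
struct SolveState {
    incompatibilities: Vec<String>,
}

impl SolveState {
    fn add_incompatibility(&mut self, incompatibility: &str) {
        self.incompatibilities.push(incompatibility.to_string());
    }
}

/// Fork the current state: clone *before* modifying, so the "next" state
/// doesn't inherit a mutation intended only for the current fork.
fn fork(mut current: SolveState, a: &str, b: &str) -> (SolveState, SolveState) {
    let mut next = current.clone();
    current.add_incompatibility(a);
    next.add_incompatibility(b);
    (current, next)
}
```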
2024-06-02 17:58:25 -04:00
Charlie Marsh
11324646cb
Remove some anyhow usages (#3962) 2024-06-01 20:11:23 +00:00
Charlie Marsh
b7d77c04cc
Add Git resolver in lieu of static hash map (#3954)
## Summary

This PR removes the static resolver map:

```rust
static RESOLVED_GIT_REFS: Lazy<Mutex<FxHashMap<RepositoryReference, GitSha>>> =
    Lazy::new(Mutex::default);
```

replacing it with a `GitResolver` struct that we now pass around on the
`BuildContext`. There should be no behavior changes here; it's purely an
internal refactor with an eye towards making it cleaner for us to
"pre-populate" the list of resolved SHAs.
2024-05-31 22:44:42 -04:00
konsti
081f20c53e
Add support for tool.uv into distribution building (#3904)
With the change, we remove the special casing of workspace dependencies
and resolve `tool.uv` for all git and directory distributions. This
gives us support for non-editable workspace dependencies and path
dependencies in other workspaces. It removes a lot of special casing
around workspaces. These changes are the groundwork for supporting
`tool.uv` with dynamic metadata.

The basis for this change is moving `Requirement` from
`distribution-types` to `pypi-types` and the lowering logic from
`uv-requirements` to `uv-distribution`. These changes should be split
out into separate PRs.

I've included an example workspace `albatross-root-workspace2` where
`bird-feeder` depends on `a` from another workspace `ab`. There are a
number of failing tests and regressed error messages that still need
fixing. It does fix the audited package count for the workspace tests.
2024-05-31 02:42:03 +00:00
Andrew Gallant
d3b7d800ea uv-resolver: fix perf regression
We significantly regressed performance in some cases because we were
cloning the resolver state one more time than we needed to. That doesn't
sound like a lot, but in the case where there are no forks, it implies
we were cloning the state for every `get_dependencies` call when we
shouldn't have been cloning it at all.

Avoiding the clone results in somewhat tortured code. This can probably
be refactored by moving bits out to a helper routine, but that also
seemed non-trivial. So we let this suffice for now.
2024-05-30 14:23:14 -04:00
Andrew Gallant
17c043536b uv-resolver: thread markers through the resolver and into the lock file
This addresses the lack of marker support in prior commits.
Specifically, we add them as a new field to `AnnotatedDist`, and from
there, they get added to a `Distribution` in a `Lock`.
2024-05-30 14:23:14 -04:00
Andrew Gallant
f865406ab4 uv-resolver: implement merging of forked resolutions
This commit is a pretty invasive change that implements the merging
of resolutions created by each fork of the resolver.

The main idea here is that each `SolveState` is converted into a
`Resolution` (a new type) and stored on the heap after its fork
completes. When all forks complete, they are all merged into a single
`Resolution`. This `Resolution` is then used to build a `ResolutionGraph`.

Construction of `ResolutionGraph` mostly stays the same (despite the
gnarly diff due to an indent change) with one exception: the code to
extract dependency edges out of PubGrub's state has been moved to
`SolveState::into_resolution`. The idea here is that once a fork
completes, we extract what we need from the PubGrub state and then
throw it away. We store these edges in our own intermediate type which
is then converted into petgraph edges in the `ResolutionGraph`
constructor.

One interesting change we make here is that our edge
data is now a `Version` instead of a `Range<Version>`. I don't think
`Range<Version>` was actually being used anywhere, so this seems okay?
In any case, I think `Version` here is correct because a resolution
corresponds to specific dependencies of each package. Moreover, I didn't
see an easy way to make things work with `Range<Version>`. Notably,
since we no longer have the guarantee that there is only one version of
each package, we need to use `(PackageName, Version)` instead of just
`PackageName` for inverted lookups in `ResolutionGraph::from_state`.

Finally, the main resolver loop itself is changed a bit to track all
forked resolutions and then merge them at the end.

Note that we don't really have any dealings with markers in this commit.
We'll get to that in a subsequent commit.
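
Schematically, the merge step might look like this (stand-in types, not
uv's actual definitions):

```rust
use std::collections::{BTreeMap, BTreeSet};

// Stand-in for the intermediate `Resolution` type described above.
#[derive(Default)]
struct Resolution {
    // Across forks, a package may resolve to several versions.
    packages: BTreeMap<String, BTreeSet<String>>,
    // Edges are keyed by (name, version) on both ends, since a name alone
    // is no longer a unique key.
    edges: BTreeSet<((String, String), (String, String))>,
}

impl Resolution {
    /// Union another fork's resolution into this one.
    fn union(mut self, other: Resolution) -> Resolution {
        for (package, versions) in other.packages {
            self.packages.entry(package).or_default().extend(versions);
        }
        self.edges.extend(other.edges);
        self
    }
}
```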
2024-05-30 14:23:14 -04:00
Andrew Gallant
9e977aa1be uv-resolver: slightly simplify ResolutionGraph::from_state
This changes the constructor to just take an `InMemoryIndex`
directly instead of the constituent parts. No real reason other
than it seems a little simpler.
2024-05-30 14:23:14 -04:00
Andrew Gallant
6f76a66510 uv-resolver: implement basic resolver forking
There are still some TODOs/FIXMEs here, but this represents a sizable
chunk of the resolver refactoring to enable forking. We don't do any
merging of resolutions yet, so crucially, this code is broken when no
marker environment is provided. But when a marker environment is
provided, this should behave the same as a non-forking resolver. In
particular, `get_dependencies_forking` is just `get_dependencies`
whenever there's a marker environment.
2024-05-30 14:23:14 -04:00
Charlie Marsh
4859a27948
Add extra dependency annotations to lockfile and sync commands (#3913)
## Summary

This PR adds extras to the lockfile, and enables users to selectively
sync extras in `uv sync` and `uv run`. The end result here was fairly
simple, though it required a few refactors to get here. The basic idea
is that `DistributionId` now includes `extra: Option<ExtraName>`, so we
effectively treat extras as separate packages. Generating the lockfile,
and generating the resolution from the lockfile, fall out of this
naturally with no special-casing or additional changes.
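
Illustratively (field names assumed, not uv's actual definition), the
identifier gains an extra slot, so `black` and `black[colorama]` become
distinct entries:

```rust
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct DistributionId {
    name: String,
    version: String,
    extra: Option<String>, // `None` for the base package
}
```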

The main downside here is that it bloats the lockfile significantly.
Specifically:

- We include _all_ distribution URLs and hashes for _every_ extra
variant.
- We include all dependencies for the extra variant, even those that
are dependencies of the base package.

We could normalize this representation by changing each distribution to
have an `optional-dependencies` hash map that keys on extras, but we
actually don't have the information we need to create that right now
(specifically, we can't differentiate between dependencies that
_require_ the extra and dependencies on the base package).

Closes #3700.
2024-05-29 19:25:58 +00:00
Charlie Marsh
1fc6a59707
Remove special-casing for editable requirements (#3869)
## Summary

There are a few behavior changes in here:

- We now enforce `--require-hashes` for editables, like pip. So if you
use `--require-hashes` with an editable requirement, we'll reject it. I
could change this if it seems off.
- We now treat source tree requirements, editable or not (e.g., both `-e
./black` and `./black`) as if `--refresh` is always enabled. This
doesn't mean that we _always_ rebuild them; but if you pass
`--reinstall`, then yes, we always rebuild them. I think this is an
improvement and is close to how editables work today.

Closes #3844.

Closes #2695.
2024-05-28 15:49:34 +00:00
Charlie Marsh
14fa49b7ba
Move availability enums into their own module (#3858) 2024-05-27 00:12:53 -04:00
konsti
4db468e27f
Use VerbatimParsedUrl in pep508_rs (#3758)
When parsing requirements from any source, directly parse the url parts
(and reject unsupported urls) instead of parsing url parts at a later
stage. This removes a bunch of error branches and concludes the work of
parsing URL parts once and passing them around everywhere.

Many usages of the assembled `VerbatimUrl` remain, but these can be
removed incrementally.

Please review commit-by-commit.
2024-05-23 19:52:47 +00:00
konsti
76418f5bdf
Arc-wrap PubGrubPackage for cheap cloning in pubgrub (#3688)
Pubgrub stores incompatibilities as (package name, version range)
tuples, meaning it needs to clone the package name for each
incompatibility, and for each non-borrowed operation on incompatibilities.
https://github.com/astral-sh/uv/pull/3673 made me realize that
`PubGrubPackage` has gotten large (expensive to copy), so like `Version`
and other structs, I've added an `Arc` wrapper around it.

It's a pity clippy forbids `.deref()`, it's less opaque than `&**` and
has IDE support (clicking on `.deref()` jumps to the right impl).
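
The wrapper is roughly this shape (a sketch; the real inner enum has
more variants and the real type derives more traits):

```rust
use std::ops::Deref;
use std::sync::Arc;

// Stand-in for the inner enum, which is large and expensive to copy.
enum PubGrubPackageInner {
    Root,
    Package { name: String },
    Extra { name: String, extra: String },
}

// Cloning a `PubGrubPackage` now just bumps a refcount.
#[derive(Clone)]
struct PubGrubPackage(Arc<PubGrubPackageInner>);

impl Deref for PubGrubPackage {
    type Target = PubGrubPackageInner;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}
```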

## Benchmarks

It looks like this matters most for complex resolutions, which, I
assume, is because they carry larger `PubGrubPackageInner::Package` and
`PubGrubPackageInner::Extra` types.

```bash
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/jupyter.in" "./uv-branch pip compile -q ./scripts/requirements/jupyter.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/airflow.in" "./uv-branch pip compile -q ./scripts/requirements/airflow.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/boto3.in" "./uv-branch pip compile -q ./scripts/requirements/boto3.in"
```

```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/jupyter.in
  Time (mean ± σ):      18.2 ms ±   1.6 ms    [User: 14.4 ms, System: 26.0 ms]
  Range (min … max):    15.8 ms …  22.5 ms    181 runs

Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/jupyter.in
  Time (mean ± σ):      17.8 ms ±   1.4 ms    [User: 14.4 ms, System: 25.3 ms]
  Range (min … max):    15.4 ms …  23.1 ms    159 runs

Summary
  ./uv-branch pip compile -q ./scripts/requirements/jupyter.in ran
    1.02 ± 0.12 times faster than ./uv-main pip compile -q ./scripts/requirements/jupyter.in
```

```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/airflow.in
  Time (mean ± σ):     153.7 ms ±   3.5 ms    [User: 165.2 ms, System: 157.6 ms]
  Range (min … max):   150.4 ms … 163.0 ms    19 runs

Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/airflow.in
  Time (mean ± σ):     123.9 ms ±   4.6 ms    [User: 152.4 ms, System: 133.8 ms]
  Range (min … max):   118.4 ms … 138.1 ms    24 runs

Summary
  ./uv-branch pip compile -q ./scripts/requirements/airflow.in ran
    1.24 ± 0.05 times faster than ./uv-main pip compile -q ./scripts/requirements/airflow.in
```

```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/boto3.in
  Time (mean ± σ):     327.0 ms ±   3.8 ms    [User: 344.5 ms, System: 71.6 ms]
  Range (min … max):   322.7 ms … 334.6 ms    10 runs

Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/boto3.in
  Time (mean ± σ):     311.2 ms ±   3.1 ms    [User: 339.3 ms, System: 63.1 ms]
  Range (min … max):   307.8 ms … 317.0 ms    10 runs

Summary
  ./uv-branch pip compile -q ./scripts/requirements/boto3.in ran
    1.05 ± 0.02 times faster than ./uv-main pip compile -q ./scripts/requirements/boto3.in
```

2024-05-21 13:49:35 +02:00
Andrew Gallant
776a7e47f3 uv-resolver: add Option<MarkerTree> to PubGrubPackage
This just adds the field to the type and always sets it to `None`. There
are no semantic changes in this commit.

Closes #3359
2024-05-20 19:56:24 -04:00
Andrew Gallant
eac8221718 uv-resolver: use named fields for some PubGrubPackage variants
I'm planning to add another field here (markers), which puts a lot of
stress on the positional approach. So let's just switch over to named
fields.
2024-05-20 19:56:24 -04:00
konsti
95c9621541
Refactor editables for supporting them in bluejay commands (#3639)
This is split out from workspaces support, which needs editables in the
bluejay commands. It consists mainly of refactorings:

* Move the `editable` module one level up.
* Introduce a `BuiltEditableMetadata` type for `(LocalEditable,
Metadata23, Requirements)`.
* Add editables to `InstalledPackagesProvider` so we can use
`EmptyInstalledPackages` for them.
2024-05-20 16:22:12 +00:00
Charlie Marsh
657eebd50b
Remove SourceDistFilename from RegistrySourceDist (#3650)
## Summary

Uncertain about this, but we don't actually need the full
`SourceDistFilename`, only the name and version -- and we often have
that information already (as in the lockfile routines). So by flattening
the fields onto `RegistrySourceDist`, we can avoid re-parsing for
information we already have.
2024-05-20 09:25:23 -04:00
Ibraheem Ahmed
0f67a6ceea
Use FxHasher in resolver (#3641)
## Summary

We can use `FxHasher` in a few more places for string and version keys.
This gives a consistent ~2% improvement to warm resolves.
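
For illustration, the swap is a drop-in replacement (FxHash trades DoS
resistance, which internal keys don't need, for speed):

```rust
use rustc_hash::FxHashMap;

fn main() {
    // Same API as std's HashMap, but backed by the much cheaper FxHasher.
    let mut versions: FxHashMap<String, Vec<String>> = FxHashMap::default();
    versions
        .entry("anyio".to_string())
        .or_default()
        .push("4.3.0".to_string());
    assert_eq!(versions["anyio"], ["4.3.0"]);
}
```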
2024-05-17 15:04:22 -04:00
Ibraheem Ahmed
39af09f09b
Parallelize resolver (#3627)
## Summary

This PR introduces parallelism to the resolver. Specifically, we can
perform PubGrub resolution on a separate thread, while keeping all I/O
on the tokio thread. We already have the infrastructure set up for this
with the channel and `OnceMap`, which makes this change relatively
simple. The big change needed to make this possible is removing the
lifetimes on some of the types that need to be shared between the
resolver and pubgrub thread.
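
A bare-bones sketch of that split (stand-in request type; the real code
fulfills responses through a `OnceMap`):

```rust
// CPU-bound PubGrub solving on a dedicated thread, network I/O on the
// tokio runtime, with a channel carrying requests between them.
use tokio::sync::mpsc;

#[derive(Debug)]
struct MetadataRequest {
    package: String,
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::unbounded_channel::<MetadataRequest>();

    // The solver thread issues requests as PubGrub asks for dependencies.
    let solver = std::thread::spawn(move || {
        for package in ["anyio", "idna", "sniffio"] {
            tx.send(MetadataRequest { package: package.to_string() })
                .expect("I/O task hung up");
        }
        // ...the real resolution loop would run PubGrub here...
    });

    // The async side services each request without blocking the solver.
    while let Some(request) = rx.recv().await {
        println!("fetching metadata for {}", request.package);
    }

    solver.join().expect("solver thread panicked");
}
```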

A related PR, https://github.com/astral-sh/uv/pull/1163, found that
adding `yield_now` calls improved throughput. With optimal scheduling we
might be able to get away with everything on the same thread here.
However, in the ideal pipeline with perfect prefetching, the resolution
and prefetching can run completely in parallel without depending on one
another. While this would be very difficult to achieve, even with our
current prefetching pattern we see a consistent performance improvement
from parallelism.

This does also require reverting a few of the changes from
https://github.com/astral-sh/uv/pull/3413, but not all of them. The
sharing is isolated to the resolver task.

## Test Plan

On smaller tasks performance is mixed with ~2% improvements/regressions
on both sides. However, on medium-large resolution tasks we see the
benefits of parallelism, with improvements anywhere from 10-50%.

```
./scripts/requirements/jupyter.in
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):      29.2 ms ±   1.8 ms    [User: 20.3 ms, System: 29.8 ms]
  Range (min … max):    26.4 ms …  36.0 ms    91 runs
 
Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):      25.5 ms ±   1.0 ms    [User: 19.5 ms, System: 25.5 ms]
  Range (min … max):    23.6 ms …  27.8 ms    99 runs
 
Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.15 ± 0.08 times faster than ./target/profiling/baseline (resolve-warm)
```
```
./scripts/requirements/boto3.in   
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):     487.1 ms ±   6.2 ms    [User: 464.6 ms, System: 61.6 ms]
  Range (min … max):   480.0 ms … 497.3 ms    10 runs
 
Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):     430.8 ms ±   9.3 ms    [User: 529.0 ms, System: 77.2 ms]
  Range (min … max):   417.1 ms … 442.5 ms    10 runs
 
Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.13 ± 0.03 times faster than ./target/profiling/baseline (resolve-warm)
```
```
./scripts/requirements/airflow.in 
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):     478.1 ms ±  18.8 ms    [User: 482.6 ms, System: 205.0 ms]
  Range (min … max):   454.7 ms … 508.9 ms    10 runs
 
Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):     308.7 ms ±  11.7 ms    [User: 428.5 ms, System: 209.5 ms]
  Range (min … max):   287.8 ms … 323.1 ms    10 runs
 
Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.55 ± 0.08 times faster than ./target/profiling/baseline (resolve-warm)
```
2024-05-17 11:47:30 -04:00
Andrew Gallant
018a7150d6
uv-distribution: include all wheels in distribution types (#3595)
Our current flow of data from "simple registry package" to "final
resolved distribution" goes through a number of types:

* `SimpleMetadata` is the API response from a registry that includes all
published versions for a package. Each version has an assortment of
metadata
associated with it.
* `VersionFiles` is the aforementioned metadata. It is split in two: a
group of files for source distributions and a group of files for wheels.
* `PrioritizedDist` collects a subset of the files from `VersionFiles`
to form a selection of the "best" sdist and the "best" wheel for the
current environment.
* `CompatibleDist` is created from a borrowed `PrioritizedDist` that,
perhaps among other things, encapsulates the decision of whether to pick
an sdist or a wheel. (This decision depends both on compatibility and
the action being performed. For example, when doing installation, a
`CompatibleDist` will sometimes select an sdist over a wheel.)
* `ResolvedDistRef` is like a `ResolvedDist`, but borrows a `Dist`.
* `ResolvedDist` is the almost-final-form of a distribution in a
resolution and is created from a `ResolvedDistRef`.
* `AnnotatedResolvedDist` is a new data type that is the actual final
form of a distribution that a universal lock file cares about. It
bundles a `ResolvedDist` with some metadata needed to generate a lock
file.

One of the requirements of a universal lock file is that we include all
wheels (and maybe all source distributions? but at least one if it's
present) associated with a distribution. But the above flow of data (in
the step from `VersionFiles` to `PrioritizedDist`) drops all wheels
except for the best one.

To remedy this, in this PR, we rejigger `PrioritizedDist`,
`CompatibleDist` and `ResolvedDistRef` so that all wheel data is
preserved. And when a `ResolvedDistRef` is finally turned into a
`ResolvedDist`, we copy all of the wheel data. And finally, we adjust
the `Lock` constructor to read this new data and include it in the lock
file. To make this work, we also modify `RegistryBuiltDist` so that it
can contain one or more wheels instead of just one.
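
Schematically, the shape change on `RegistryBuiltDist` is along these
lines (field names assumed):

```rust
// Stand-in for a registry wheel entry.
struct RegistryBuiltWheel {
    filename: String,
    url: String,
}

struct RegistryBuiltDist {
    // Previously a single "best" wheel; now every wheel for the version,
    // with an index remembering which is best for the current environment.
    wheels: Vec<RegistryBuiltWheel>,
    best_wheel_index: usize,
}

impl RegistryBuiltDist {
    fn best_wheel(&self) -> &RegistryBuiltWheel {
        &self.wheels[self.best_wheel_index]
    }
}
```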

One shortcoming here (called out in the code as a FIXME) is that if a
source distribution is selected as the "best" thing to use (perhaps
there are no compatible wheels), then the wheels won't end up in the
lock file. I plan to fix this in a follow-up PR.

We also aren't totally consistent on source distribution naming.
Sometimes we use `sdist`. Sometimes `source`. Sometimes `source_dist`.
I think it'd be nice to just use `sdist` everywhere, but I do prefer
the type names to be `SourceDist`. And sometimes you want function
names to match the type names (i.e., `from_source_dist`), which in turn
leads to an appearance of inconsistency. I'm open to ideas.

Closes #3351
2024-05-15 15:07:28 -04:00
Charlie Marsh
4a42730cae
Add hashes and versions to all distributions (#3589)
## Summary

In `ResolutionGraph::from_state`, we have mechanisms to grab the hashes
and metadata for all distributions -- but we then throw that information
away. This PR preserves it on a new `AnnotatedDist` (yikes, open to
suggestions) that wraps `ResolvedDist` and includes (1) the hashes
(computed or from the registry) and (2) the `Metadata23`, which lets us
extract the version.
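
Conceptually the wrapper is something like this (a sketch; field types
assumed):

```rust
// Stand-ins for the types named above.
struct ResolvedDist;
struct Metadata23 {
    version: String,
}
struct HashDigest(String);

// Bundles the resolved distribution with the data a lockfile needs.
struct AnnotatedDist {
    dist: ResolvedDist,
    hashes: Vec<HashDigest>, // computed, or reported by the registry
    metadata: Metadata23,    // lets us extract the version
}
```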

Closes https://github.com/astral-sh/uv/issues/3356.

Closes https://github.com/astral-sh/uv/issues/3357.
2024-05-14 23:07:24 +00:00
konsti
0010954ca7
Add parsed URL to PubGrubPackage (#3426)
Avoid reparsing urls by storing the parsed parts across resolution on
`PubGrubPackage`.

Part 1 of #3408
2024-05-14 00:55:21 +00:00
Charlie Marsh
8cec217eff
Avoid attempting to build editables when fetching metadata (#3563)
## Summary

If we see an editable as a dependency, we currently attempt to fetch its
metadata, when we shouldn't.

Closes https://github.com/astral-sh/uv/issues/3562.
2024-05-14 00:03:53 +00:00
Charlie Marsh
44363d25c2
Respect constraints on editable dependencies (#3554)
## Summary

Ensures that constraints are enforced for editable requirements.

Closes #3548.
2024-05-13 17:06:27 +00:00
Charlie Marsh
eb8e733790
Rename "constraints" to "dependencies" in resolver (#3552)
## Summary

It's confusing that we use `constraints` here because constraints mean
something else for us (e.g., `--constraint constraints.txt`). These are
really the dependencies of a given `PubGrubPackage` -- the type is even
called `PubGrubDependencies`.
2024-05-13 16:30:16 +00:00
Charlie Marsh
42c3bfa351
Make Directory its own distribution kind (#3519)
## Summary

I think this is overall a good change because it explicitly encodes (in
the type system) something that was previously implicit. I'm not a huge
fan of the names here, open to input.

It covers some of https://github.com/astral-sh/uv/issues/3506 but I
don't think it _closes_ it.
2024-05-13 10:03:14 -04:00
Dimitri Papadopoulos Orfanos
d2ee567fe7
Fix a few typos found by codespell (#3543)

## Summary

Just fix typos.

While `alpha-numeric` is not really a misspelling:
- it is missing from mainstream curated dictionaries, all of which
suggest `alphanumeric`;
- it is used less than `alphanumeric` (more than 10× less) according to
the Google [Ngram
Viewer](https://books.google.com/ngrams/graph?content=alpha-numeric%2Calphanumeric&year_start=1900&year_end=2019&corpus=en-2019);
- it is [missing from
SCOWL](http://app.aspell.net/lookup?dict=en_US-large;words=alpha-numeric).

## Test Plan

CI jobs.
2024-05-13 11:55:10 +00:00
Ibraheem Ahmed
783df8f657
Consolidate concurrency limits (#3493)
## Summary

This PR consolidates the concurrency limits used throughout `uv` and
exposes two limits, `UV_CONCURRENT_DOWNLOADS` and
`UV_CONCURRENT_BUILDS`, as environment variables.

Currently, `uv` has a number of concurrent streams that it buffers using
relatively arbitrary limits for backpressure. However, many of these
limits are conflated. We run a relatively small number of tasks overall
and should start most things as soon as possible. What we really want to
limit are three separate operations:
- File I/O. This is managed by tokio's blocking pool and we should not
really have to worry about it.
- Network I/O.
- Python build processes.

Because the current limits span a broad range of tasks, it's possible
that a limit meant for network I/O is occupied by tasks performing
builds, reading from the file system, or even waiting on a `OnceMap`. We
also don't limit build processes that end up being required to perform a
download. While this may not pose a performance problem because our
limits are relatively high, it does mean that the limits do not do what
we want, making it tricky to expose them to users
(https://github.com/astral-sh/uv/issues/1205,
https://github.com/astral-sh/uv/issues/3311).

After this change, the limits on network I/O and build processes are
centralized and managed by semaphores. All other tasks are unbuffered
(note that these tasks are still bounded, so backpressure should not be
a problem).
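
A minimal sketch of the centralized limits; only the two environment
variable names come from this PR, and the default values here are made
up:

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

struct Concurrency {
    downloads: Arc<Semaphore>,
    builds: Arc<Semaphore>,
}

impl Concurrency {
    fn from_env() -> Self {
        let limit = |var: &str, default: usize| {
            std::env::var(var)
                .ok()
                .and_then(|value| value.parse().ok())
                .unwrap_or(default)
        };
        Self {
            downloads: Arc::new(Semaphore::new(limit("UV_CONCURRENT_DOWNLOADS", 50))),
            builds: Arc::new(Semaphore::new(limit("UV_CONCURRENT_BUILDS", 16))),
        }
    }
}

// A download task then acquires a permit before touching the network:
//
//     let _permit = concurrency.downloads.acquire().await?;
//     // ...perform the download; the permit is released on drop...
```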
2024-05-10 12:43:08 -04:00
Andrew Gallant
ad01a768bc
uv-resolver: push resolver state to its own type (#3492)
This still keeps the resolver state on the stack, but it organizes it
into a more structured representation. This is a precursor to
implementing resolver forking, where we will ultimately put this state
on the heap. The idea is that this will let us maintain multiple
independent resolver states that will all produce their own resolution
(and potentially other forked states).

Closes #3354
2024-05-09 14:16:43 -04:00
Andrew Gallant
8b0fad3560 uv-resolver: make MarkerEnvironment optional
This commit touches a lot of code, but the conceptual change here is
pretty simple: make it so we can run the resolver without providing a
`MarkerEnvironment`. This also indicates that the resolver should run in
universal mode: the effect of a missing marker environment is that all
marker expressions referencing the marker environment evaluate to
`true`; in effect, they are ignored. (The only markers we evaluate in
that context are extras, which are the only markers that aren't
dependent on the environment.)
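
In effect (a sketch with stand-in types, not uv's actual code):

```rust
// Stand-ins; the real resolver uses pep508_rs's marker types.
struct MarkerEnvironment;
struct MarkerTree;

impl MarkerTree {
    fn evaluate(&self, _env: &MarkerEnvironment) -> bool {
        true // placeholder
    }
}

/// Without a marker environment, environment-dependent markers count as
/// satisfied -- i.e., ignored -- which is what makes the resolve universal.
fn marker_satisfied(marker: &MarkerTree, env: Option<&MarkerEnvironment>) -> bool {
    match env {
        Some(env) => marker.evaluate(env),
        None => true,
    }
}
```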

One interesting change here is that a `Resolver` no longer needs an
`Interpreter`. Previously, it had only been using it to construct a
`PythonRequirement`, by filling in the installed version from the
`Interpreter` state. But we now construct a `PythonRequirement`
explicitly since its `target` Python version should no longer be tied to
the `MarkerEnvironment`. (Currently, the marker environment is mutated
such that its `python_full_version` is derived from multiple sources,
including the CLI, which I found a touch confusing.)

The change in behavior can now be observed through the
`--unstable-uv-lock-file` flag. First, without it:

```
$ cat requirements.in
anyio>=4.3.0 ; sys_platform == "linux"
anyio<4 ; sys_platform == "darwin"
$ cargo run -qp uv -- pip compile -p3.10 requirements.in
anyio==4.3.0
exceptiongroup==1.2.1
    # via anyio
idna==3.7
    # via anyio
sniffio==1.3.1
    # via anyio
typing-extensions==4.11.0
    # via anyio
```

And now with it:

```
$ cargo run -qp uv -- pip compile -p3.10 requirements.in --unstable-uv-lock-file
  x No solution found when resolving dependencies:
  `-> Because you require anyio>=4.3.0 and anyio<4, we can conclude that the requirements are unsatisfiable.
```

This is expected at this point because the marker expressions are being
explicitly ignored, *and* there is no forking done yet to account for
the conflict.
2024-05-09 09:24:37 -04:00
konsti
1ad6aa8a23
Use generic pubgrub incompatibility reason (#3335)
PubGrub got a new feature where all unavailability carries a custom
reason, replacing the reasonless `UnavailableDependencies` and our
previous custom `String` type
(https://github.com/pubgrub-rs/pubgrub/pull/208). This PR introduces an
`UnavailableReason` that tracks whether an entire package is unusable or
just a specific version of it. The error messages now also track this
difference properly.

The pubgrub commit is our main rebased onto the merged
https://github.com/pubgrub-rs/pubgrub/pull/208; I'll push
`konsti/main-rebase-generic-reason` to `main` after checking for rebase
problems.
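
The distinction might be modeled along these lines (a guess at the
shape; payloads assumed to be human-readable reasons):

```rust
// Why a candidate can't be used: the whole package, or one version of it.
enum UnavailableReason {
    /// The entire package is unusable (e.g., the index returned no entries).
    Package(String),
    /// A specific version is unusable (e.g., no compatible distributions).
    Version(String),
}
```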
2024-05-08 08:40:15 +00:00
Ibraheem Ahmed
94cf604574
Remove unnecessary uses of DashMap and Arc (#3413)
## Summary

All of the resolver code is run on the main thread, so a lot of the
`Send` bounds and uses of `DashMap` and `Arc` are unnecessary. We could
also switch to using single-threaded versions of `Mutex` and `Notify` in
some places, but there isn't really a crate that provides those I would
be comfortable with using.

The `Arc` in `OnceMap` can't easily be removed because of the uv-auth
code which uses the
[reqwest-middleware](https://docs.rs/reqwest-middleware/latest/reqwest_middleware/trait.Middleware.html)
crate, which seems to add unnecessary `Send` bounds because of
`async-trait`. We could duplicate the code and create a `OnceMapLocal`
variant, but I don't feel that's worth it.
2024-05-06 22:30:43 -04:00
konsti
2c84af15b8
Rename distribution_types::VersionOrUrl to VersionOrUrlRef (#3254)
This is more consistent with the other `*Ref` types and reduces
confusion with the real `VersionOrUrl` type.
2024-05-06 14:15:56 -04:00
konsti
098944fc7d
Improve non-git error message (#3403)
The boxing changes are due to clippy.
2024-05-06 13:28:05 +02:00
konsti
9de49c8a60
Make pubgrub an allowed ident (#3399)
Followup to #3361, fix some backtick-quoting.
2024-05-06 09:10:37 +00:00
konsti
4f87edbe66
Add basic tool.uv.sources support (#3263)
## Introduction

PEP 621 is limited. Specifically, it lacks
* Relative path support
* Editable support
* Workspace support
* Index pinning or any sort of index specification

The semantics of URLs are a custom extension: PEP 440 does not specify
how to use git references or subdirectories; instead, pip has a custom
stringly-typed format. We need to somehow support these while still
staying compatible with PEP 621.

## `tool.uv.sources`

Drawing inspiration from cargo, poetry and rye, we add `tool.uv.sources`
or (for now stub only) `tool.uv.workspace`:

```toml
[project]
name = "albatross"
version = "0.1.0"
dependencies = [
  "tqdm >=4.66.2,<5",
  "torch ==2.2.2",
  "transformers[torch] >=4.39.3,<5",
  "importlib_metadata >=7.1.0,<8; python_version < '3.10'",
  "mollymawk ==0.1.0"
]

[tool.uv.sources]
tqdm = { git = "https://github.com/tqdm/tqdm", rev = "cc372d09dcd5a5eabdc6ed4cf365bdb0be004d44" }
importlib_metadata = { url = "https://github.com/python/importlib_metadata/archive/refs/tags/v7.1.0.zip" }
torch = { index = "torch-cu118" }
mollymawk = { workspace = true }

[tool.uv.workspace]
include = [
  "packages/mollymawk"
]

[tool.uv.indexes]
torch-cu118 = "https://download.pytorch.org/whl/cu118"
```

See `docs/specifying_dependencies.md` for a detailed explanation of the
format. The basic gist is that `project.dependencies` is what ends up on
pypi, while `tool.uv.sources` are your non-published additions. We do
support the full range or PEP 508, we just hide it in the docs and
prefer the exploded table for easier readability and less confusing with
actual url parts.

This format should eventually be able to subsume requirements.txt's
current use cases. While we will continue to support the legacy `uv pip`
interface, this is a piece of uv's own top-level interface. Together
with `uv run` and a lockfile format, you should only need to write
`pyproject.toml` and do `uv run`, which generates/uses/updates your
lockfile behind the scenes, no more pip-style requirements involved. It
also lays the groundwork for implementing index pinning.

## Changes

This PR implements:
* Reading and lowering `project.dependencies`,
`project.optional-dependencies` and `tool.uv.sources` into a new
requirements format, including:
  * Git dependencies
  * Url dependencies
  * Path dependencies, including relative and editable
* `pip install` integration
* Error reporting for invalid `tool.uv.sources`
* Json schema integration (works in pycharm, see below)
* Draft user-level docs (see `docs/specifying_dependencies.md`)

It does not implement:
* `pip compile` testing (deprioritized in favor of our own lockfile)
* Index pinning (stub definitions only)
* Development dependencies
* Workspace support (stub definitions only)
* Overrides in pyproject.toml
* Patching/replacing dependencies

One technically breaking change is that we now require user-provided
pyproject.toml files to be valid with respect to PEP 621. Included files
still fall back to PEP 517. That means `pip install -r pyproject.toml`
requires it to be valid, while `pip install -r requirements.txt` with
`-e .` as content falls back to PEP 517 as before.

## Implementation

The `pep508` requirement is replaced by a new `UvRequirement` (name up
for bikeshedding, not particularly attached to the uv prefix). The
still-existing `pep508_rs::Requirement` type uses a URL format copied
from pip's requirements.txt and doesn't appropriately capture all the
features we want/need to support. The bulk of the diff is changing the
requirement type throughout the codebase.
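
As a rough illustration of such a requirement type (names and fields
assumed, not the PR's actual definitions):

```rust
use std::path::PathBuf;

// Where a requirement comes from, beyond a plain version specifier.
enum RequirementSource {
    Registry { specifier: String, index: Option<String> },
    Git { repository: String, rev: Option<String>, subdirectory: Option<PathBuf> },
    Url { url: String },
    Path { path: PathBuf, editable: bool },
}

struct UvRequirement {
    name: String,
    extras: Vec<String>,
    marker: Option<String>,
    source: RequirementSource,
}
```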

We still use `VerbatimUrl` in many places, where we would expect a
parsed/decomposed url type, specifically:
* Reading core metadata, except for top-level pyproject.toml files; we
fail a step later instead if the url isn't supported.
* Allowed `Urls`.
* `PackageId` with a custom `CanonicalUrl` comparison, instead of
canonicalizing urls eagerly.
* `PubGrubPackage`: We eventually convert the `VerbatimUrl` back to a
`Dist` (`Dist::from_url`), instead of remembering the url.
* Source dist types: We use verbatim url even though we know and require
that these are supported urls we can and have parsed.

I tried to improve the situation by replacing `VerbatimUrl`, but these
changes would require massively invasive changes (see e.g.
https://github.com/astral-sh/uv/pull/3253). A main problem is the ref
`VersionOrUrl` and applying overrides, which assume the same
requirement/url type everywhere. In its current form, this PR increases
this tech debt.

I've tried to split off PRs and commits, but the main refactoring is
still a single monolith commit to make it compile and the tests pass.

## Demo

Adding
d1ae3b85d5/pyproject.json
as a JSON Schema (v7) to PyCharm for `pyproject.toml`, you can try the
IDE support already:


[Screenshot: pycharm]

[Video: dove.webm]
2024-05-03 21:10:50 +00:00
Andrew Gallant
0b84eb0140
once-map: avoid hard-coding Arc (#3242)
The only thing a `OnceMap` really needs to be able to do with the value
is to clone it. All extant uses benefited from having this done for them
by automatically wrapping values in an `Arc`. But this isn't necessarily
true for all things. For example, a value might have an `Arc` internally
to make cloning cheap in other contexts, and it doesn't make sense to
re-wrap it in an `Arc` just to use it with a `OnceMap`. Or
alternatively, cloning might just be cheap enough on its own that an
`Arc` isn't worth it.
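
The generalization amounts to storing `V: Clone` instead of `Arc<V>`; a
simplified, non-concurrent sketch:

```rust
use std::collections::HashMap;

// Simplified stand-in for OnceMap's completed-value storage.
struct OnceMap<K, V> {
    done: HashMap<K, V>,
}

impl<K: std::hash::Hash + Eq, V: Clone> OnceMap<K, V> {
    /// Callers receive a clone. Whether that clone is cheap is now the
    /// value type's concern (e.g., an internal `Arc`), not `OnceMap`'s.
    fn get(&self, key: &K) -> Option<V> {
        self.done.get(key).cloned()
    }
}
```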
2024-04-24 11:11:46 -04:00
Charlie Marsh
792a917a97
Restrict observed requirements to direct when --no-deps is specified (#3191)
## Summary

This PR avoids: (1) using the lookahead resolver when `--no-deps` is
specified (we'll never use those requirements), and (2) including any
transitive requirements when searching for allowed URLs, etc., when
`--no-deps` is specified.

Closes https://github.com/astral-sh/uv/issues/3183.
2024-04-22 17:17:58 +00:00
Charlie Marsh
a4f125ca34
Avoid waiting for metadata for --no-deps editables (#3188)
## Summary

We don't emit a request for this, so we shouldn't wait for it either --
we already have the metadata!

Closes https://github.com/astral-sh/uv/issues/3184.
2024-04-22 16:29:19 +00:00
Charlie Marsh
a241bc79b1
Add priorities for editables (#3133)
## Summary

We weren't setting a priority for editables, so they were being visited
last.

I think there's still a problem whereby we're not aggressive enough in
visiting recursive extras (and, in fact, that's making it really hard to
write a test -- I wrote a test, but the most-reduced case still fails,
and I'd need to add a layer of indirection to make it
fail-on-main-but-pass-on-this-branch), but that problem likely already
existed on main prior to #3087, so I just want to get this quick fix out
now.

Closes https://github.com/astral-sh/uv/issues/3127.

## Test Plan

- `git clone https://github.com/cda-tum/mqt-core.git`
- `cargo run venv`
- `cargo run pip install 'scikit-build-core[pyproject]>=0.8.1'
'setuptools_scm>=7' 'pybind11>=2.12' --resolution=lowest-direct`
- `cargo run pip install --no-build-isolation
'-ve.[test,qiskit,evaluation,coverage]' --resolution=lowest-direct`
2024-04-19 02:04:58 +00:00
Charlie Marsh
2e88bb6f1b
Add a proxy layer for extras (#3100)
Given requirements like:

```
black==23.1.0
black[colorama]
```

The resolver will (on `main`) add a dependency on Black, and then try to
use the most recent version of Black to satisfy `black[colorama]`. For
the sake of example, assume `black==24.0.0` is the most recent version.
Once the resolver selects this version, it'll fetch the metadata, then
return the dependencies for `black==24.0.0` with the `colorama` extra
enabled. Finally, it will tack on `black==24.0.0` (a dependency on the
base package). The resolver will then detect a conflict between
`black==23.1.0` and `black==24.0.0`, and throw out
`black[colorama]==24.0.0`, trying the next most-recent version.

This is both wasteful and can cause problems, since we're fetching
metadata for versions that will _never_ satisfy the resolver. In the
`apache-airflow[all]` case, I also ran into an issue whereby we were
attempting to build very old versions of `apache-airflow` due to
`apache-airflow[pandas]`, which in turn led to resolution failures.

The solution proposed here is that we create a new proxy package with
exactly two dependencies: one on `black` and one on `black[colorama]`.
Both of these packages must be at the same version as the proxy package,
so the resolver knows much _earlier_ that (in the above example) the
extra variant _must_ match `23.1.0`.
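
Schematically, the proxy expands into exactly two same-version
dependencies (a hypothetical helper for illustration):

```rust
/// For `base[extra]`, the proxy package at version `v` depends on exactly:
fn proxy_dependencies(base: &str, extra: &str, v: &str) -> [String; 2] {
    [
        format!("{base}=={v}"),          // the base package
        format!("{base}[{extra}]=={v}"), // the extra variant
    ]
}
```

So once `black` is pinned at `23.1.0`, the proxy immediately forces
`black[colorama]` to `23.1.0` as well, instead of fetching metadata for
`24.0.0` first.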
2024-04-19 01:04:59 +00:00
Charlie Marsh
b456fa2939
Incorporate heuristics to improve package prioritization (#3087)
See: https://github.com/astral-sh/uv/issues/3078
2024-04-17 14:21:42 +00:00
konsti
d1b07a3f49
Log versions tried from batch prefetch (#3090)
This is required for evaluating #3087.

This also removes tracking of virtual packages from extras from the
batch prefetcher (we only track real packages).

Let's look at some stats:
* jupyter: Tried 100 versions: anyio 1, argon2-cffi 1,
argon2-cffi-bindings 1, arrow 1, asttokens 1, async-lru 1, attrs 1,
babel 1, beautifulsoup4 1, bleach 1, certifi 1, cffi 1,
charset-normalizer 1, comm 1, debugpy 1, decorator 1, defusedxml 1,
exceptiongroup 1, executing 1, fastjsonschema 1, fqdn 1, h11 1, httpcore
1, httpx 1, idna 1, ipykernel 1, ipython 1, ipywidgets 1, isoduration 1,
jedi 1, jinja2 1, json5 1, jsonpointer 1, jsonschema 1,
jsonschema-specifications 1, jupyter 1, jupyter-client 1,
jupyter-console 1, jupyter-core 1, jupyter-events 1, jupyter-lsp 1,
jupyter-server 1, jupyter-server-terminals 1, jupyterlab 1,
jupyterlab-pygments 1, jupyterlab-server 1, jupyterlab-widgets 1,
markupsafe 1, matplotlib-inline 1, mistune 1, nbclient 1, nbconvert 1,
nbformat 1, nest-asyncio 1, notebook 1, notebook-shim 1, overrides 1,
packaging 1, pandocfilters 1, parso 1, pexpect 1, platformdirs 1,
prometheus-client 1, prompt-toolkit 1, psutil 1, ptyprocess 1, pure-eval
1, pycparser 1, pygments 1, python-dateutil 1, python-json-logger 1,
pyyaml 1, pyzmq 1, qtconsole 1, qtpy 1, referencing 1, requests 1,
rfc3339-validator 1, rfc3986-validator 1, root 1, rpds-py 1, send2trash
1, six 1, sniffio 1, soupsieve 1, stack-data 1, terminado 1, tinycss2 1,
tomli 1, tornado 1, traitlets 1, types-python-dateutil 1,
typing-extensions 1, uri-template 1, urllib3 1, wcwidth 1, webcolors 1,
webencodings 1, websocket-client 1, widgetsnbextension 1
* boto3: botocore 1697, boto3 849, urllib3 2, jmespath 1,
python-dateutil 1, root 1, s3transfer 1, six 1
* transformers-extras: Tried 1191 versions: sagemaker 152, hypothesis
67, tensorflow 21, jsonschema 19, tensorflow-cpu 18, multiprocess 10,
pathos 10, tensorflow-text 10, chex 8, tf-keras 8, tf2onnx 8, aiohttp 6,
aiosignal 6, alembic 6, annotated-types 6, apscheduler 6, attrs 6,
backoff 6, binaryornot 6, black 6, boto3 6, click 6, coloredlogs 6,
colorlog 6, dash 6, dash-bootstrap-components 6, dlinfo 6,
exceptiongroup 6, execnet 6, fire 6, frozenlist 6, gitdb 6, google-auth
6, google-auth-oauthlib 6, hjson 6, iniconfig 6, jinja2-time 6, markdown
6, markdown-it-py 6, markupsafe 6, mpmath 6, namex 6, nbformat 6, ninja
6, nvidia-nvjitlink-cu12 6, onnxconverter-common 6, pandas 6, plac 6,
platformdirs 6, pluggy 6, portalocker 6, poyo 6, protobuf3-to-dict 6,
py-cpuinfo 6, py3nvml 6, pyarrow 6, pyarrow-hotfix 6, pydantic-core 6,
pygments 6, pynvml 6, pypng 6, python-slugify 6, responses 6,
smdebug-rulesconfig 6, soupsieve 6, sqlalchemy 6,
tensorboard-data-server 6, tensorboard-plugin-wit 6, tensorboardx 6,
threadpoolctl 6, tomli 6, wasabi 6, wcwidth 6, werkzeug 6, wheel 6,
xxhash 6, zipp 6, etils 5, tensorboard 5, beautifulsoup4 4, cffi 4,
clldutils 4, codecarbon 4, datasets 4, dill 4, evaluate 4, gitpython 4,
hf-doc-builder 4, kenlm 4, librosa 4, llvmlite 4, nest-asyncio 4, nltk
4, optuna 4, parameterized 4, phonemizer 4, psutil 4, pyctcdecode 4,
pytest 4, pytest-timeout 4, pytest-xdist 4, ray 4, rjieba 4, rouge-score
4, ruff 4, sacrebleu 4, sacremoses 4, sigopt 4, sortedcontainers 4,
tensorstore 4, timeout-decorator 4, toolz 4, torchaudio 4, accelerate 3,
audioread 3, cookiecutter 3, decorator 3, deepspeed 3, faiss-cpu 3, flax
3, fugashi 3, ipadic 3, isort 3, jax 3, jaxlib 3, joblib 3, keras-nlp 3,
lazy-loader 3, numba 3, optax 3, pooch 3, pydantic 3, pygtrie 3, rhoknp
3, scikit-learn 3, segments 3, soundfile 3, soxr 3, sudachidict-core 3,
sudachipy 3, torch 3, unidic 3, unidic-lite 3, urllib3 3, absl-py 2,
arrow 2, astunparse 2, async-timeout 2, botocore 2, cachetools 2,
certifi 2, chardet 2, charset-normalizer 2, csvw 2, dash-core-components
2, dash-html-components 2, dash-table 2, diffusers 2, dm-tree 2,
fastjsonschema 2, flask 2, flatbuffers 2, fsspec 2, ftfy 2, gast 2,
google-pasta 2, greenlet 2, grpcio 2, h5py 2, humanfriendly 2, idna 2,
importlib-metadata 2, importlib-resources 2, jinja2 2, jmespath 2,
jupyter-core 2, kagglehub 2, keras 2, keras-core 2, keras-preprocessing
2, libclang 2, mako 2, mdurl 2, ml-dtypes 2, msgpack 2, multidict 2,
mypy-extensions 2, networkx 2, nvidia-cublas-cu12 2,
nvidia-cuda-cupti-cu12 2, nvidia-cuda-nvrtc-cu12 2,
nvidia-cuda-runtime-cu12 2, nvidia-cudnn-cu12 2, nvidia-cufft-cu12 2,
nvidia-curand-cu12 2, nvidia-cusolver-cu12 2, nvidia-cusparse-cu12 2,
nvidia-nccl-cu12 2, nvidia-nvtx-cu12 2, onnx 2, onnxruntime 2,
onnxruntime-tools 2, opencv-python 2, opt-einsum 2, orbax-checkpoint 2,
pathspec 2, plotly 2, pox 2, ppft 2, pyasn1-modules 2, pycparser 2,
pyrsistent 2, python-dateutil 2, pytz 2, requests-oauthlib 2, retrying
2, rich 2, rsa 2, s3transfer 2, scipy 2, setuptools 2, six 2, smmap 2,
sympy 2, tabulate 2, tensorflow-estimator 2, tensorflow-hub 2,
tensorflow-io-gcs-filesystem 2, termcolor 2, text-unidecode 2, traitlets
2, triton 2, typing-extensions 2, tzdata 2, tzlocal 2, wrapt 2,
xmltodict 2, yarl 2, Python 1, av 1, babel 1, bibtexparser 1, blinker 1,
colorama 1, decord 1, filelock 1, huggingface-hub 1, isodate 1,
itsdangerous 1, language-tags 1, lxml 1, numpy 1, oauthlib 1, packaging
1, pillow 1, protobuf 1, pyasn1 1, pylatexenc 1, pyparsing 1, pyyaml 1,
rdflib 1, regex 1, requests 1, rfc3986 1, root 1, safetensors 1,
sentencepiece 1, tenacity 1, timm 1, tokenizers 1, torchvision 1, tqdm
1, transformers 1, types-python-dateutil 1, uritemplate 1


You can reproduce them with python 3.10 and:
```
RUST_LOG=uv_resolver=debug cargo run pip compile -o /dev/null -q scripts/requirements/<input>.in 2>&1 | tail -n 1
```

Closes #2270. This is less invasive than the other PR; we can revisit
tracking the number of network/build requests later.
2024-04-17 09:08:21 +00:00
Charlie Marsh
96c3c2e774
Support unnamed requirements in --require-hashes (#2993)
## Summary

This PR enables `--require-hashes` with unnamed requirements. The key
change is that `PackageId` becomes `VersionId` (since it refers to a
package at a specific version), and the new `PackageId` consists of
_either_ a package name _or_ a URL. The hashes are keyed by `PackageId`,
so we can generate the `RequiredHashes` before we have names for all
packages, and enforce them throughout.
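
The split might look like this (a sketch; definitions assumed):

```rust
// Identifies a package: by name, or -- for unnamed requirements -- by URL.
// Hashes are keyed by this, so they can be collected before every package
// has a name.
enum PackageId {
    Name(String),
    Url(String),
}

// Identifies a package at a specific version (the old `PackageId`).
struct VersionId {
    name: String,
    version: String,
}
```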

Closes #2979.
2024-04-11 11:26:50 -04:00
Charlie Marsh
006379c50c
Add support for URL requirements in --generate-hashes (#2952)
## Summary

This PR enables hash generation for URL requirements when the user
provides `--generate-hashes` to `pip compile`. While we include the
hashes from the registry already, today, we omit hashes for URLs.

To power hash generation, we introduce a `HashPolicy` abstraction:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HashPolicy<'a> {
    /// No hash policy is specified.
    None,
    /// Hashes should be generated (specifically, a SHA-256 hash), but not validated.
    Generate,
    /// Hashes should be validated against a pre-defined list of hashes. If necessary, hashes should
    /// be generated so as to ensure that the archive is valid.
    Validate(&'a [HashDigest]),
}
```

All of the methods on the distribution database now accept this policy,
instead of accepting `&'a [HashDigest]`.

Closes #2378.
2024-04-10 20:02:45 +00:00
Charlie Marsh
8513d603b4
Return computed hashes from metadata requests (#2951)
## Summary

This PR modifies the distribution database to return both the
`Metadata23` and the computed hashes when clients request metadata.

No behavior changes, but this will be necessary to power
`--generate-hashes`.
2024-04-10 19:31:41 +00:00
Charlie Marsh
1f3b5bb093
Add hash-checking support to install and sync (#2945)
## Summary

This PR adds support for hash-checking mode in `pip install` and `pip
sync`. It's a large change, both in terms of the size of the diff and
the modifications in behavior, but it's also one that's hard to merge in
pieces (at least, with any test coverage) since it needs to work
end-to-end to be useful and testable.

Here are some of the most important highlights:

- We store hashes in the cache. Where we previously stored pointers to
unzipped wheels in the `archives` directory, we now store pointers with
a set of known hashes. So every pointer to an unzipped wheel also
includes its known hashes (see the sketch after this list).
- By default, we don't compute any hashes. If the user runs with
`--require-hashes`, and the cache doesn't contain those hashes, we
invalidate the cache, redownload the wheel, and compute the hashes as we
go. For users that don't run with `--require-hashes`, there will be no
change in performance. For users that _do_, the only change will be if
they don't run with `--generate-hashes` -- then they may see some
repeated work between resolution and installation, if they use `pip
compile` then `pip sync`.
- Many of the distribution types now include a `hashes` field, like
`CachedDist` and `LocalWheel`.
- Our behavior is similar to pip, in that we enforce hashes when pulling
any remote distributions, and when pulling from our own cache. Like pip,
though, we _don't_ enforce hashes if a distribution is _already_
installed.
- Hash validity is enforced in a few different places:
1. During resolution, we enforce hash validity based on the hashes
reported by the registry. If we need to access a source distribution,
though, we then enforce hash validity at that point too, prior to
running any untrusted code. (This is enforced in the distribution
database.)
2. In the install plan, we _only_ add cached distributions that have
matching hashes. If a cached distribution is missing any hashes, or the
hashes don't match, we don't return them from the install plan.
3. In the downloader, we _only_ return distributions with matching
hashes.
4. The final combination of "things we install" are: (1) the wheels from
the cache, and (2) the downloaded wheels. So this ensures that we never
install any mismatching distributions.
- Like pip, if `--require-hashes` is provided, we require that _all_
distributions are pinned with either `==` or a direct URL. We also
require that _all_ distributions have hashes.
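
As referenced in the first highlight, a sketch of the cache-pointer idea
(field names assumed, matching logic simplified):

```rust
use std::path::PathBuf;

// A pointer to an unzipped wheel in the `archives` directory, now paired
// with the hashes known when it was cached.
struct CachedArchivePointer {
    archive: PathBuf,
    hashes: Vec<String>, // e.g., "sha256:..." digests
}

impl CachedArchivePointer {
    /// Reuse the cached wheel only if its known hashes match the required
    /// set; otherwise the entry is invalidated and the wheel redownloaded.
    fn satisfies(&self, required: &[String]) -> bool {
        required.iter().any(|hash| self.hashes.contains(hash))
    }
}
```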

There are a few notable TODOs:

- We don't support hash-checking mode for unnamed requirements. These
should be _somewhat_ rare, though? Since `pip compile` never outputs
unnamed requirements. I can fix this, it's just some additional work.
- We don't automatically enable `--require-hashes` when a hash exists in
the requirements file. We require `--require-hashes` to be passed
explicitly.

Closes #474.

## Test Plan

I'd like to add some tests for registries that report incorrect hashes,
but otherwise: `cargo test`
2024-04-10 19:09:03 +00:00
Charlie Marsh
48ba7df98a
Move FlatIndex into the uv-resolver crate (#2972)
## Summary

This lets us remove circular dependencies (in the future, e.g., #2945)
that arise from `FlatIndex` needing a bunch of resolver-specific
abstractions (like incompatibilities, required hashes, etc.) that aren't
necessary to _fetch_ the flat index entries.
2024-04-10 14:38:42 -04:00