## Summary
These were moved as part of a broader refactor to create a single
integration test module. That "single integration test module" did
indeed have a big impact on compile times, which is great! But we aren't
seeing any benefit from moving these tests into their own files (despite
the claim in [this blog
post](https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html),
I see the same compilation pattern regardless of where the tests are
located). Plus, we don't have many of these, and same-file tests is such
a strong Rust convention.
## Summary
This is part of making
https://github.com/astral-sh/uv/issues/7299#issuecomment-2385286341
better. You can now use `tool.uv.dependency-metadata` for direct URL
requirements. Unfortunately, you _must_ include a version, since we need
one to perform resolution.
## Summary
Cache the path to git executable in a `LazyLock` and reuse it throughout
the process. This might reduce some costs on finding the git executable.
## Summary
This PR declares and documents all environment variables that are used
in one way or another in `uv`, either internally, or externally, or
transitively under a common struct.
I think over time as uv has grown there's been many environment
variables introduced. Its harder to know which ones exists, which ones
are missing, what they're used for, or where are they used across the
code. The docs only documents a handful of them, for others you'd have
to dive into the code and inspect across crates to know which crates
they're used on or where they're relevant.
This PR is a starting attempt to unify them, make it easier to discover
which ones we have, and maybe unlock future posibilities in automating
generating documentation for them.
I think we can split out into multiple structs later to better organize,
but given the high influx of PR's and possibly new environment variables
introduced/re-used, it would be hard to try to organize them all now
into their proper namespaced struct while this is all happening given
merge conflicts and/or keeping up to date.
I don't think this has any impact on performance as they all should
still be inlined, although it may affect local build times on changes to
the environment vars as more crates would likely need a rebuild. Lastly,
some of them are declared but not used in the code, for example those in
`build.rs`. I left them declared because I still think it's useful to at
least have a reference.
Did I miss any? Are their initial docs cohesive?
Note, `uv-static` is a terrible name for a new crate, thoughts? Others
considered `uv-vars`, `uv-consts`.
## Test Plan
Existing tests
As per
https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html
Before that, there were 91 separate integration tests binary.
(As discussed on Discord — I've done the `uv` crate, there's still a few
more commits coming before this is mergeable, and I want to see how it
performs in CI and locally).
## Summary
When syncing a lockfile, we need to respect credentials defined in the
`pyproject.toml`, even if they won't be used for resolution.
Unfortunately, this includes credentials in `tool.uv.sources`,
`tool.uv.dev-dependencies`, `project.dependencies`, and
`project.optional-dependencies`.
Closes https://github.com/astral-sh/uv/issues/7453.
## Summary
This PR adds a more flexible cache invalidation abstraction for uv, and
uses that new abstraction to improve support for dynamic metadata.
Specifically, instead of relying solely on a timestamp, we now pass
around a `CacheInfo` struct which (as of now) contains
`Option<Timestamp>` and `Option<Commit>`. The `CacheInfo` is saved in
`dist-info` as `uv_cache.json`, so we can test already-installed
distributions for cache validity (along with testing _cached_
distributions for cache validity).
Beyond the defaults (`pyproject.toml`, `setup.py`, and `setup.cfg`
changes), users can also specify additional cache keys, and it's easy
for us to extend support in the future. Right now, cache keys can either
be instructions to include the current commit (for `setuptools_scm` and
similar) or file paths (for `hatch-requirements-txt` and similar):
```toml
[tool.uv]
cache-keys = [{ file = "requirements.txt" }, { git = true }]
```
This change should be fully backwards compatible.
Closes https://github.com/astral-sh/uv/issues/6964.
Closes https://github.com/astral-sh/uv/issues/6255.
Closes https://github.com/astral-sh/uv/issues/6860.
This is achieved by updating the `LockedFile::acquire` API to be async —
as in some cases we were attempting to acquire the lock synchronously,
i.e., without yielding, which blocked the runtime.
Closes https://github.com/astral-sh/uv/issues/6691 — I tested with the
reproduction there and a local release build and no longer reproduce the
deadlock with these changes.
Some additional context in the [internal Discord
thread](1278478941)
## Summary
The strategy here is: if the user provides supported environments, we
use those as the initial forks when resolving. As a result, we never add
or explore branches that are disjoint with the supported environments.
(If the supported environments change, we ignore the lockfile entirely,
so we don't have to worry about any interactions between supported
environments and the preference forks.)
Closes https://github.com/astral-sh/uv/issues/6184.
## Summary
We retain them if you use `--raw-sources`, but otherwise they're
removed. We still respect them in the subsequent `uv.lock` via an
in-process store.
Closes#6056.
## Summary
Whenever we call `resolve`, we immediately call `fetch` after. And in
some cases `resolve` actually calls `fetch` internally. It seems a lot
simpler to just merge these into one method that returns a `Fetch`
(which itself contains the fully-resolved URL).
Closes https://github.com/astral-sh/uv/issues/5876.
## Summary
Very subtle bug. The scenario is as follows:
- We resolve: `elmer-circuitbuilder = { git =
"https://github.com/ElmerCSC/elmer_circuitbuilder.git" }`
- The user then changes the request to: `elmer-circuitbuilder = { git =
"https://github.com/ElmerCSC/elmer_circuitbuilder.git", rev =
"44d2f4b19d6837ea990c16f494bdf7543d57483d" }`
- When we go to re-lock, we note two facts:
1. The "default branch" resolves to
`44d2f4b19d6837ea990c16f494bdf7543d57483d`.
2. The metadata for `44d2f4b19d6837ea990c16f494bdf7543d57483d` is
(whatever we grab from the lockfile).
- In the resolver, we then ask for the metadata for
`44d2f4b19d6837ea990c16f494bdf7543d57483d`. It's already in the cache,
so we return it; thus, we never add the
`44d2f4b19d6837ea990c16f494bdf7543d57483d` ->
`44d2f4b19d6837ea990c16f494bdf7543d57483d` mapping to the Git resolver,
because we never have to resolve it.
This would apply for any case in which a requested tag or branch was
replaced by its precise SHA. Replacing with a different commit is fine.
It only applied to `tool.uv.sources`, and not PEP 508 URLs, because the
underlying issue is that we aren't consistent about "automatically"
extracting the precise commit from a Git reference.
Closes https://github.com/astral-sh/uv/issues/5860.
## Summary
The current receipt doesn't capture quite enough information. For
example, it doesn't differentiate between editable and non-editable
requirements. This PR instead uses the full `Requirement` type. I think
we should use a custom representation like we do in the lockfile, but
I'm just using the default representation to demonstrate the idea.
Fixes a concurrency issue when multiple processes are installing the
same package in different virtual environments from Git ref (not a
specific Git commit).
## Symptoms
That's how some of symptoms looked like in our case:
```
DEBUG uv 0.2.21
DEBUG Checking for Python interpreter at path `/tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/37bf51bfba4699a940ce31349422b24a5bc55a2b179ed7aec74459a9ae8d57b7/bin/python`
DEBUG Using Python 3.12.4 environment at /tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/37bf51bfba4699a940ce31349422b24a5bc55a2b179ed7aec74459a9ae8d57b7/bin/python
DEBUG Acquired lock for `/tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/37bf51bfba4699a940ce31349422b24a5bc55a2b179ed7aec74459a9ae8d57b7`
DEBUG At least one requirement is not satisfied: torch
DEBUG Using request timeout of 300s
DEBUG Found 37 packages in `--find-links` entry: /tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/.cache/pip/wheels
DEBUG Updating git source `Url { scheme: "https", cannot_be_a_base: false, username: "***", password: None, host: Some(Domain("github.com")), port: None, path: "/iterative/datachain", query: None, fragment: None }`
DEBUG Attempting GitHub fast path for: https://api.github.com/repos/iterative/datachain/commits/fix-distributed-test
DEBUG failed to check github HTTP status client error (404 Not Found) for url (https://api.github.com/repos/iterative/datachain/commits/fix-distributed-test)
DEBUG Performing a Git fetch for: https://***@github.com/iterative/datachain
error: Failed to download and build: `datachain @ git+https://***@github.com/iterative/datachain@fix-distributed-test`
Caused by: Git operation failed
Caused by: process didn't exit successfully: `git clone --local /tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/.cache/uv/git-v0/db/9d45a3e6f56b0a69 /tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/.cache/uv/git-v0/checkouts/9d45a3e6f56b0a69/56b15b8` (exit status: 128)
--- stderr
fatal: destination path '/tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/.cache/uv/git-v0/checkouts/9d45a3e6f56b0a69/56b15b8' already exists and is not an empty directory.
```
## Cause of the issue
It is the same command that is failing - `git clone`, and I think it's
happening because it was trying to first get the repo to dereference the
`fix-distributed-test` branch:
`Given a remote source distribution, return a precise variant, if
possible.`
And it's happening w/i acquiring a lock around cache.
## Fix
I thinks we can reuse the existing `fetch` method that has already lock
around cache:
https://github.com/astral-sh/uv/pull/5051/files#diff-f58bb99dee2c4922d156ace3e7de651f0d9a81fc8e9447a2ad865de5c53543fcR61-R68
```python
// Avoid races between different processes, too.
let lock_dir = cache.join("locks");
....
```
## Questions
- Are there any tests that cover concurrency? I'm quite new to Rust and
if someone can point me to some examples and I can create a similar test
or a new one.
- Is error handling done correctly in this PR (again, I'm new to Rust -
I'll review and read about it, but it's better also for someone else to
review this)
## Summary
`GitDatabase::contains` previously only parsed the commit to see if it
was a valid hash and didn't verify if the commit existed in the object
database. This led to the database never being updated.
Resolves https://github.com/astral-sh/uv/issues/4378.
## Test Plan
Added a test that fails without this change.
## Summary
It turns out that the Git fetch implementation is initializing its own
client, which can be really expensive on macOS (due to loading native
certificates) _and_ bypasses any of our middleware. This PR modifies the
Git implementation to accept a shared client.
## Summary
We unconditionally update the submodules in our Git code, but AFAICT it
shouldn't be necessary if we already have a complete, up-to-date fetch
available.
## Summary
This PR ensures that if a lockfile already contains a resolved reference
(e.g., you locked with `main` previously, and it locked to a specific
commit), and you run `uv lock`, we use the same SHA, even if it's not
the latest SHA for that tag. This avoids upgrading Git dependencies
without `--upgrade`.
Closes#3920.
## Summary
This PR removes the static resolver map:
```rust
static RESOLVED_GIT_REFS: Lazy<Mutex<FxHashMap<RepositoryReference, GitSha>>> =
Lazy::new(Mutex::default);
```
With a `GitResolver` struct that we now pass around on the
`BuildContext`. There should be no behavior changes here; it's purely an
internal refactor with an eye towards making it cleaner for us to
"pre-populate" the list of resolved SHAs.
## Summary
We currently rely on libgit2 for most git-related functionality.
However, libgit2 has long-standing performance issues, as well as lags
significantly behind git in terms of new features. For these reasons we
now use the git CLI by default for fetching repositories
(https://github.com/astral-sh/uv/pull/1781). This PR completely drops
libgit2 in favor of the git CLI for all git-related functionality, which
should allow us to use features such as partial clones and sparse
checkouts in the future for performance.
There is also a lot of technical debt in the current git code as it's
mostly taken from Cargo. Switching to the git CLI *vastly* simplifies
the `uv-git` codebase.
Eventually we might want to look into switching to
[`gitoxide`](https://github.com/Byron/gitoxide), but it's currently too
immature for our use case.
It turns out that we use PubGrubPackage as the key in hashmaps in a fair
few places. And when we iterate over hashmaps, the order is unspecified.
This can in turn result in changes in output as a result of changes in
the PubGrubPackage definition, purely as a function of its changing
hash. This is confusing as there should be no semantic difference.
Thus, this is a precursor to introducing some more determinism to places
I found in the error reporting whose output depending on hashmap
iteration order.
## Summary
This PR adds lossless deserialization for `GitSourceDist` distributions
in the lockfile. Specifically, we now properly preserve the requested
revision, the subdirectory, and the precise Git commit SHA.
## Test Plan
`cargo test`
<!--
Thank you for contributing to uv! To help us out with reviewing, please
consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Just fix typos.
While `alpha-numeric` is not really a misspelling:
- it is missing from mainstream curated dictionaries, all of them
suggest `alphanumeric`;
- it is less used than `alphanumeric` (more than ⨉10 less) according to
the Google [Ngram
Viewer](https://books.google.com/ngrams/graph?content=alpha-numeric%2Calphanumeric&year_start=1900&year_end=2019&corpus=en-2019);
- it is [missing from
SCOWL](http://app.aspell.net/lookup?dict=en_US-large;words=alpha-numeric).
## Test Plan
CI jobs.
## Introduction
PEP 621 is limited. Specifically, it lacks
* Relative path support
* Editable support
* Workspace support
* Index pinning or any sort of index specification
The semantics of urls are a custom extension, PEP 440 does not specify
how to use git references or subdirectories, instead pip has a custom
stringly format. We need to somehow support these while still stying
compatible with PEP 621.
## `tool.uv.source`
Drawing inspiration from cargo, poetry and rye, we add `tool.uv.sources`
or (for now stub only) `tool.uv.workspace`:
```toml
[project]
name = "albatross"
version = "0.1.0"
dependencies = [
"tqdm >=4.66.2,<5",
"torch ==2.2.2",
"transformers[torch] >=4.39.3,<5",
"importlib_metadata >=7.1.0,<8; python_version < '3.10'",
"mollymawk ==0.1.0"
]
[tool.uv.sources]
tqdm = { git = "https://github.com/tqdm/tqdm", rev = "cc372d09dcd5a5eabdc6ed4cf365bdb0be004d44" }
importlib_metadata = { url = "https://github.com/python/importlib_metadata/archive/refs/tags/v7.1.0.zip" }
torch = { index = "torch-cu118" }
mollymawk = { workspace = true }
[tool.uv.workspace]
include = [
"packages/mollymawk"
]
[tool.uv.indexes]
torch-cu118 = "https://download.pytorch.org/whl/cu118"
```
See `docs/specifying_dependencies.md` for a detailed explanation of the
format. The basic gist is that `project.dependencies` is what ends up on
pypi, while `tool.uv.sources` are your non-published additions. We do
support the full range or PEP 508, we just hide it in the docs and
prefer the exploded table for easier readability and less confusing with
actual url parts.
This format should eventually be able to subsume requirements.txt's
current use cases. While we will continue to support the legacy `uv pip`
interface, this is a piece of the uv's own top level interface. Together
with `uv run` and a lockfile format, you should only need to write
`pyproject.toml` and do `uv run`, which generates/uses/updates your
lockfile behind the scenes, no more pip-style requirements involved. It
also lays the groundwork for implementing index pinning.
## Changes
This PR implements:
* Reading and lowering `project.dependencies`,
`project.optional-dependencies` and `tool.uv.sources` into a new
requirements format, including:
* Git dependencies
* Url dependencies
* Path dependencies, including relative and editable
* `pip install` integration
* Error reporting for invalid `tool.uv.sources`
* Json schema integration (works in pycharm, see below)
* Draft user-level docs (see `docs/specifying_dependencies.md`)
It does not implement:
* No `pip compile` testing, deprioritizing towards our own lockfile
* Index pinning (stub definitions only)
* Development dependencies
* Workspace support (stub definitions only)
* Overrides in pyproject.toml
* Patching/replacing dependencies
One technically breaking change is that we now require user provided
pyproject.toml to be valid wrt to PEP 621. Included files still fall
back to PEP 517. That means `pip install -r requirements.txt` requires
it to be valid while `pip install -r requirements.txt` with `-e .` as
content falls back to PEP 517 as before.
## Implementation
The `pep508` requirement is replaced by a new `UvRequirement` (name up
for bikeshedding, not particularly attached to the uv prefix). The still
existing `pep508_rs::Requirement` type is a url format copied from pip's
requirements.txt and doesn't appropriately capture all features we
want/need to support. The bulk of the diff is changing the requirement
type throughout the codebase.
We still use `VerbatimUrl` in many places, where we would expect a
parsed/decomposed url type, specifically:
* Reading core metadata except top level pyproject.toml files, we fail a
step later instead if the url isn't supported.
* Allowed `Urls`.
* `PackageId` with a custom `CanonicalUrl` comparison, instead of
canonicalizing urls eagerly.
* `PubGrubPackage`: We eventually convert the `VerbatimUrl` back to a
`Dist` (`Dist::from_url`), instead of remembering the url.
* Source dist types: We use verbatim url even though we know and require
that these are supported urls we can and have parsed.
I tried to make improve the situation be replacing `VerbatimUrl`, but
these changes would require massive invasive changes (see e.g.
https://github.com/astral-sh/uv/pull/3253). A main problem is the ref
`VersionOrUrl` and applying overrides, which assume the same
requirement/url type everywhere. In its current form, this PR increases
this tech debt.
I've tried to split off PRs and commits, but the main refactoring is
still a single monolith commit to make it compile and the tests pass.
## Demo
Adding
d1ae3b85d5/pyproject.json
as json schema (v7) to pycharm for `pyproject.toml`, you can try the IDE
support already:

[dove.webm](c293c272-c80b-459d-8c95-8c46a8d198a1)
Add a dedicated error type for direct url parsing. This change is broken
out from the new uv requirement type, which uses direct url parsing
internally.
The sync scenarios script is broken, so i did the updates manually
```
$ ./scripts/sync_scenarios.sh
Setting up a temporary environment...
Using Python 3.12.1 interpreter at: /home/konsti/projects/uv/.venv/bin/python3
Creating virtualenv at: .venv
Activate with: source .venv/bin/activate
× No solution found when resolving dependencies:
╰─▶ Because docutils==0.21.post1 is unusable because the package metadata was inconsistent and you require docutils==0.21.post1, we can conclude that the requirements are unsatisfiable.
hint: Metadata for docutils==0.21.post1 was inconsistent:
Package metadata version `0.21` does not match given version `0.21.post1`
```
---------
Co-authored-by: Zanie Blue <contact@zanie.dev>
## Summary
If we're given a Git reference like `20240222`, we currently treat it as
a short commit hash. However... it _could_ be a branch or a tag. This PR
improves the Git reference logic to ensure that ambiguous references
like `20240222` are handled appropriately, by attempting to extract it
as a branch, then a tag, then a short commit hash.
Closes https://github.com/astral-sh/uv/issues/2772.
## Summary
I noticed in #2769 that I was now stripping `.git` suffixes from Git
URLs after resolving to a precise commit. This PR cleans up the internal
caching to use a better canonical representation: a `RepositoryUrl`
along with a `GitReference`, instead of a `GitUrl` which can contain
non-canonical data. This gives us both better fidelity (preserving the
`.git`, along with any casing that the user provided when defining the
URL) and is overall cleaner and more robust.
## Summary
This PR leverages our lookahead direct URL resolution to significantly
improve the range of Git URLs that we can accept (e.g., if a user
provides the same requirement, once as a direct dependency, and once as
a tag). We did some of this in #2285, but the solution here is more
general and works for arbitrary transitive URLs.
Closes https://github.com/astral-sh/uv/issues/2614.
Scott schafer got me the idea: We can avoid repeating the path for
workspaces dependencies everywhere if we declare them in the virtual
package once and treat them as workspace dependencies from there on.
## Summary
This PR changes our user-facing representation for paths to use relative
paths, when the path is within the current working directory. This
mirrors what we do in Ruff. (If the path is _outside_ the current
working directory, we print an absolute path.)
Before:
```shell
❯ uv venv .venv2
Using Python 3.12.2 interpreter at: /Users/crmarsh/workspace/uv/.venv/bin/python3
Creating virtualenv at: .venv2
Activate with: source .venv2/bin/activate
```
After:
```shell
❯ cargo run venv .venv2
Finished dev [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/uv venv .venv2`
Using Python 3.12.2 interpreter at: .venv/bin/python3
Creating virtualenv at: .venv2
Activate with: source .venv2/bin/activate
```
Note that we still want to use the existing `.simplified_display()`
anywhere that the path is being simplified, but _still_ intended for
machine consumption (e.g., when passing to `.current_dir()`).
## Summary
I tried out `cargo shear` to see if there are any unused dependencies
that `cargo udeps` isn't reporting. It turned out, there are a few. This
PR removes those dependencies.
## Test Plan
`cargo build`