## Summary
It turns out that the Git fetch implementation is initializing its own
client, which can be really expensive on macOS (due to loading native
certificates) _and_ bypasses any of our middleware. This PR modifies the
Git implementation to accept a shared client.
## Summary
We unconditionally update the submodules in our Git code, but AFAICT it
shouldn't be necessary if we already have a complete, up-to-date fetch
available.
## Summary
This PR ensures that if a lockfile already contains a resolved reference
(e.g., you locked with `main` previously, and it locked to a specific
commit), and you run `uv lock`, we use the same SHA, even if it's not
the latest SHA for that tag. This avoids upgrading Git dependencies
without `--upgrade`.
Closes#3920.
## Summary
This PR removes the static resolver map:
```rust
static RESOLVED_GIT_REFS: Lazy<Mutex<FxHashMap<RepositoryReference, GitSha>>> =
Lazy::new(Mutex::default);
```
With a `GitResolver` struct that we now pass around on the
`BuildContext`. There should be no behavior changes here; it's purely an
internal refactor with an eye towards making it cleaner for us to
"pre-populate" the list of resolved SHAs.
## Summary
We currently rely on libgit2 for most git-related functionality.
However, libgit2 has long-standing performance issues, as well as lags
significantly behind git in terms of new features. For these reasons we
now use the git CLI by default for fetching repositories
(https://github.com/astral-sh/uv/pull/1781). This PR completely drops
libgit2 in favor of the git CLI for all git-related functionality, which
should allow us to use features such as partial clones and sparse
checkouts in the future for performance.
There is also a lot of technical debt in the current git code as it's
mostly taken from Cargo. Switching to the git CLI *vastly* simplifies
the `uv-git` codebase.
Eventually we might want to look into switching to
[`gitoxide`](https://github.com/Byron/gitoxide), but it's currently too
immature for our use case.
It turns out that we use PubGrubPackage as the key in hashmaps in a fair
few places. And when we iterate over hashmaps, the order is unspecified.
This can in turn result in changes in output as a result of changes in
the PubGrubPackage definition, purely as a function of its changing
hash. This is confusing as there should be no semantic difference.
Thus, this is a precursor to introducing some more determinism to places
I found in the error reporting whose output depending on hashmap
iteration order.
## Summary
This PR adds lossless deserialization for `GitSourceDist` distributions
in the lockfile. Specifically, we now properly preserve the requested
revision, the subdirectory, and the precise Git commit SHA.
## Test Plan
`cargo test`
<!--
Thank you for contributing to uv! To help us out with reviewing, please
consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Just fix typos.
While `alpha-numeric` is not really a misspelling:
- it is missing from mainstream curated dictionaries, all of them
suggest `alphanumeric`;
- it is less used than `alphanumeric` (more than ⨉10 less) according to
the Google [Ngram
Viewer](https://books.google.com/ngrams/graph?content=alpha-numeric%2Calphanumeric&year_start=1900&year_end=2019&corpus=en-2019);
- it is [missing from
SCOWL](http://app.aspell.net/lookup?dict=en_US-large;words=alpha-numeric).
## Test Plan
CI jobs.
## Introduction
PEP 621 is limited. Specifically, it lacks
* Relative path support
* Editable support
* Workspace support
* Index pinning or any sort of index specification
The semantics of urls are a custom extension, PEP 440 does not specify
how to use git references or subdirectories, instead pip has a custom
stringly format. We need to somehow support these while still stying
compatible with PEP 621.
## `tool.uv.source`
Drawing inspiration from cargo, poetry and rye, we add `tool.uv.sources`
or (for now stub only) `tool.uv.workspace`:
```toml
[project]
name = "albatross"
version = "0.1.0"
dependencies = [
"tqdm >=4.66.2,<5",
"torch ==2.2.2",
"transformers[torch] >=4.39.3,<5",
"importlib_metadata >=7.1.0,<8; python_version < '3.10'",
"mollymawk ==0.1.0"
]
[tool.uv.sources]
tqdm = { git = "https://github.com/tqdm/tqdm", rev = "cc372d09dcd5a5eabdc6ed4cf365bdb0be004d44" }
importlib_metadata = { url = "https://github.com/python/importlib_metadata/archive/refs/tags/v7.1.0.zip" }
torch = { index = "torch-cu118" }
mollymawk = { workspace = true }
[tool.uv.workspace]
include = [
"packages/mollymawk"
]
[tool.uv.indexes]
torch-cu118 = "https://download.pytorch.org/whl/cu118"
```
See `docs/specifying_dependencies.md` for a detailed explanation of the
format. The basic gist is that `project.dependencies` is what ends up on
pypi, while `tool.uv.sources` are your non-published additions. We do
support the full range or PEP 508, we just hide it in the docs and
prefer the exploded table for easier readability and less confusing with
actual url parts.
This format should eventually be able to subsume requirements.txt's
current use cases. While we will continue to support the legacy `uv pip`
interface, this is a piece of the uv's own top level interface. Together
with `uv run` and a lockfile format, you should only need to write
`pyproject.toml` and do `uv run`, which generates/uses/updates your
lockfile behind the scenes, no more pip-style requirements involved. It
also lays the groundwork for implementing index pinning.
## Changes
This PR implements:
* Reading and lowering `project.dependencies`,
`project.optional-dependencies` and `tool.uv.sources` into a new
requirements format, including:
* Git dependencies
* Url dependencies
* Path dependencies, including relative and editable
* `pip install` integration
* Error reporting for invalid `tool.uv.sources`
* Json schema integration (works in pycharm, see below)
* Draft user-level docs (see `docs/specifying_dependencies.md`)
It does not implement:
* No `pip compile` testing, deprioritizing towards our own lockfile
* Index pinning (stub definitions only)
* Development dependencies
* Workspace support (stub definitions only)
* Overrides in pyproject.toml
* Patching/replacing dependencies
One technically breaking change is that we now require user provided
pyproject.toml to be valid wrt to PEP 621. Included files still fall
back to PEP 517. That means `pip install -r requirements.txt` requires
it to be valid while `pip install -r requirements.txt` with `-e .` as
content falls back to PEP 517 as before.
## Implementation
The `pep508` requirement is replaced by a new `UvRequirement` (name up
for bikeshedding, not particularly attached to the uv prefix). The still
existing `pep508_rs::Requirement` type is a url format copied from pip's
requirements.txt and doesn't appropriately capture all features we
want/need to support. The bulk of the diff is changing the requirement
type throughout the codebase.
We still use `VerbatimUrl` in many places, where we would expect a
parsed/decomposed url type, specifically:
* Reading core metadata except top level pyproject.toml files, we fail a
step later instead if the url isn't supported.
* Allowed `Urls`.
* `PackageId` with a custom `CanonicalUrl` comparison, instead of
canonicalizing urls eagerly.
* `PubGrubPackage`: We eventually convert the `VerbatimUrl` back to a
`Dist` (`Dist::from_url`), instead of remembering the url.
* Source dist types: We use verbatim url even though we know and require
that these are supported urls we can and have parsed.
I tried to make improve the situation be replacing `VerbatimUrl`, but
these changes would require massive invasive changes (see e.g.
https://github.com/astral-sh/uv/pull/3253). A main problem is the ref
`VersionOrUrl` and applying overrides, which assume the same
requirement/url type everywhere. In its current form, this PR increases
this tech debt.
I've tried to split off PRs and commits, but the main refactoring is
still a single monolith commit to make it compile and the tests pass.
## Demo
Adding
d1ae3b85d5/pyproject.json
as json schema (v7) to pycharm for `pyproject.toml`, you can try the IDE
support already:

[dove.webm](c293c272-c80b-459d-8c95-8c46a8d198a1)
Add a dedicated error type for direct url parsing. This change is broken
out from the new uv requirement type, which uses direct url parsing
internally.
The sync scenarios script is broken, so i did the updates manually
```
$ ./scripts/sync_scenarios.sh
Setting up a temporary environment...
Using Python 3.12.1 interpreter at: /home/konsti/projects/uv/.venv/bin/python3
Creating virtualenv at: .venv
Activate with: source .venv/bin/activate
× No solution found when resolving dependencies:
╰─▶ Because docutils==0.21.post1 is unusable because the package metadata was inconsistent and you require docutils==0.21.post1, we can conclude that the requirements are unsatisfiable.
hint: Metadata for docutils==0.21.post1 was inconsistent:
Package metadata version `0.21` does not match given version `0.21.post1`
```
---------
Co-authored-by: Zanie Blue <contact@zanie.dev>
## Summary
If we're given a Git reference like `20240222`, we currently treat it as
a short commit hash. However... it _could_ be a branch or a tag. This PR
improves the Git reference logic to ensure that ambiguous references
like `20240222` are handled appropriately, by attempting to extract it
as a branch, then a tag, then a short commit hash.
Closes https://github.com/astral-sh/uv/issues/2772.
## Summary
I noticed in #2769 that I was now stripping `.git` suffixes from Git
URLs after resolving to a precise commit. This PR cleans up the internal
caching to use a better canonical representation: a `RepositoryUrl`
along with a `GitReference`, instead of a `GitUrl` which can contain
non-canonical data. This gives us both better fidelity (preserving the
`.git`, along with any casing that the user provided when defining the
URL) and is overall cleaner and more robust.
## Summary
This PR leverages our lookahead direct URL resolution to significantly
improve the range of Git URLs that we can accept (e.g., if a user
provides the same requirement, once as a direct dependency, and once as
a tag). We did some of this in #2285, but the solution here is more
general and works for arbitrary transitive URLs.
Closes https://github.com/astral-sh/uv/issues/2614.
Scott schafer got me the idea: We can avoid repeating the path for
workspaces dependencies everywhere if we declare them in the virtual
package once and treat them as workspace dependencies from there on.
## Summary
This PR changes our user-facing representation for paths to use relative
paths, when the path is within the current working directory. This
mirrors what we do in Ruff. (If the path is _outside_ the current
working directory, we print an absolute path.)
Before:
```shell
❯ uv venv .venv2
Using Python 3.12.2 interpreter at: /Users/crmarsh/workspace/uv/.venv/bin/python3
Creating virtualenv at: .venv2
Activate with: source .venv2/bin/activate
```
After:
```shell
❯ cargo run venv .venv2
Finished dev [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/uv venv .venv2`
Using Python 3.12.2 interpreter at: .venv/bin/python3
Creating virtualenv at: .venv2
Activate with: source .venv2/bin/activate
```
Note that we still want to use the existing `.simplified_display()`
anywhere that the path is being simplified, but _still_ intended for
machine consumption (e.g., when passing to `.current_dir()`).
## Summary
I tried out `cargo shear` to see if there are any unused dependencies
that `cargo udeps` isn't reporting. It turned out, there are a few. This
PR removes those dependencies.
## Test Plan
`cargo build`
I previously add `spawn_blocking` to the version map construction as it
had become a bottleneck
(https://github.com/astral-sh/uv/pull/1163/files#diff-704ceeaedada99f90369eac535713ec82e19550bff166cd44745d7277ecae527R116).
With the zero copy deserialization, this has become so fast we don't
need to move it to the thread pool anymore. I've also checked
`DataWithCachePolicy` but it seems to still take a significant amount of
time. Span visualization:
Resolving jupyter warm:

Resolving jupyter cold:


I've also updated the instrumentation a little.
We don't seem cpu bound for the cold cache (top) and refresh case
(bottom) from jupyter:


Address a few pedantic lints
lints are separated into separate commits so they can be reviewed
individually.
I've not added enforcement for any of these lints, but that could be
added if desirable.
Closes https://github.com/astral-sh/uv/issues/1775
Closes https://github.com/astral-sh/uv/issues/1452
Closes https://github.com/astral-sh/uv/issues/1514
Follows https://github.com/astral-sh/uv/pull/1717
libgit2 does not support host names with extra identifiers during SSH
lookup (e.g. [`github.com-some_identifier`](
https://docs.github.com/en/authentication/connecting-to-github-with-ssh/managing-deploy-keys#using-multiple-repositories-on-one-server))
so we use the `git` command instead for fetching. This is required for
`pip` parity.
See the [Cargo
documentation](https://doc.rust-lang.org/nightly/cargo/reference/config.html#netgit-fetch-with-cli)
for more details on using the `git` CLI instead of libgit2. We may want
to try to use libgit2 first in the future, as it is more performant
(#1786).
We now support authentication with:
```
git+ssh://git@<hostname>/...
git+ssh://git@<hostname>-<identifier>/...
```
Tested with a deploy key e.g.
```
cargo run -- \
pip install uv-private-pypackage@git+ssh://git@github.com-test-uv-private-pypackage/astral-test/uv-private-pypackage.git \
--reinstall --no-cache -v
```
and
```
cargo run -- \
pip install uv-private-pypackage@git+ssh://git@github.com/astral-test/uv-private-pypackage.git \
--reinstall --no-cache -v
```
with a ssh config like
```
Host github.com
Hostname github.com
IdentityFile=/Users/mz/.ssh/id_ed25519
Host github.com-test-uv-private-pypackage
Hostname github.com
IdentityFile=/Users/mz/.ssh/id_ed25519
```
It seems quite hard to add test coverage for this to the test suite, as
we'd need to add the SSH key and I don't know how to isolate that from
affecting other developer's machines.
Fixes handling of GitHub PATs in HTTPS URLs, which were otherwise
dropped. We now supporting the following authentication schemes:
```
git+https://<user>:<token>/...
git+https://<token>/...
```
On Windows, the username is required. We can consider adding a
special-case for this in the future, but this just matches libgit2's
behavior.
I tested with fine-grained tokens, OAuth tokens, and "classic" tokens.
There's test coverage for fine-grained tokens in CI where we use a real
private repository and PAT. Yes, the PAT is committed to make this test
usable by anyone. It has read-only permissions to the single repository,
expires Feb 1 2025, and is in an isolated organization and GitHub
account.
Does not yet address SSH authentication.
Related:
- https://github.com/astral-sh/uv/issues/1514
- https://github.com/astral-sh/uv/issues/1452
First, replace all usages in files in-place. I used my editor for this.
If someone wants to add a one-liner that'd be fun.
Then, update directory and file names:
```
# Run twice for nested directories
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
# Update files
find . -type f -print0 | xargs -0 rename s/puffin/uv/g
```
Then add all the files again
```
# Add all the files again
git add crates
git add python/uv
# This one needs a force-add
git add -f crates/uv-trampoline
```