Commit graph

33 commits

Author SHA1 Message Date
Charlie Marsh
9905521957
Use shared client in Git fetch implementation (#4487)
## Summary

It turns out that the Git fetch implementation is initializing its own
client, which can be really expensive on macOS (due to loading native
certificates) _and_ bypasses any of our middleware. This PR modifies the
Git implementation to accept a shared client.
2024-06-24 17:09:29 -04:00
Charlie Marsh
7d3fb4330f
Skip submodule update for fresh clones (#4482)
## Summary

We unconditionally update the submodules in our Git code, but AFAICT it
shouldn't be necessary if we already have a complete, up-to-date fetch
available.
2024-06-24 17:09:14 -04:00
Charlie Marsh
11324646cb
Remove some anyhow usages (#3962) 2024-06-01 20:11:23 +00:00
Charlie Marsh
a70e33d947
Move reference check into uv-git (#3961) 2024-06-01 16:02:25 -04:00
Charlie Marsh
c04a95e037
Respect resolved Git SHAs in uv lock (#3956)
## Summary

This PR ensures that if a lockfile already contains a resolved reference
(e.g., you locked with `main` previously, and it locked to a specific
commit), and you run `uv lock`, we use the same SHA, even if it's not
the latest SHA for that tag. This avoids upgrading Git dependencies
without `--upgrade`.

Closes #3920.
2024-06-01 12:40:11 +00:00
Charlie Marsh
b7d77c04cc
Add Git resolver in lieu of static hash map (#3954)
## Summary

This PR removes the static resolver map:

```rust
static RESOLVED_GIT_REFS: Lazy<Mutex<FxHashMap<RepositoryReference, GitSha>>> =
    Lazy::new(Mutex::default);
```

With a `GitResolver` struct that we now pass around on the
`BuildContext`. There should be no behavior changes here; it's purely an
internal refactor with an eye towards making it cleaner for us to
"pre-populate" the list of resolved SHAs.
2024-05-31 22:44:42 -04:00
Ibraheem Ahmed
261aa2c70a
Port all git functionality to use git CLI (#3833)
## Summary

We currently rely on libgit2 for most git-related functionality.
However, libgit2 has long-standing performance issues, as well as lags
significantly behind git in terms of new features. For these reasons we
now use the git CLI by default for fetching repositories
(https://github.com/astral-sh/uv/pull/1781). This PR completely drops
libgit2 in favor of the git CLI for all git-related functionality, which
should allow us to use features such as partial clones and sparse
checkouts in the future for performance.

There is also a lot of technical debt in the current git code as it's
mostly taken from Cargo. Switching to the git CLI *vastly* simplifies
the `uv-git` codebase.

Eventually we might want to look into switching to
[`gitoxide`](https://github.com/Byron/gitoxide), but it's currently too
immature for our use case.
2024-05-30 15:28:48 -04:00
Charlie Marsh
cedd18e4c6
Remove some unused pub functions (#3872)
## Summary

I wrote a bad Python script to find these.
2024-05-28 15:58:13 +00:00
Andrew Gallant
976bc9ba0e uv-resolver: make PubGrubPackage orderable
It turns out that we use PubGrubPackage as the key in hashmaps in a fair
few places. And when we iterate over hashmaps, the order is unspecified.
This can in turn result in changes in output as a result of changes in
the PubGrubPackage definition, purely as a function of its changing
hash. This is confusing as there should be no semantic difference.

Thus, this is a precursor to introducing some more determinism to places
I found in the error reporting whose output depending on hashmap
iteration order.
2024-05-20 19:56:24 -04:00
Charlie Marsh
18b095ce28
Make from_rev take an owned value (#3631)
## Summary

We always clone internally, and in most case we're already passing
`&String`.
2024-05-18 17:26:15 +00:00
Charlie Marsh
0d512be46c
Support lossless serialization for Git dependencies in lockfile (#3630)
## Summary

This PR adds lossless deserialization for `GitSourceDist` distributions
in the lockfile. Specifically, we now properly preserve the requested
revision, the subdirectory, and the precise Git commit SHA.

## Test Plan

`cargo test`
2024-05-17 21:23:40 +00:00
konsti
0010954ca7
Add parsed URL to PubGrubPackage (#3426)
Avoid reparsing urls by storing the parsed parts across resolution on
`PubGrubPackage`.

Part 1 of #3408
2024-05-14 00:55:21 +00:00
Dimitri Papadopoulos Orfanos
d2ee567fe7
Fix a few typos found by codespell (#3543)
<!--
Thank you for contributing to uv! To help us out with reviewing, please
consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Just fix typos.

While `alpha-numeric` is not really a misspelling:
- it is missing from mainstream curated dictionaries, all of them
suggest `alphanumeric`;
- it is less used than `alphanumeric` (more than ⨉10 less) according to
the Google [Ngram
Viewer](https://books.google.com/ngrams/graph?content=alpha-numeric%2Calphanumeric&year_start=1900&year_end=2019&corpus=en-2019);
- it is [missing from
SCOWL](http://app.aspell.net/lookup?dict=en_US-large;words=alpha-numeric).

## Test Plan

CI jobs.
2024-05-13 11:55:10 +00:00
Charlie Marsh
2a212eb6a9
Add branch and tag variants to Git reference (#3374)
## Summary

Closes https://github.com/astral-sh/uv/issues/3368.
2024-05-04 21:13:11 +00:00
konsti
4f87edbe66
Add basic tool.uv.sources support (#3263)
## Introduction

PEP 621 is limited. Specifically, it lacks
* Relative path support
* Editable support
* Workspace support
* Index pinning or any sort of index specification

The semantics of urls are a custom extension, PEP 440 does not specify
how to use git references or subdirectories, instead pip has a custom
stringly format. We need to somehow support these while still stying
compatible with PEP 621.

## `tool.uv.source`

Drawing inspiration from cargo, poetry and rye, we add `tool.uv.sources`
or (for now stub only) `tool.uv.workspace`:

```toml
[project]
name = "albatross"
version = "0.1.0"
dependencies = [
  "tqdm >=4.66.2,<5",
  "torch ==2.2.2",
  "transformers[torch] >=4.39.3,<5",
  "importlib_metadata >=7.1.0,<8; python_version < '3.10'",
  "mollymawk ==0.1.0"
]

[tool.uv.sources]
tqdm = { git = "https://github.com/tqdm/tqdm", rev = "cc372d09dcd5a5eabdc6ed4cf365bdb0be004d44" }
importlib_metadata = { url = "https://github.com/python/importlib_metadata/archive/refs/tags/v7.1.0.zip" }
torch = { index = "torch-cu118" }
mollymawk = { workspace = true }

[tool.uv.workspace]
include = [
  "packages/mollymawk"
]

[tool.uv.indexes]
torch-cu118 = "https://download.pytorch.org/whl/cu118"
```

See `docs/specifying_dependencies.md` for a detailed explanation of the
format. The basic gist is that `project.dependencies` is what ends up on
pypi, while `tool.uv.sources` are your non-published additions. We do
support the full range or PEP 508, we just hide it in the docs and
prefer the exploded table for easier readability and less confusing with
actual url parts.

This format should eventually be able to subsume requirements.txt's
current use cases. While we will continue to support the legacy `uv pip`
interface, this is a piece of the uv's own top level interface. Together
with `uv run` and a lockfile format, you should only need to write
`pyproject.toml` and do `uv run`, which generates/uses/updates your
lockfile behind the scenes, no more pip-style requirements involved. It
also lays the groundwork for implementing index pinning.

## Changes

This PR implements:
* Reading and lowering `project.dependencies`,
`project.optional-dependencies` and `tool.uv.sources` into a new
requirements format, including:
  * Git dependencies
  * Url dependencies
  * Path dependencies, including relative and editable
* `pip install` integration
* Error reporting for invalid `tool.uv.sources`
* Json schema integration (works in pycharm, see below)
* Draft user-level docs (see `docs/specifying_dependencies.md`)

It does not implement:
* No `pip compile` testing, deprioritizing towards our own lockfile
* Index pinning (stub definitions only)
* Development dependencies
* Workspace support (stub definitions only)
* Overrides in pyproject.toml
* Patching/replacing dependencies

One technically breaking change is that we now require user provided
pyproject.toml to be valid wrt to PEP 621. Included files still fall
back to PEP 517. That means `pip install -r requirements.txt` requires
it to be valid while `pip install -r requirements.txt` with `-e .` as
content falls back to PEP 517 as before.

## Implementation

The `pep508` requirement is replaced by a new `UvRequirement` (name up
for bikeshedding, not particularly attached to the uv prefix). The still
existing `pep508_rs::Requirement` type is a url format copied from pip's
requirements.txt and doesn't appropriately capture all features we
want/need to support. The bulk of the diff is changing the requirement
type throughout the codebase.

We still use `VerbatimUrl` in many places, where we would expect a
parsed/decomposed url type, specifically:
* Reading core metadata except top level pyproject.toml files, we fail a
step later instead if the url isn't supported.
* Allowed `Urls`.
* `PackageId` with a custom `CanonicalUrl` comparison, instead of
canonicalizing urls eagerly.
* `PubGrubPackage`: We eventually convert the `VerbatimUrl` back to a
`Dist` (`Dist::from_url`), instead of remembering the url.
* Source dist types: We use verbatim url even though we know and require
that these are supported urls we can and have parsed.

I tried to make improve the situation be replacing `VerbatimUrl`, but
these changes would require massive invasive changes (see e.g.
https://github.com/astral-sh/uv/pull/3253). A main problem is the ref
`VersionOrUrl` and applying overrides, which assume the same
requirement/url type everywhere. In its current form, this PR increases
this tech debt.

I've tried to split off PRs and commits, but the main refactoring is
still a single monolith commit to make it compile and the tests pass.

## Demo

Adding
d1ae3b85d5/pyproject.json
as json schema (v7) to pycharm for `pyproject.toml`, you can try the IDE
support already:


![pycharm](599082c7-6be5-41c1-a3cd-516092382f8d)


[dove.webm](c293c272-c80b-459d-8c95-8c46a8d198a1)
2024-05-03 21:10:50 +00:00
konsti
f29c991e21
Dedicated error type for direct url parsing (#3181)
Add a dedicated error type for direct url parsing. This change is broken
out from the new uv requirement type, which uses direct url parsing
internally.
2024-04-22 11:57:36 +00:00
konsti
c85c52d4ce
Unify packse find links urls (#2969)
The sync scenarios script is broken, so i did the updates manually

```
$ ./scripts/sync_scenarios.sh
Setting up a temporary environment...
Using Python 3.12.1 interpreter at: /home/konsti/projects/uv/.venv/bin/python3
Creating virtualenv at: .venv
Activate with: source .venv/bin/activate
  × No solution found when resolving dependencies:
  ╰─▶ Because docutils==0.21.post1 is unusable because the package metadata was inconsistent and you require docutils==0.21.post1, we can conclude that the requirements are unsatisfiable.

      hint: Metadata for docutils==0.21.post1 was inconsistent:
        Package metadata version `0.21` does not match given version `0.21.post1`
```

---------

Co-authored-by: Zanie Blue <contact@zanie.dev>
2024-04-11 08:35:22 +00:00
Charlie Marsh
dd3009ad84
Respect Git tags and branches that look like short commits (#2795)
## Summary

If we're given a Git reference like `20240222`, we currently treat it as
a short commit hash. However... it _could_ be a branch or a tag. This PR
improves the Git reference logic to ensure that ambiguous references
like `20240222` are handled appropriately, by attempting to extract it
as a branch, then a tag, then a short commit hash.

Closes https://github.com/astral-sh/uv/issues/2772.
2024-04-03 22:05:54 -04:00
Charlie Marsh
684f790d5d
Preserve .git suffixes and casing in Git dependencies (#2789)
## Summary

I noticed in #2769 that I was now stripping `.git` suffixes from Git
URLs after resolving to a precise commit. This PR cleans up the internal
caching to use a better canonical representation: a `RepositoryUrl`
along with a `GitReference`, instead of a `GitUrl` which can contain
non-canonical data. This gives us both better fidelity (preserving the
`.git`, along with any casing that the user provided when defining the
URL) and is overall cleaner and more robust.
2024-04-03 00:24:29 +00:00
Charlie Marsh
c30a65ee0c
Allow conflicting Git URLs that refer to the same commit SHA (#2769)
## Summary

This PR leverages our lookahead direct URL resolution to significantly
improve the range of Git URLs that we can accept (e.g., if a user
provides the same requirement, once as a direct dependency, and once as
a tag). We did some of this in #2285, but the solution here is more
general and works for arbitrary transitive URLs.

Closes https://github.com/astral-sh/uv/issues/2614.
2024-04-02 23:36:35 +00:00
Charlie Marsh
ffd78d0821
Add an in-memory cache for Git references (#2682)
## Summary

Ensures that, even if we try to resolve the same Git reference twice
within an invocation, it always returns a (cached) consistent result.

Closes https://github.com/astral-sh/uv/issues/2673.

## Test Plan

```
❯ cargo run pip install git+https://github.com/pallets/flask.git --reinstall --no-cache
   Compiling uv-distribution v0.0.1 (/Users/crmarsh/workspace/uv/crates/uv-distribution)
   Compiling uv-resolver v0.0.1 (/Users/crmarsh/workspace/uv/crates/uv-resolver)
   Compiling uv-installer v0.0.1 (/Users/crmarsh/workspace/uv/crates/uv-installer)
   Compiling uv-dispatch v0.0.1 (/Users/crmarsh/workspace/uv/crates/uv-dispatch)
   Compiling uv-requirements v0.1.0 (/Users/crmarsh/workspace/uv/crates/uv-requirements)
   Compiling uv v0.1.24 (/Users/crmarsh/workspace/uv/crates/uv)
    Finished dev [unoptimized + debuginfo] target(s) in 3.95s
     Running `target/debug/uv pip install 'git+https://github.com/pallets/flask.git' --reinstall --no-cache`
 Updated https://github.com/pallets/flask.git (b90a4f1)
Resolved 7 packages in 280ms
   Built flask @ git+https://github.com/pallets/flask.git@b90a4f1f4a370e92054b9cc9db0efcb864f87ebe                                                                                                                                            Downloaded 7 packages in 212ms
Installed 7 packages in 9ms
```
2024-03-27 01:39:01 +00:00
veryyet
d6dad57fab
chore: fix some typos (#2581) 2024-03-21 04:09:37 +00:00
konsti
70e0967dbd
Avoid repeating paths of workspace packages (#2573)
Scott schafer got me the idea: We can avoid repeating the path for
workspaces dependencies everywhere if we declare them in the virtual
package once and treat them as workspace dependencies from there on.
2024-03-20 16:16:02 -04:00
Charlie Marsh
00fc44012c
Use relative paths for user display (#2559)
## Summary

This PR changes our user-facing representation for paths to use relative
paths, when the path is within the current working directory. This
mirrors what we do in Ruff. (If the path is _outside_ the current
working directory, we print an absolute path.)

Before:

```shell
❯ uv venv .venv2
Using Python 3.12.2 interpreter at: /Users/crmarsh/workspace/uv/.venv/bin/python3
Creating virtualenv at: .venv2
Activate with: source .venv2/bin/activate
```

After:

```shell
❯ cargo run venv .venv2
    Finished dev [unoptimized + debuginfo] target(s) in 0.15s
     Running `target/debug/uv venv .venv2`
Using Python 3.12.2 interpreter at: .venv/bin/python3
Creating virtualenv at: .venv2
Activate with: source .venv2/bin/activate
```

Note that we still want to use the existing `.simplified_display()`
anywhere that the path is being simplified, but _still_ intended for
machine consumption (e.g., when passing to `.current_dir()`).
2024-03-20 09:52:50 -04:00
Micha Reiser
acbee166c0
Remove unused dependencies (#2543)
## Summary

I tried out `cargo shear` to see if there are any unused dependencies
that `cargo udeps` isn't reporting. It turned out, there are a few. This
PR removes those dependencies.

## Test Plan

`cargo build`
2024-03-19 13:10:10 -04:00
Charlie Marsh
ef15098288
Use Simplified instead of Normalized for path prefix stripping (#2071)
## Summary

This directly matches the naming of the `dunce` methods.
2024-02-29 01:44:50 +00:00
Taniguchi Yasufumi
70e877d11c
Add fs_err to disallowed_method in clippy.toml (#1950)
## Summary

Resolve #1916

---------

Co-authored-by: konsti <konstin@mailbox.org>
2024-02-26 14:15:07 +00:00
konsti
70dad51cd9
Remove spawn_blocking from version map (#1966)
I previously add `spawn_blocking` to the version map construction as it
had become a bottleneck
(https://github.com/astral-sh/uv/pull/1163/files#diff-704ceeaedada99f90369eac535713ec82e19550bff166cd44745d7277ecae527R116).
With the zero copy deserialization, this has become so fast we don't
need to move it to the thread pool anymore. I've also checked
`DataWithCachePolicy` but it seems to still take a significant amount of
time. Span visualization:

Resolving jupyter warm:

![image](692b03da-61c5-4f96-b413-199c14aa47c4)

Resolving jupyter cold:

![image](a6893155-d327-40c9-a83a-7c537b7c99c4)

![image](213556a3-a331-42db-aaf5-bdef5e0205dd)

I've also updated the instrumentation a little.

We don't seem cpu bound for the cold cache (top) and refresh case
(bottom) from jupyter:

![image](cb976add-3d30-465a-a470-8490b7b6caea)

![image](d7ecb745-dd2d-4f91-939c-2e46b7c812dd)
2024-02-26 09:44:24 +00:00
danieleades
8d721830db
Clippy pedantic (#1963)
Address a few pedantic lints

lints are separated into separate commits so they can be reviewed
individually.

I've not added enforcement for any of these lints, but that could be
added if desirable.
2024-02-25 14:04:05 -05:00
Zanie Blue
10be62e9d3
Improve error message when git ref cannot be fetched (#1826)
Follow-up to #1781 improving the error message when a ref cannot be
fetched
2024-02-22 01:22:00 +00:00
Zanie Blue
71ec568d0f
Use git command to fetch repositories instead of libgit2 for robust SSH support (#1781)
Closes https://github.com/astral-sh/uv/issues/1775
Closes https://github.com/astral-sh/uv/issues/1452
Closes https://github.com/astral-sh/uv/issues/1514
Follows https://github.com/astral-sh/uv/pull/1717

libgit2 does not support host names with extra identifiers during SSH
lookup (e.g. [`github.com-some_identifier`](

https://docs.github.com/en/authentication/connecting-to-github-with-ssh/managing-deploy-keys#using-multiple-repositories-on-one-server))
so we use the `git` command instead for fetching. This is required for
`pip` parity.

See the [Cargo
documentation](https://doc.rust-lang.org/nightly/cargo/reference/config.html#netgit-fetch-with-cli)
for more details on using the `git` CLI instead of libgit2. We may want
to try to use libgit2 first in the future, as it is more performant
(#1786).

We now support authentication with:

```
git+ssh://git@<hostname>/...
git+ssh://git@<hostname>-<identifier>/...
```

Tested with a deploy key e.g.

```
cargo run -- \
    pip install uv-private-pypackage@git+ssh://git@github.com-test-uv-private-pypackage/astral-test/uv-private-pypackage.git \
    --reinstall --no-cache -v
```

and

```
cargo run -- \
    pip install uv-private-pypackage@git+ssh://git@github.com/astral-test/uv-private-pypackage.git \
    --reinstall --no-cache -v     
```

with a ssh config like

```
Host github.com
        Hostname github.com
        IdentityFile=/Users/mz/.ssh/id_ed25519

Host github.com-test-uv-private-pypackage
        Hostname github.com
        IdentityFile=/Users/mz/.ssh/id_ed25519
```

It seems quite hard to add test coverage for this to the test suite, as
we'd need to add the SSH key and I don't know how to isolate that from
affecting other developer's machines.
2024-02-21 12:44:32 -06:00
Zanie Blue
d07b587f3f
Retain passwords in Git URLs (#1717)
Fixes handling of GitHub PATs in HTTPS URLs, which were otherwise
dropped. We now supporting the following authentication schemes:

```
git+https://<user>:<token>/...
git+https://<token>/...
```

On Windows, the username is required. We can consider adding a
special-case for this in the future, but this just matches libgit2's
behavior.

I tested with fine-grained tokens, OAuth tokens, and "classic" tokens.
There's test coverage for fine-grained tokens in CI where we use a real
private repository and PAT. Yes, the PAT is committed to make this test
usable by anyone. It has read-only permissions to the single repository,
expires Feb 1 2025, and is in an isolated organization and GitHub
account.

Does not yet address SSH authentication.

Related:
- https://github.com/astral-sh/uv/issues/1514
- https://github.com/astral-sh/uv/issues/1452
2024-02-21 00:12:56 +00:00
Zanie Blue
2586f655bb
Rename to uv (#1302)
First, replace all usages in files in-place. I used my editor for this.
If someone wants to add a one-liner that'd be fun.

Then, update directory and file names:

```
# Run twice for nested directories
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
find . -type d -print0 | xargs -0 rename s/puffin/uv/g

# Update files
find . -type f -print0 | xargs -0 rename s/puffin/uv/g
```

Then add all the files again

```
# Add all the files again
git add crates
git add python/uv

# This one needs a force-add
git add -f crates/uv-trampoline
```
2024-02-15 11:19:46 -06:00