Commit graph

1622 commits

Author SHA1 Message Date
Zanie Blue
44e39bdca3
Replace Python bootstrapping script with Rust implementation (#2842)
See https://github.com/astral-sh/uv/issues/2617

Note this also includes:
- #2918 
- #2931 (pending)

A first step towards Python toolchain management in Rust.

First, we add a new crate to manage Python download metadata:

- Adds a new `uv-toolchain` crate
- Adds Rust structs for Python version download metadata
- Duplicates the script which downloads Python version metadata
- Adds a script to generate Rust code from the JSON metadata
- Adds a utility to download and extract the Python version

I explored some alternatives like a build script using things like
`serde` and `uneval` to automatically construct the code from our
structs but deemed it too heavy. Unlike Rye, I don't generate the Rust
directly from the web requests and have an intermediate JSON layer to
speed up iteration on the Rust types.

Next, we add a `uv-dev` command `fetch-python` to download Python
versions per the bootstrapping script.

- Downloads a requested version or reads from `.python-versions`
- Extracts to `UV_BOOTSTRAP_DIR`
- Links executables for path extension

This command is not really intended to be user facing, but it's a good
PoC for the `uv-toolchain` API. Hash checking (via the sha256) isn't
implemented yet; we can do that in a follow-up.

Finally, we remove the `scripts/bootstrap` directory, update CI to use
the new command, and update the CONTRIBUTING docs.

2024-04-10 11:22:41 -05:00
Chan Kang
7cd98d2499
Implement --emit-index-annotation to annotate source index for each package (#2926)
## Summary
resolves https://github.com/astral-sh/uv/issues/2852

## Test Plan
add a couple of tests:
- one covering the simplest case with all packages pulled from a single
index.
- another where packages are pulled from two distinct indices.

tested manually as well:
```
$ (echo 'pandas'; echo 'torch') | UV_EXTRA_INDEX_URL='https://download.pytorch.org/whl/cpu' cargo run pip compile - --include-indices 
    Finished dev [unoptimized + debuginfo] target(s) in 0.60s
     Running `target/debug/uv pip compile - --include-indices`
Resolved 15 packages in 686ms
# This file was autogenerated by uv via the following command:
#    uv pip compile - --include-indices
filelock==3.9.0
    # via torch
    # from https://download.pytorch.org/whl/cpu
fsspec==2023.4.0
    # via torch
    # from https://download.pytorch.org/whl/cpu
jinja2==3.1.2
    # via torch
    # from https://download.pytorch.org/whl/cpu
markupsafe==2.1.3
    # via jinja2
    # from https://download.pytorch.org/whl/cpu
mpmath==1.3.0
    # via sympy
    # from https://download.pytorch.org/whl/cpu
networkx==3.2.1
    # via torch
    # from https://download.pytorch.org/whl/cpu
numpy==1.26.3
    # via pandas
    # from https://download.pytorch.org/whl/cpu
pandas==2.2.1
    # from https://pypi.org/simple
python-dateutil==2.9.0.post0
    # via pandas
    # from https://pypi.org/simple
pytz==2024.1
    # via pandas
    # from https://pypi.org/simple
six==1.16.0
    # via python-dateutil
    # from https://pypi.org/simple
sympy==1.12
    # via torch
    # from https://download.pytorch.org/whl/cpu
torch==2.2.2
    # from https://download.pytorch.org/whl/cpu
typing-extensions==4.8.0
    # via torch
    # from https://download.pytorch.org/whl/cpu
tzdata==2024.1
    # via pandas
    # from https://pypi.org/simple
```
2024-04-10 16:05:58 +00:00
Charlie Marsh
a01143980a
Upgrade reqwest to v0.12.3 (#2817)
## Summary

Closes #2814.
2024-04-10 11:20:44 -04:00
Charlie Marsh
d8323551f8
Update hashes without --upgrade if not present (#2966)
## Summary

If the user runs with `--generate-hashes`, and the lockfile doesn't
contain _any_ hashes for a package (despite being pinned), we should add
new hashes. This mirrors running `uv pip compile --generate-hashes` for
the first time with an existing lockfile.
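
A minimal sketch of that decision, under assumptions: `PinnedPackage` and `needs_new_hashes` are hypothetical names, not uv's types, and the `upgrade` branch assumes that an upgraded pin gets its hashes recomputed anyway.

```rust
/// A pinned package as read from an existing lockfile (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct PinnedPackage {
    pub name: String,
    pub version: String,
    /// Hashes already recorded in the lockfile, e.g. "sha256:...".
    pub hashes: Vec<String>,
}

/// Returns `true` if fresh hashes should be computed for this pin.
pub fn needs_new_hashes(pin: &PinnedPackage, generate_hashes: bool, upgrade: bool) -> bool {
    if !generate_hashes {
        // Hashes were not requested at all.
        return false;
    }
    // Assumption: with `--upgrade` the pin is re-resolved, so hashes are redone.
    // Without it, only a pin that has no hashes at all gets new ones.
    upgrade || pin.hashes.is_empty()
}
```

A pin that already carries at least one hash is left alone unless `upgrade` is set.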

Closes #2962.
2024-04-10 14:56:34 +00:00
Zanie Blue
bbe46c074c
Upgrade packse (#2963)
Should improve test performance with
https://github.com/astral-sh/packse/pull/169 thanks @konstin !
2024-04-10 09:30:57 -05:00
Charlie Marsh
38ab39c439
Strip query string when parsing filename from HTML index (#2961)
## Summary

Closes https://github.com/astral-sh/uv/issues/2958.
2024-04-10 09:25:29 -05:00
Zanie Blue
c345a79b9b
Add python-patch feature to isolate tests that require Python patch versions to match our suite (#2940)
Closes https://github.com/astral-sh/uv/issues/2165
Follows https://github.com/astral-sh/uv/pull/2930
2024-04-10 09:01:25 -05:00
konsti
273de456ea
Remove rust 1.75 workaround (#2959) 2024-04-10 13:26:18 +00:00
konsti
15f0be8f04
Allow profiling tests with tracing instrumentation (#2957)
To get more insights into test performance, allow instrumenting tests
with tracing-durations-export.

Usage:

```shell
# A single test
TRACING_DURATIONS_TEST_ROOT=$(pwd)/target/test-traces cargo test --features tracing-durations-export --test pip_install_scenarios no_binary -- --exact
# All tests
TRACING_DURATIONS_TEST_ROOT=$(pwd)/target/test-traces cargo nextest run --features tracing-durations-export
```

Then we can e.g. look at
`target/test-traces/pip_install_scenarios::no_binary.svg` and see the
builds it performs:


![image](40b4e094-debc-4b22-8aa3-9471998674af)
2024-04-10 10:15:27 +00:00
Charlie Marsh
c4472ebbb9
Enforce and backtrack on invalid versions in source metadata (#2954)
## Summary

If we build a source distribution from the registry, and the version
doesn't match that of the filename, we should error, just as we do for
mismatched package names. However, we should also backtrack here, which
we didn't previously.
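
Roughly, the check and the backtracking look like the sketch below. It is illustrative only: `VersionMismatch`, `check_version`, and `select` are made-up names, versions are compared as plain strings rather than normalized PEP 440 versions, and "backtracking" is reduced to trying the next candidate in a list.

```rust
use std::fmt;

/// Error for a built source distribution whose metadata disagrees with its filename.
#[derive(Debug, PartialEq, Eq)]
pub struct VersionMismatch {
    pub metadata: String,
    pub given: String,
}

impl fmt::Display for VersionMismatch {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(
            f,
            "Package metadata version `{}` does not match given version `{}`",
            self.metadata, self.given
        )
    }
}

/// Compare the version from the filename (`given`) with the built metadata.
pub fn check_version(given: &str, metadata: &str) -> Result<(), VersionMismatch> {
    if given == metadata {
        Ok(())
    } else {
        Err(VersionMismatch {
            metadata: metadata.to_string(),
            given: given.to_string(),
        })
    }
}

/// Candidates are `(given, metadata)` pairs, most preferred first.
/// A mismatch rejects the candidate and moves on to the next one.
pub fn select<'a>(candidates: &[(&'a str, &'a str)]) -> Option<&'a str> {
    for (given, metadata) in candidates {
        if check_version(given, metadata).is_ok() {
            return Some(*given);
        }
    }
    None
}
```

With the candidates `("0.21.post1", "0.21")` and `("0.21", "0.21")`, `select` skips the first and settles on `0.21`.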

Closes https://github.com/astral-sh/uv/issues/2953.

## Test Plan

Verified that `cargo run pip install docutils --verbose --no-cache
--reinstall` installs `docutils==0.21` instead of the invalid
`docutils==0.21.post1`.

In the logs, I see:

```
WARN Unable to extract metadata for docutils: Package metadata version `0.21` does not match given version `0.21.post1`
```
2024-04-10 05:13:33 +00:00
Aria Beingessner
997f3c9161
chore: update axoupdater to 0.4.0 and add a test (#2938)
## Summary

This updates to the version of axoupdater used in cargo-dist 0.13.0's
own selfupdate command, with all relevant fixes for platforms. It also
tentatively introduces a mildly dangerous self-runtest that runs `uv
self update` and checks that the binary is installed and executable.

I *believe* some adjustments need to be made to your CI to have this new
test run, because it requires the `self-update` feature to be enabled,
and I didn't want to just start messing with how you do feature coverage
in your CI. **As a result I haven't yet had a chance to actually fully
run this in CI**, though I've locally tested it on windows (with the
guard disabled).


## Test Plan

Most of the machinery here is provided by axoupdater itself (cargo-dist
also includes a variant of these tests in its codebase). This initial
implementation has a couple major limitations:

* This is For Reals modifying the system that runs the test (so it's off
unless it detects it's running in CI, and if you want variations on this
test they'll need to be [run in
serial](5e7826f7b0/cargo-dist/tests/cli-tests.rs (L235))).
Since many of the testing issues were surrounding precise details of
Actual Deployed Executions, this seemed worth the tradeoff.
* The actual installer *script* it's ultimately invoking is the one you
last published, and *not* the one that cargo-dist will make when you
next publish.

We're already working on implementing some logic for "get cargo-dist to
generate a fresh installer script too", which is in fact the basis of a
huge amount of cargo-dist's own testsuite. Now that we're dogfooding
this stuff, it should be quite hard for this stuff to break without
cargo-dist's own codebase noticing it first.


2024-04-09 23:41:16 -04:00
Charlie Marsh
7ae06b3b46
Surface invalid metadata as hints in error reports (#2850)
## Summary

Closes #2847.
2024-04-09 23:12:10 -04:00
Zanie Blue
ee9059978a
Add ecosystem test for Prefect (#2942)
Reproduced https://github.com/astral-sh/uv/issues/2941 and confirmed
fix.

We probably ought to have some ecosystem test coverage — this seems like
a good starting point we can extend to other projects in the future.
2024-04-09 21:29:39 -05:00
Charlie Marsh
7bcca28b12
Bump version to v0.1.31 (#2944) 2024-04-09 19:20:43 +00:00
Charlie Marsh
f9c0632953
Ignore direct URL distributions in prefetcher (#2943)
## Summary

The prefetcher tallies the number of times we tried a given package, and
then once we hit a threshold, grabs the version map, assuming it's
already been fetched. For direct URL distributions, though, we don't
have a version map! And there's no need to prefetch.
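
As a sketch of the tallying described above (hypothetical `Prefetcher` and `Source` types, not the actual resolver code; firing at `>=` the threshold is an assumption):

```rust
use std::collections::HashMap;

/// Where a package comes from (simplified, hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Registry,
    DirectUrl,
}

/// Counts how often each package was tried and signals when to prefetch.
pub struct Prefetcher {
    tried: HashMap<String, usize>,
    threshold: usize,
}

impl Prefetcher {
    pub fn new(threshold: usize) -> Self {
        Prefetcher {
            tried: HashMap::new(),
            threshold,
        }
    }

    /// Record one attempt; returns `true` once prefetching should kick in.
    pub fn record(&mut self, name: &str, source: Source) -> bool {
        // Direct URL distributions have no version map, so never prefetch them.
        if source == Source::DirectUrl {
            return false;
        }
        let count = self.tried.entry(name.to_string()).or_insert(0);
        *count += 1;
        *count >= self.threshold
    }
}
```

The early return means a direct URL package is never tallied, so the version map lookup is never reached for it.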

Closes https://github.com/astral-sh/uv/issues/2941.
2024-04-09 14:09:41 -05:00
Charlie Marsh
83e2297633
Store common fields on BuiltWheelIndex struct (#2939)
## Summary

This mirrors the structure of the `RegistryWheelIndex`. It will be
useful once these indexes check hashes too.
2024-04-09 13:30:02 -04:00
Charlie Marsh
13ae5ac8dc
Replace PyPI-internal Hashes representation with flat vector (#2925)
## Summary

Right now, we have a `Hashes` representation that looks like:

```rust
/// A dictionary mapping a hash name to a hex encoded digest of the file.
///
/// PEP 691 says multiple hashes can be included and the interpretation is left to the client.
#[derive(Debug, Clone, Eq, PartialEq, Default, Deserialize)]
pub struct Hashes {
    pub md5: Option<Box<str>>,
    pub sha256: Option<Box<str>>,
    pub sha384: Option<Box<str>>,
    pub sha512: Option<Box<str>>,
}
```

It stems from the PyPI API, which returns a dictionary of hashes.

We tend to pass these around as a `Vec<Hashes>`. But it's a
bit strange because each entry in that vector could contain multiple
hashes. And it makes it difficult to ask questions like "Is
`sha256:ab21378ca980a8` in the set of hashes"?

This PR instead treats `Hashes` as the PyPI-internal type, and uses a
new `Vec<HashDigest>` everywhere in our own APIs.
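
To illustrate the flattening, here is a sketch. `Hashes` mirrors the struct above; the fields of `HashDigest`, the `HashAlgorithm` enum, and the `flatten` and `contains` helpers are assumptions for illustration, not necessarily what the PR defines.

```rust
/// PyPI-style dictionary of hashes, as in the struct above.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct Hashes {
    pub md5: Option<Box<str>>,
    pub sha256: Option<Box<str>>,
    pub sha384: Option<Box<str>>,
    pub sha512: Option<Box<str>>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HashAlgorithm {
    Md5,
    Sha256,
    Sha384,
    Sha512,
}

/// One algorithm paired with one digest (assumed shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HashDigest {
    pub algorithm: HashAlgorithm,
    pub digest: Box<str>,
}

/// Turn the dictionary into a flat list, dropping absent entries.
pub fn flatten(hashes: Hashes) -> Vec<HashDigest> {
    vec![
        (HashAlgorithm::Md5, hashes.md5),
        (HashAlgorithm::Sha256, hashes.sha256),
        (HashAlgorithm::Sha384, hashes.sha384),
        (HashAlgorithm::Sha512, hashes.sha512),
    ]
    .into_iter()
    .filter_map(|(algorithm, digest)| digest.map(|digest| HashDigest { algorithm, digest }))
    .collect()
}

/// "Is `sha256:...` in the set of hashes?" becomes a simple scan.
pub fn contains(digests: &[HashDigest], algorithm: HashAlgorithm, digest: &str) -> bool {
    digests
        .iter()
        .any(|d| d.algorithm == algorithm && &*d.digest == digest)
}
```

With a flat vector, a membership question is a single `any` over the entries instead of a per-field match on every `Hashes` value.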
2024-04-09 16:56:16 +00:00
Zanie Blue
1512e07a2e
Split configuration options out of uv-types (#2924)
Needed to prevent circular dependencies in my toolchain work (#2931). I
think this is probably a reasonable change as we move towards persistent
configuration too?

Unfortunately `BuildIsolation` needs to be in `uv-types` to avoid
circular dependencies still. We might be able to resolve that in the
future.
2024-04-09 11:35:53 -05:00
Charlie Marsh
90735660cb
Upgrade cargo-dist (#2936) 2024-04-09 16:19:22 +00:00
Charlie Marsh
a4f5a7d233
Bump version to v0.1.30 (#2934) 2024-04-09 12:06:11 -04:00
Zanie Blue
1cdadbdec8
Add filtering of patch Python versions unless explicitly requested (#2930)
Elides Python patch versions from the test suite unless the test
specifically requests a patch version.

This reduces some toil when not using our bootstrapped Python versions.

Partially addresses https://github.com/astral-sh/uv/issues/2165 though
we'll need changes to the scenario tests to really support their case.
2024-04-09 10:04:28 -05:00
Zanie Blue
d7ff8d93c0
Skip scenario tests on Windows (#2932)
These tests are about resolver correctness, which should not be platform
dependent and Windows CI is horribly slow.
2024-04-09 09:57:30 -05:00
Charlie Marsh
07e3694c3c
Separate local archive vs. local source tree paths in source database (#2922)
## Summary

When you specify a source distribution via a path, it can either be a
path to an archive (like a `.tar.gz` file), or a source tree (a
directory). Right now, we handle both paths through the same methods in
the source database. This PR splits them up into separate handlers.

This will make hash generation a little easier, since we need to
generate hashes for archives, but _can't_ generate hashes for source
trees.

It also means that we can now store the unzipped source distribution in
the cache (in the case of archives), and avoid unzipping the source
distribution needlessly on every invocation; and, overall, lets us
enforce clearer expectations between the two routes (e.g., what errors
are possible vs. not), at the cost of duplicating some code.
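
The split can be pictured with a small sketch. `LocalSource`, `from_path`, and `can_hash` are hypothetical names, and classifying by `is_dir` alone is a simplification of whatever the source database actually checks.

```rust
use std::path::{Path, PathBuf};

/// A source distribution given by path: either an archive or a directory.
#[derive(Debug, PartialEq, Eq)]
pub enum LocalSource {
    /// e.g. a `.tar.gz` file; can be hashed and cached unzipped.
    Archive(PathBuf),
    /// A source tree (directory); there is no single file to hash.
    Tree(PathBuf),
}

impl LocalSource {
    /// Classify a path: directories are source trees, anything else is an archive.
    pub fn from_path(path: &Path) -> Self {
        if path.is_dir() {
            LocalSource::Tree(path.to_path_buf())
        } else {
            LocalSource::Archive(path.to_path_buf())
        }
    }

    /// Only archives can have hashes generated for them.
    pub fn can_hash(&self) -> bool {
        matches!(self, LocalSource::Archive(_))
    }
}
```

Each variant can then go through its own handler, with its own error cases, instead of sharing one code path.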

Closes #2760 (incidentally -- not exactly the motivation for the change,
but it did accomplish it).
2024-04-09 01:12:33 +00:00
Charlie Marsh
06e96a8f58
DRY up source distribution fetching between wheel and metadata routes (#2921)
These will get more involved with hash-checking, so easiest to extract
them now.

No functional changes.
2024-04-09 00:14:42 +00:00
Charlie Marsh
4f14e2a764
Rebrand Manifest as Revision in wheel database (#2920)
## Summary

I think this is a much clearer name for this concept: the set of
"versions" of a given wheel or source distribution. We also use
"Manifest" elsewhere to refer to the set of requirements, constraints,
etc., so this was overloaded.
2024-04-08 20:00:57 -04:00
Charlie Marsh
1ab471d167
Reduce visibility of some methods in source database (#2919) 2024-04-08 23:49:23 +00:00
Zanie Blue
31860565f6
Disable CentOS system check (#2916)
This is broken (see https://github.com/astral-sh/uv/issues/2915) and not
a priority since we have Amazon Linux coverage
2024-04-08 21:33:31 +00:00
Zanie Blue
f42013214a
Restore lockfile (#2914)
Accidentally reverted the lockfile in
538c88130e

Closes #2912 
Closes #2910 
Closes #2913
2024-04-08 21:25:33 +00:00
Zanie Blue
538c88130e
Group pyo3 dependency updates (#2889)
Seems needed for https://github.com/astral-sh/uv/pull/2879
2024-04-08 16:06:55 -05:00
Charlie Marsh
cc3c5700e1
Use scheme parsing to determine absolute vs. relative URLs (#2904)
## Summary

We have a heuristic in `File` that attempts to detect whether a URL is
absolute or relative. However, `contains("://")` is prone to false
positives. In the linked issues, the URLs look like:

```
/packages/5a/d8/4d75d1e4287ad9d051aab793c68f902c9c55c4397636b5ee540ebd15aedf/pytz-2005k.tar.bz2?hash=597b596dc1c2c130cd0a57a043459c3bd6477c640c07ac34ca3ce8eed7e6f30c&remote=4d75d1e428/pytz-2005k.tar.bz2 (sha256)=597b596dc1c2c130cd0a57a043459c3bd6477c640c07ac34ca3ce8eed7e6f30c
```

Which is relative, but includes `://`.

Instead, we should determine whether the URL has a _scheme_ which
matches the `Url` crate internally.
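
For illustration, a std-only sketch of a scheme check. It approximates the RFC 3986 scheme grammar (a letter, then letters, digits, `+`, `-`, or `.`, up to the first `:`); it is not the `url` crate's implementation, and `has_scheme` is a made-up name.

```rust
/// Returns `true` if `url` starts with a syntactically valid scheme followed by `:`.
pub fn has_scheme(url: &str) -> bool {
    // Everything before the first `:` is the scheme candidate.
    let scheme = match url.find(':') {
        Some(index) => &url[..index],
        None => return false,
    };
    let mut chars = scheme.chars();
    // A scheme must begin with an ASCII letter.
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() => {}
        _ => return false,
    }
    // The remaining characters are limited to letters, digits, `+`, `-`, and `.`.
    chars.all(|c| c.is_ascii_alphanumeric() || c == '+' || c == '-' || c == '.')
}
```

A relative URL such as `/packages/foo.tar.gz?remote=https://example.com/foo.tar.gz` (a made-up example) contains `://` but fails this check, because the text before the first `:` starts with `/`.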

Closes https://github.com/astral-sh/uv/issues/2899.
2024-04-08 17:04:27 -04:00
Zanie Blue
bdeab55193
Add extract support for zstd (#2861)
We need this to extract toolchain downloads
2024-04-08 15:34:08 -05:00
Charlie Marsh
c46772eec5
Add a layer of indirection to the local path-based wheel cache (#2909)
## Summary

Right now, the path-based wheel cache just looks at the symlink to the
archives directory, checks the timestamp on it, and continues with that
symlink as long as the timestamp is up-to-date.

The HTTP-based wheel cache, meanwhile, uses an intermediary `.http` file, which
includes the HTTP caching information. The `.http` file's payload is
just a path pointing to an entry in the archives directory.

This PR modifies the path-based codepaths to use a similar cache file,
which stores a timestamp along with a path to the archives directory.
The main advantage here is that we can add other data to this cache file
(namely, hashes in the future).
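
A rough sketch of such a pointer, with assumed names (`CachePointer`, `resolve`) and an assumed freshness rule (the entry is valid while the wheel has not been modified since the recorded timestamp):

```rust
use std::path::PathBuf;
use std::time::{Duration, SystemTime};

/// Payload of the cache file: a timestamp plus a path into the archives directory.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CachePointer {
    /// Modification time of the wheel when the entry was written.
    pub timestamp: SystemTime,
    /// Entry in the archives directory holding the unzipped wheel.
    pub archive: PathBuf,
}

impl CachePointer {
    /// Return the archive path if the entry is still up-to-date for a wheel
    /// last modified at `modified`, or `None` if it must be rebuilt.
    pub fn resolve(&self, modified: SystemTime) -> Option<&PathBuf> {
        if self.timestamp >= modified {
            Some(&self.archive)
        } else {
            None
        }
    }
}

/// Small helper for building timestamps in examples.
pub fn at(secs: u64) -> SystemTime {
    SystemTime::UNIX_EPOCH + Duration::from_secs(secs)
}
```

Because the pointer is a struct rather than a bare symlink, adding a field later (for example, hashes) does not change how lookups work.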

## Test Plan

Beyond existing tests, I also verified that this doesn't require a
version bump:

```
git checkout main 
cargo run pip install ~/Downloads/zeal-0.0.1-py3-none-any.whl --cache-dir baz --reinstall
git checkout charlie/manifest
cargo run pip install ~/Downloads/zeal-0.0.1-py3-none-any.whl --cache-dir baz --reinstall
cargo run pip install ~/Downloads/zeal-0.0.1-py3-none-any.whl --cache-dir baz --reinstall --refresh
```
2024-04-08 19:32:59 +00:00
Charlie Marsh
134810c547
Respect cached local --find-links in install plan (#2907)
## Summary

I think this is kind of just an oversight. If a wheel is available via
`--find-links`, and the index is "local", we never find it in the cache.

## Test Plan

`cargo test`
2024-04-08 18:58:33 +00:00
Charlie Marsh
31a67f539f
Remove unused local wheel types (#2906)
## Summary

No behavior changes. Just removing unused code.
2024-04-08 18:15:20 +00:00
Charlie Marsh
1daa35176f
Always return unzipped wheels from the distribution database (#2905)
## Summary

In all cases, we unzip these immediately after returning. By moving the
unzipping into the database, we can remove a bunch of code (coming in a
separate PR), and pave the way for hash-checking, since hash generation
will _also_ happen in the database, and splitting the caching layers
across the database and the unzipper creates complications.

Closes #2863.
2024-04-08 14:07:17 -04:00
Sławomir Ehlert
f1630a70f5
Suppress MultipleHandlers from Ctrl-C in confirm (#2903)
## Summary

Fixes #2900

## Test Plan

Tried reproducing the steps described in #2900,
but with `cargo run -- pip ...` and it didn't crash 😄.
2024-04-08 17:18:53 +00:00
Charlie Marsh
10dfd43af9
DRY up HTTP request builder in source database (#2902) 2024-04-08 14:45:26 +00:00
Charlie Marsh
f11a5e2208
DRY up local wheel path in distribution database (#2901) 2024-04-08 10:40:17 -04:00
konsti
fb4ba2bbc2
Speed up cold cache urllib3/boto3/botocore with batched prefetching (#2452)
With pubgrub being fast for complex ranges, we can now compute the next
n candidates without taking a performance hit. This speeds up cold cache
`urllib3<1.25.4` `boto3` from maybe 40s - 50s to ~2s. See docstrings for
details on the heuristics.

**Before**


![image](b7b06950-e45b-4c49-b65e-ae19fe9888cc)

**After**


![image](1c749248-850e-49c1-9d57-a7d78f87b3aa)

---

Prefetching needs two parts: first we look for compatible versions,
then we fall back to the flat list of next versions. After we've selected a
boto3 version, there is only one compatible botocore version remaining,
so we won't find other compatible candidates for prefetching. We see
this as a pattern where we only prefetch boto3 (stacked bars), but not
botocore (sequential requests between the stacked bars).


![image](e5186800-23ac-4ed1-99b9-4d1046fbd03a)

The risk is that we're completely wrong with the guess and cause a lot
of useless network requests. I think this is acceptable since this
mechanism only triggers when we're already on the bad path and we should
simply have fetched all versions after some seconds (assuming a fast
index like pypi).
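
One loose reading of the two-part strategy, as a sketch only: the real heuristics are in the docstrings, versions are plain integers here, and `prefetch_batch` with its parameters is a hypothetical name.

```rust
/// Choose up to `batch` versions to prefetch after `tried` was attempted.
///
/// `versions` is sorted newest first. Compatible versions are preferred;
/// if none remain, fall back to simply the next versions in order.
pub fn prefetch_batch<F>(versions: &[u32], tried: u32, batch: usize, is_compatible: F) -> Vec<u32>
where
    F: Fn(u32) -> bool,
{
    // Only versions older than the one we just tried are still candidates.
    let remaining: Vec<u32> = versions.iter().cloned().filter(|v| *v < tried).collect();

    // Part 1: the next `batch` versions that satisfy the current constraints.
    let compatible: Vec<u32> = remaining
        .iter()
        .cloned()
        .filter(|v| is_compatible(*v))
        .take(batch)
        .collect();
    if !compatible.is_empty() {
        return compatible;
    }

    // Part 2: nothing compatible is left, so guess the flat next versions.
    remaining.into_iter().take(batch).collect()
}
```

The fallback is the guess that can be wrong: it may fetch metadata for versions that are never selected.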

---

It would be even better if the pubgrub state was copy-on-write so we
could simulate more progress than we actually have; currently we're
guessing what the next version is which could be completely wrong, but I
think this is still a valuable heuristic.

Fixes #170.
2024-04-08 14:28:56 +00:00
renovate[bot]
47333c985b
Update Rust crate tokio to v1.37.0 (#2886) 2024-04-08 09:35:00 -04:00
renovate[bot]
e3ebd4de10
Update debian Docker tag to v12 (#2896) 2024-04-08 09:34:48 -04:00
Zanie Blue
e5ea1785ff
Renovate: Group updates to development dependencies (#2888)
I don't think we need to audit these individually since they're not
user-facing.
2024-04-08 07:21:50 +01:00
Zanie Blue
b181907ad2
Fix linehaul tests (#2891)
Cleans up the assertions a bit. I looked into snapshot tests per #2564
but it didn't seem worth it for cross-platform tests.

Closes #2564 
Closes https://github.com/astral-sh/uv/pull/2878
2024-04-07 23:42:19 -05:00
renovate[bot]
356a26646c
Update fedora Docker tag to v41 (#2898) 2024-04-07 23:41:49 -05:00
renovate[bot]
aa7760534f
Update dependency ubuntu to v22 (#2897) 2024-04-07 23:41:22 -05:00
renovate[bot]
31813f90c7
Update pre-commit dependencies (#2893) 2024-04-08 04:35:44 +00:00
renovate[bot]
61e06bb2c3
Update Rust crate rayon to v1.10.0 (#2880) 2024-04-07 22:54:40 -05:00
renovate[bot]
a866cb2f32
Update Rust crate insta to v1.38.0 (#2877) 2024-04-07 22:54:16 -05:00
renovate[bot]
f0c83a4ded
Update Rust crate base64 to 0.22.0 (#2874) 2024-04-08 03:02:35 +00:00
Charlie Marsh
52577892eb
Expand some documentation around identifier traits (#2876)
## Summary

I already added more documentation since this issue was created, but
this doesn't hurt.

Closes https://github.com/astral-sh/uv/issues/496.
2024-04-08 02:50:24 +00:00