Commit graph

36 commits

Author SHA1 Message Date
Charlie Marsh
ca92b55605
Make .egg-info filename parsing spec compliant (#4533)
## Summary

It turns out that `.egg-info` files and directories can _both_ have up
to four segments in the filename:
https://setuptools.pypa.io/en/latest/deprecated/python_eggs.html#filename-embedded-metadata.
This PR upgrades the parsing and now uses the same parsing for files and
directories.

Closes https://github.com/astral-sh/uv/issues/4532.
2024-06-25 23:49:43 +00:00
Charlie Marsh
44592681a0
Represent build tag as u64 (#4253)
## Summary

The build tags in this case are like, e.g., `202206090410`. That's
larger than a `u32`, so we're rejecting the wheel. In theory build tags
could be even larger, but we already use `u64` for version segment so I
think it's fine to keep that constraint here.

I'm going to look into surfacing these errors separately.

Closes https://github.com/astral-sh/uv/issues/4252.

## Test Plan

`cargo run pip install monailabel`
2024-06-11 21:40:08 +00:00
Charlie Marsh
a9d9a6c13f
Incorporate build tag into wheel prioritization (#3781)
## Summary

It turns out that in the
[spec](https://packaging.python.org/en/latest/specifications/binary-distribution-format/#file-name-convention),
if a wheel filename includes a build tag, then we need to use it to
break ties. This PR implements that behavior. (Previously, we dropped
the build tag entirely.)

Closes #3779.

## Test Plan

Run: `cargo run pip install -i https://pypi.anaconda.org/intel/simple
mkl_fft==1.3.8 --python-platform linux --python-version 3.10`. This now
resolves without error. Previously, we selected build tag 63 of
`mkl_fft==1.3.8`, which led to an incompatibility with NumPy. Now, we
select build tag 70.
2024-05-23 21:12:53 +00:00
Andrew Gallant
1089abda3f
require serde and rkyv everywhere; remove optional serde and rkyv features (#3345)
In *some* places in our crates, `serde` (and `rkyv`) are optional
dependencies. I believe this was done out of reasons of "good sense,"
that is, it follows a Rust ecosystem pattern where serde integration
tends to be an opt-in crate feature. (And similarly for `rkyv`.)

However, ultimately, `uv` itself requires `serde` and `rkyv` to
function. Since our crates are strictly internal, there are limited
consumers for our crates without `serde` (and `rkyv`) enabled. I think
one possibility is that optional `serde` (and `rkyv`) integration means
that someone can do this:

    cargo test -p pep440_rs

And this will run tests _without_ `serde` or `rkyv` enabled. That in
turn could lead to faster iteration time by reducing compile times. But,
I'm not sure this is worth supporting. The iterative compilation times
of
individual crates are probably fast enough in debug mode, even with
`serde` and `rkyv` enabled. Namely, `serde` and `rkyv` themselves
shouldn't need to be re-compiled in most cases. On `main`:

```
from-scratch: `cargo test -p pep440_rs --lib` 0.685
incremental: `cargo test -p pep440_rs --lib` 0.278s
from-scratch: `cargo test -p pep440_rs --features serde,rkyv --lib` 3.948s
incremental: `cargo test -p pep440_rs --features serde,rkyv --lib` 0.321s
```

So while a from-scratch build does take significantly longer, an
incremental build is about the same.

The benefit of doing this change is two-fold:

1. It brings out crates into alignment with "reality." In particular,
   some crates were _implicitly_ relying on `serde` being enabled
   without explicitly declaring it. This technically means that our
   `Cargo.toml`s were wrong in some cases, but it is hard to observe it
   because of feature unification in a Cargo workspace.
2. We no longer need to deal with the cognitive burden of writing
   `#[cfg_attr(feature = "serde", ...)]` everywhere.
2024-05-03 10:21:03 -04:00
konsti
538a85b827
Add missing optional rkyv feature bound (#3336)
`PackageName` needs to derive the rkyv types.
2024-05-02 10:01:10 +00:00
Sergey Kolosov
d2551bb2bd
Add support for .tar.bz2 source distributions (#3069)
## Summary

Source distributions in the .tar.bz2 format are still relatively common
within the existing code-bases, namely, the most common examples are the
Twisted source distributions up to the version 20.3.0. As quite so often
the ability to upgrade Twisted to a more recent version is not available
for a given project, we add the support for .tar.bz2 here to still allow
`uv` to be a drop-in replacement for `pip` in these projects.

## Test Plan

The feature was tested both by adding the corresponding test coverage,
and by directly installing a package of interest under a Python version
that doesn't have the corresponding wheel:

```sh
cargo run venv -p python3.8
cargo run pip install Twisted==20.3.0 --no-cache
```

The `--no-cache` argument in the example above serves the purpose of
cleaning the cached information regarding the unsatisfiability of the
requirements, as it may have been cached during some previous attempt to
install this package by `uv` version that didn't implement this feature
yet.
2024-04-16 18:34:55 +00:00
konsti
48bd02b8a8
Update miette v7, pubgrub and small Cargo.toml cleanup (#2610)
I was going through the output of `cargo tree --duplicate -p uv`, not
much success except these small cleanups.
2024-03-22 10:42:48 +00:00
konsti
70e0967dbd
Avoid repeating paths of workspace packages (#2573)
Scott schafer got me the idea: We can avoid repeating the path for
workspaces dependencies everywhere if we declare them in the virtual
package once and treat them as workspace dependencies from there on.
2024-03-20 16:16:02 -04:00
dependabot[bot]
e66afa8767
Bump insta from 1.35.1 to 1.36.1 (#2180) 2024-03-04 23:01:49 +00:00
danieleades
8d721830db
Clippy pedantic (#1963)
Address a few pedantic lints

lints are separated into separate commits so they can be reviewed
individually.

I've not added enforcement for any of these lints, but that could be
added if desirable.
2024-02-25 14:04:05 -05:00
dependabot[bot]
019e2fd1b5
Bump insta from 1.34.0 to 1.35.1 (#1942) 2024-02-23 21:00:35 +00:00
Zanie Blue
2586f655bb
Rename to uv (#1302)
First, replace all usages in files in-place. I used my editor for this.
If someone wants to add a one-liner that'd be fun.

Then, update directory and file names:

```
# Run twice for nested directories
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
find . -type d -print0 | xargs -0 rename s/puffin/uv/g

# Update files
find . -type f -print0 | xargs -0 rename s/puffin/uv/g
```

Then add all the files again

```
# Add all the files again
git add crates
git add python/uv

# This one needs a force-add
git add -f crates/uv-trampoline
```
2024-02-15 11:19:46 -06:00
Zanie Blue
e9e3e573a2
Report incompatible distributions to users (#1293)
Instead of dropping versions without a compatible distribution, we track
them as incompatibilities in the solver. This implementation follows
patterns established in https://github.com/astral-sh/puffin/pull/1290.

This required some significant refactoring of how we track incompatible
distributions. Notably:

- `Option<TagPriority>` is now `WheelCompatibility` which allows us to
track the reason a wheel is incompatible instead of just `None`.
- `Candidate` now has a `CandidateDist` with `Compatible` and
`Incompatibile` variants instead of just `ResolvableDist`; candidates
are not strictly compatible anymore
- `ResolvableDist` was renamed to `CompatibleDist`
- `IncompatibleWheel` was given an ordering implementation so we can
track the "most compatible" (but still incompatible) wheel. This allows
us to collapse the reason a version cannot be used to a single
incompatibility.
- The filtering in the `VersionMap` is retained, we still only store one
incompatible wheel per version. This is sufficient for error reporting.
- A `TagCompatibility` type was added for tracking which part of a wheel
tag is incompatible
- `Candidate::validate_python` moved to
`PythonRequirement::validate_dist`

I am doing more refactoring in #1298 — I think a couple passes will be
necessary to clarify the relationships of these types.

Includes improved error message snapshots for multiple incompatible
Python tag types from #1285 — we should add more scenarios for coverage
of behavior when multiple tags with different levels are present.
2024-02-15 10:48:15 -06:00
Andrew Gallant
5219d37250
add initial rkyv support (#1135)
This PR adds initial support for [rkyv] to puffin. In particular,
the main aim here is to make puffin-client's `SimpleMetadata` type
possible to deserialize from a `&[u8]` without doing any copies. This
PR **stops short of actuallying doing that zero-copy deserialization**.
Instead, this PR is about adding the necessary trait impls to a variety
of types, along with a smattering of small refactorings to make rkyv
possible to use.

For those unfamiliar, rkyv works via the interplay of three traits:
`Archive`, `Serialize` and `Deserialize`. The usual flow of things is
this:

* Make a type `T` implement `Archive`, `Serialize` and `Deserialize`.
rkyv
helpfully provides `derive` macros to make this pretty painless in most
  cases.
* The process of implementing `Archive` for `T` *usually* creates an
entirely
new distinct type within the same namespace. One can refer to this type
without naming it explicitly via `Archived<T>` (where `Archived` is a
clever
  type alias defined by rkyv).
* Serialization happens from `T` to (conceptually) a `Vec<u8>`. The
serialization format is specifically designed to reflect the in-memory
layout
  of `Archived<T>`. Notably, *not* `T`. But `Archived<T>`.
* One can then get an `Archived<T>` with no copying (albeit, we will
likely
need to incur some cost for validation) from the previously created
`&[u8]`.
This is quite literally [implemented as a pointer cast][rkyv-ptr-cast].
* The problem with an `Archived<T>` is that it isn't your `T`. It's
something
  else. And while there is limited interoperability between a `T` and an
`Archived<T>`, the main issue is that the surrounding code generally
demands
a `T` and not an `Archived<T>`. **This is at the heart of the tension
for
  introducing zero-copy deserialization, and this is mostly an intrinsic
problem to the technique and not an rkyv-specific issue.** For this
reason,
  given an `Archived<T>`, one can get a `T` back via an explicit
deserialization step. This step is like any other kind of
deserialization,
although generally faster since no real "parsing" is required. But it
will
  allocate and create all necessary objects.

This PR largely proceeds by deriving the three aforementioned traits
for `SimpleMetadata`. And, of course, all of its type dependencies. But
we stop there for now.

The main issue with carrying this work forward so that rkyv is actually
used to deserialize a `SimpleMetadata` is figuring out how to deal
with `DataWithCachePolicy` inside of the cached client. Ideally, this
type would itself have rkyv support, but adding it is difficult. The
main difficulty lay in the fact that its `CachePolicy` type is opaque,
not easily constructable and is internally the tip of the iceberg of
a rat's nest of types found in more crates such as `http`. While one
"dumb"-but-annoying approach would be to fork both of those crates
and add rkyv trait impls to all necessary types, it is my belief that
this is the wrong approach. What we'd *like* to do is not just use
rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to
get an `Archived<DataWithCachePolicy>` and make actual decisions used
the archived type directly. Doing that will require some work to make
`Archived<DataWithCachePolicy>` directly useful.

My suspicion is that, after doing the above, we may want to mush
forward with a similar approach for `SimpleMetadata`. That is, we want
`Archived<SimpleMetadata>` to be as useful as possible. But right
now, the structure of the code demands an eager conversion (and thus
deserialization) into a `SimpleMetadata` and then into a `VersionMap`.
Getting rid of that eagerness is, I think, the next step after dealing
with `DataWithCachePolicy` to unlock bigger wins here.

There are many commits in this PR, but most are tiny. I still encourage
review to happen commit-by-commit.

[rkyv]: https://rkyv.org/
[rkyv-ptr-cast]:
https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68
2024-01-28 12:14:59 -05:00
konsti
e9b6b6fa36
Implement --find-links as flat indexes (directories in pip-compile) (#912)
Add directory `--find-links` support for local paths to pip-compile.

It seems that pip joins all sources and then picks the best package. We
explicitly give find links packages precedence if the same exists on an
index and locally by prefilling the `VersionMap`, otherwise they are
added as another index and the existing rules of precedence apply.

Internally, the feature is called _flat index_, which is more meaningful
than _find links_: We're not looking for links, we're picking up local
directories, and (TBD) support another index format that's just a flat
list of files instead of a nested index.

`RegistryBuiltDist` and `RegistrySourceDist` now use `WheelFilename` and
`SourceDistFilename` respectively. The `File` inside `RegistryBuiltDist`
and `RegistrySourceDist` gained the ability to represent both a url and
a path so that `--find-links` with a url and with a path works the same,
both being locked as `<package_name>@<version>` instead of
`<package_name> @ <url>`. (This is more of a detail, this PR in general
still work if we strip that and have directory find links represented as
`<package_name> @ file:///path/to/file.ext`)

`PrioritizedDistribution` and `FlatIndex` have been moved to locations
where we can use them in the upstack PR.

I added a `scripts/wheels` directory with stripped down wheels to use
for testing.

We're lacking tests for correct tag priority precedence with flat
indexes, i only confirmed this manually since it is not covered in the
pip-compile or pip-sync output.

Closes #876
2024-01-15 02:04:10 +00:00
konsti
4d8bfd7f61
Split source dist error type into error and kind (#872)
It's a better, less redundant error type. It will come in handy when
adding a second parse function.
2024-01-10 17:42:54 +00:00
Andrew Gallant
6c98ae9d77
pep440: rewrite the parser and make version comparisons cheaper (#789)
This PR builds on #780 by making both version parsing faster, and
perhaps more importantly, making version comparisons much faster.
Overall, these changes result in a considerable improvement for the
`boto3.in` workload. Here's the status quo:

```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 34.56s

real    34.579
user    34.004
sys     0.413
maxmem  2867 MB
faults  0
```

And now with this PR:

```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 9.20s

real    9.218
user    8.919
sys     0.165
maxmem  463 MB
faults  0
```

This particular workload gets stuck in pubgrub doing resolution, and
thus benefits mightily from a faster `Version::cmp` routine. With that
said, this change does also help a fair bit with "normal" runs:

```
$ hyperfine -w10 \
    "puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in" \
    "puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in"
Benchmark 1: puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
  Time (mean ± σ):     337.5 ms ±   3.9 ms    [User: 310.5 ms, System: 73.2 ms]
  Range (min … max):   333.6 ms … 343.4 ms    10 runs

Benchmark 2: puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
  Time (mean ± σ):     189.8 ms ±   3.0 ms    [User: 168.1 ms, System: 78.4 ms]
  Range (min … max):   185.0 ms … 196.2 ms    15 runs

Summary
  puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in ran
    1.78 ± 0.03 times faster than puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
```

There is perhaps some future work here (detailed in the commit
messages), but I suspect it would be more fruitful to explore ways of
making resolution itself and/or deserialization faster.

Fixes #373, Closes #396
2024-01-05 11:57:32 -05:00
Charlie Marsh
5bce699ee1
Add support for HTML indexes (#719)
## Summary

This PR adds support for HTML index responses (as with
`--index-url=https://download.pytorch.org/whl`).

Closes https://github.com/astral-sh/puffin/issues/412.
2023-12-24 16:04:00 +00:00
konsti
b84fbb86b2
Impl Version debug as display (#606)
Currently, `dbg!` is hard to read because versions are verbose, showing
all optional fields, and we have a lot of versions. Changing debug
formatting to displaying the version number (which can be losslessly
converted to the struct and back) makes this more readable.

See e.g.
https://gist.github.com/konstin/38c0f32b109dffa73b3aa0ab86b9662b

**Before**

```text
version: Version {
    epoch: 0,
    release: [
        1,
        2,
        3,
    ],
    pre: None,
    post: None,
    dev: None,
    local: None,
},
```

**After**

```text
version: "1.2.3",
```
2023-12-11 16:38:14 +01:00
Zanie Blue
ef7be9103c
Parse SimpleJson into categorized data in the client (#522)
Extends #517 with a suggestion from @konstin to parse the `SimpleJson`
into an intermediate type `SimpleMetadata(BTreeMap<Version,
VersionFiles>)` before converting to a `VersionMap`. This reduces the
number of times we need to parse the response. Additionally, we cache
the parsed response now instead of `SimpleJson`.

`VersionFiles` stores two vectors with
`WheelFilename`/`SourceDistFilename` and `File` tuples. These can be
iterated over together or separately. A new enum `DistFilename` was
added to capture the `SourceDistFilename` and `WheelFilename` variants
allowing iteration over both vectors.
2023-12-07 11:04:47 -06:00
Charlie Marsh
5370484307
Remove .whl extension for cached, unzipped wheels (#574)
## Summary

This PR uses the wheel stem (e.g., `foo-1.2.3-py3-none-any`) instead of
the wheel name (e.g., `foo-1.2.3-py3-none-any.whl`) when storing
unzipped wheels in the cache, which removes a class of confusing issues
around overwrites and directory-vs.-file collisions.

For now, we retain _both_ the zipped and unzipped wheels in the cache,
though we can easily change this by storing the zipped wheels in a
temporary directory.

Closes https://github.com/astral-sh/puffin/issues/573.

## Test Plan

Some examples from my local cache:

<img width="835" alt="Screen Shot 2023-12-05 at 4 09 55 PM"
src="784146aa-b080-416e-9767-40c843fe5d6a">
<img width="847" alt="Screen Shot 2023-12-05 at 4 12 14 PM"
src="4bc7f30f-bef3-47f1-b4e8-da9cabf87f28">
<img width="637" alt="Screen Shot 2023-12-05 at 4 09 50 PM"
src="25ca4944-4a06-4a08-ac85-c6f7d8b5c8ea">
2023-12-05 22:41:22 +00:00
Charlie Marsh
9d35128840
Use Clippy lint table over Cargo config (#490)
Closes https://github.com/astral-sh/puffin/issues/482.
2023-11-22 15:10:27 +00:00
konsti
45d032dd7d
Fix wheel filename serialization (#465)
We need an underscore in the wheel filename, not a dash
2023-11-20 11:21:22 +00:00
konsti
2fed14fdc6
Optional serde feature for distribution-filename (#461)
https://github.com/astral-sh/puffin/pull/459#discussion_r1398482972
2023-11-19 19:53:32 +00:00
konsti
255edf4445
Serde support for WheelFilename through str repr (#459)
I need this later, splitting out for PR size
2023-11-19 19:43:14 +00:00
Charlie Marsh
d3caf9ae86
Choose most-compatible wheel in resolver and installer (#422)
## Summary

This PR implements logic to sort wheels by priority, where priority is
defined as preferring more "specific" wheels over less "specific"
wheels. For example, in the case of Black, my machine now selects
`black-23.11.0-cp311-cp311-macosx_11_0_arm64.whl`, whereas sorting by
lowest priority instead gives me `black-23.11.0-py3-none-any.whl`.

As part of this change, I've also modified the resolver to fallback to
using incompatible wheels when determining package metadata, if no
compatible wheels are available.

The `VersionMap` was also moved out of `resolver.rs` and into its own
file with a wrapper type, for clarity.

Closes https://github.com/astral-sh/puffin/issues/380.
Closes https://github.com/astral-sh/puffin/issues/421.
2023-11-15 18:22:11 +00:00
Charlie Marsh
6a15950cb5
Rename Distribution to Dist in all structs and traits (#384)
We tend to avoid abbreviations, but this one is just so long and
absolutely ubiquitous.
2023-11-10 14:55:11 +00:00
konsti
5cef40d87a
Add proper caching for pypi metadata fetching kinds (#368)
I intend this to become the main form of caching for puffin: You can
make http requests, you tranform the data to what you really need, you
have control over the cache key, and the cache is always json (or
anything else much faster we want to replace it with as long as it's
serde!)
2023-11-10 11:03:40 +00:00
Andrew Gallant
33c0901a28
distribution-filename: speed up is_compatible (#367)
This PR tweaks the representation of `Tags` in order to offer a
faster implementation of `WheelFilename::is_compatible`. We now use a
nested map of tags that lets us avoid looping over every supported
platform tag. As the code comments suggest, that is the essential gain.
We still do not mind looping over the tags in each wheel name since they
tend to be quite small. And pushing our thumb on that side of things can
make things worse overall since it would likely slow down WheelFilename
construction itself.

For micro-benchmarks, we improve considerably for compatibility
checking:

    $ critcmp base test3
group base test3
----- ---- -----
build_platform_tags/burntsushi-archlinux 1.00 46.2±0.28µs ? ?/sec 2.48
114.8±0.45µs ? ?/sec
wheelname_parsing/flyte-long-compatible 1.00 624.8±3.31ns 174.0 MB/sec
1.01 629.4±4.30ns 172.7 MB/sec
wheelname_parsing/flyte-long-incompatible 1.00 743.6±4.23ns 165.4 MB/sec
1.00 746.9±4.62ns 164.7 MB/sec
wheelname_parsing/flyte-short-compatible 1.00 526.7±4.76ns 54.3 MB/sec
1.01 530.2±5.81ns 54.0 MB/sec
wheelname_parsing/flyte-short-incompatible 1.00 540.4±4.93ns 60.0 MB/sec
1.01 545.7±5.31ns 59.4 MB/sec
wheelname_parsing_failure/flyte-long-extension 1.00 13.6±0.13ns 3.2
GB/sec 1.01 13.7±0.14ns 3.2 GB/sec
wheelname_parsing_failure/flyte-short-extension 1.00 14.0±0.20ns 1160.4
MB/sec 1.01 14.1±0.14ns 1146.5 MB/sec
wheelname_tag_compatibility/flyte-long-compatible 11.33 159.8±2.79ns
680.5 MB/sec 1.00 14.1±0.23ns 7.5 GB/sec
wheelname_tag_compatibility/flyte-long-incompatible 237.60
1671.8±37.99ns 73.6 MB/sec 1.00 7.0±0.08ns 17.1 GB/sec
wheelname_tag_compatibility/flyte-short-compatible 16.07 223.5±8.60ns
128.0 MB/sec 1.00 13.9±0.30ns 2.0 GB/sec
wheelname_tag_compatibility/flyte-short-incompatible 149.83 628.3±2.13ns
51.6 MB/sec 1.00 4.2±0.10ns 7.6 GB/sec

We do regress slightly on the time it takes for `Tags::new` to run, but
this is somewhat expected. And in absolute terms, 114us is perfectly
acceptable given that it's only executed ~once for each `puffin`
invocation.

Ad hoc benchmarks indicate an overall 25% perf improvement in `puffin
pip-compile` times. This roughly corresponds with how much time
`is_compatible` was taking. Indeed, profiling confirms that it has
virtually disappeared from the profile.

Fixes #157
2023-11-09 09:01:03 -05:00
konsti
c11586f2f0
Fix index out of bounds in SourceDistributionFilename::parse (#353)
Found this one in the top 8k pypi tests too
2023-11-07 11:44:40 +00:00
konsti
81f380b10e
Validate package and extra name (#290)
`PackageName` and `ExtraName` can now only be constructed from valid
names. They share the same rules, so i gave them the same
implementation. Constructors are split between `new` (owned) and
`from_str` (borrowed), with the owned version avoiding allocations.

Closes #279

---------

Co-authored-by: Zanie <contact@zanie.dev>
2023-11-06 10:04:31 +00:00
konsti
4adaa9a700
Wheel filename distribution package name (#278)
The normalized name abstractions were not consistently, this PR uses
them where they were previously missing:
* `WheelFilename::distribution`
* `Requirement::name`
* `Requirement::extras`
* `Metadata21::name`
* `Metadata21::provides_dist`

With `puffin-package` depending on `pep508_rs` this would be cyclical
crate dependency, so `puffin-normalize` gets split out from
`puffin-package`.

`DistInfoName` has the same task and semantics as `PackageName`, so it's
merged into the latter.

`PackageName` and `ExtraName` documentation is moved onto the type and
their constructors are called `new` instead of `normalize`. We now use
these constructors rarely enough the implicit allocation by
`to_string()` shouldn't matter anymore, while more actual cloning
becomes visible.
2023-11-02 11:15:27 +00:00
Charlie Marsh
2652caa3e3
Add support for URL dependencies (#251)
## Summary

This PR adds support for resolving and installing dependencies via
direct URLs, like:

```
werkzeug @ 960bb4017c/Werkzeug-2.0.0-py3-none-any.whl
```

These are fairly common (e.g., with `torch`), but you most often see
them as Git dependencies.

Broadly, structs like `RemoteDistribution` and friends are now enums
that can represent either registry-based dependencies or URL-based
dependencies:

```rust
/// A built distribution (wheel) that exists as a remote file (e.g., on `PyPI`).
#[derive(Debug, Clone)]
#[allow(clippy::large_enum_variant)]
pub enum RemoteDistribution {
    /// The distribution exists in a registry, like `PyPI`.
    Registry(PackageName, Version, File),
    /// The distribution exists at an arbitrary URL.
    Url(PackageName, Url),
}
```

In the resolver, we now allow packages to take on an extra, optional
`Url` field:

```rust
#[derive(Debug, Clone, Eq, Derivative)]
#[derivative(PartialEq, Hash)]
pub enum PubGrubPackage {
    Root,
    Package(
        PackageName,
        Option<DistInfoName>,
        #[derivative(PartialEq = "ignore")]
        #[derivative(PartialOrd = "ignore")]
        #[derivative(Hash = "ignore")]
        Option<Url>,
    ),
}
```

However, for the purpose of version satisfaction, we ignore the URL.
This allows for the URL dependency to satisfy the transitive request in
cases like:

```
flask==3.0.0
werkzeug @ 254c3e9b5f/werkzeug-3.0.1-py3-none-any.whl
```

There are a couple limitations in the current approach:

- The caching for remote URLs is done separately in the resolver vs. the
installer. I decided not to sweat this too much... We need to figure out
caching holistically.
- We don't support any sort of time-based cache for remote URLs -- they
just exist forever. This will be a problem for URL dependencies, where
we need some way to evict and refresh them. But I've deferred it for
now.
- I think I need to redo how this is modeled in the resolver, because
right now, we don't detect a variety of invalid cases, e.g., providing
two different URLs for a dependency, asking for a URL dependency and a
_different version_ of the same dependency in the list of first-party
dependencies, etc.
- (We don't yet support VCS dependencies.)
2023-11-01 09:21:44 -04:00
Charlie Marsh
89dad0c9ad
Move distribution abstraction in shared crate (#258)
This also allows us to get rid of `PinnedPackage` _and_ to remove some
`Result<...>` types due to needless conversions between
otherwise-identical types.
2023-10-31 15:30:06 -04:00
Charlie Marsh
d0aeb2ac80
Remove vector allocation in WheelFilename (#177) 2023-10-24 01:23:14 +00:00
konsti
ae9d1f7572
Add source distribution filename abstraction (#154)
The need for this became clear when working on the source distribution
integration into the resolver.

While at it i also switch the `WheelFilename` version to the parsed
`pep440_rs` version now that we have this crate.
2023-10-20 17:45:57 +02:00