## Summary
This PR modifies the resolver to treat the Python version as a package,
which allows for better error messages (since we no longer treat
incompatible packages as if they "don't exist at all").
There are a few tricky pieces here...
First, we need to track both the interpreter's Python version and the
_target_ Python version, because we support resolving for other versions
via `--python 3.7`.
Second, we allow using incompatible wheels during resolution, as long as
there's a compatible source distribution. So we still need to test for
`requires-python` compatibility when selecting distributions.
This could use more testing, but it feels like an area where `packse`
would be more productive than writing PyPI tests.
Closes https://github.com/astral-sh/puffin/issues/406.
This PR fixes our prefetching logic to ensure that we always attempt to
prefetch the "best-guess" distribution for all dependencies. This logic
already existed, but because we only attempted to prefetch when package
metadata was available, it almost never triggered. Now, we wait for the
package metadata to become available, _then_ kick off the "best-guess"
prefetch (for every package).
In my testing, this dramatically improves performance (like 2x). I'm
wondering if this regressed at some point?
Closes#743.
Co-authored-by: konsti <konstin@mailbox.org>
I've tried to investigate puffin's performance wrt to builds and
parallelism in general, but found the previous instrumentation to
granular. I've tried to add spans to every function that either needs
noticeable io or cpu resources without creating duplication. This also
fixes some wrong tracing usage on async functions
(https://docs.rs/tracing/latest/tracing/struct.Span.html#in-asynchronous-code)
and some spans that weren't actually entered.
This PR adds a dedicated error message for resolutions that fail, but
might've succeeded if pre-releases were allowed. Specifically, if we see
a failed resolution, and failed to find a version for a package that
included a pre-release marker, we add a hint nudging the user to
explicitly enable all pre-releases.
We'd prefer a solution like
https://github.com/astral-sh/puffin/pull/666, but believe that it will
break some assumptions in PubGrub, so this is the lighter-weight
solution.
Closes https://github.com/astral-sh/puffin/issues/659.
The `async fn` and return-position `impl Trait` in traits improve
`BuildContext` ergonomics. The traits use `impl Future` over `async fn`
to make the send bound explicit
(https://blog.rust-lang.org/2023/12/21/async-fn-rpit-in-traits.html).
The remaining changes are due to clippy.
This PR combines three small changes to finish up the install-many
testing.
* Download pypi_10k_most_dependents.txt in script I'd like to have the
setup process of the large scale checks automated.
* Some install-many dev script improvements
* Fix mkl_fft-1.3.6-58-cp310-cp310-manylinux2014_x86_64.whl:
mkl_fft-1.3.6-58-cp310-cp310-manylinux2014_x86_64.whl has multiple
Wheel-Version entries, we have to ignore that like pip
Apart from the mkl-fft fix the only other errors i've seen showing up
are
https://github.com/astral-sh/puffin/issues/520#issuecomment-1869625642.
## Summary
This PR adds support for relative URLs in the simple JSON responses. We
already support relative URLs for HTML responses, but the handling has
been consolidated between the two. Similar to index URLs, we now store
the base alongside the metadata, and use the base when resolving the
URL.
Closes#455.
## Test Plan
`cargo test` (to test HTML indexes). Separately, I also ran `cargo run
-p puffin-cli -- pip-compile requirements.in -n
--index-url=http://localhost:3141/packages/pypi/+simple` on the
`zb/relative` branch with `packse` running, and forced both HTML and
JSON by limiting the `accept` header.
## Summary
This PR makes the `pypi_types::File` a response-only type (i.e., a type
that's only used when deserializing over the wire), and adds a separate
internal `File` type. Right now, the representations are similar, but
already, we can avoid the "lenient" deserialization on our internal
`File` type, and avoid the special-casing of the property names that's
required in the JSON. Over time, we can evolve this representation
entirely separately from the representation we receive from PyPI and
other indexes.
This crate started off as generic caching utilities, but we started
adding a lot of Puffin-specific stuff (like the cache buckets
abstraction that knows about Git vs. direct URL vs. indexes and so on).
This PR moves the generic stuff into a new `cache-key` crate.
I don't have a good testing strategy here (I'm manually testing against
`devpi` via `packse`), but the HTML index uses (e.g.)
`data-requires-python=">=3.8"`, so we need to decode.
From manual inspection, this dataset generated through the [libraries.io
API](https://libraries.io/api#project-search) seems more mainstream than
the current 8k one, which is also preserved. I've added the dataset to
the repo because the API requires an API key.
We lock git checkout directories and the virtualenv to avoid two puffin
instances running in parallel changing files at the same time and
leading to a broken state. When one instance is blocking another, we
need to inform the user (why is the program hanging?) and also add some
information for them to debug the situation.
The new messages will print
```
Waiting to acquire lock for /home/konsti/projects/puffin/.venv (lockfile: /home/konsti/projects/puffin/.venv/.lock)
```
or
```
Waiting to acquire lock for git+https://github.com/pydantic/pydantic-extra-types@0ce9f207a1e09a862287ab77512f0060c1625223 (lockfile: /home/konsti/projects/puffin/cache-all-kinds/git-v0/locks/f157fd329a506a34)
```
The messages aren't perfect but clear enough to see what the contention
is and in the worst case to delete the lockfile.
Fixes#714
Easier than i expected: We simply never construct the pubgrub error
variants since we have our own main loop. The `unreachable!()`s can be
removed when never is stabilized
Otherwise, when a server does not support HTTP range requests we throw
an error instead of downloading without range requests.
---------
Co-authored-by: konstin <konstin@mailbox.org>
This allows the default index URL to be easily overridden with a local
index e.g. a `packse` server
```
export PUFFIN_INDEX_URL="http://localhost:3141/packages/all/+simple"
```
For the install tests, i need the ability to ignore failures in the
`DistFinder`. To avoid just copy&pasting a version that collects errors
separately, i followed
https://gendignoux.com/blog/2021/04/01/rust-async-streams-futures-part1.html
and switched the custom channel over to an async stream yielding
`Result` items.
I like the async streams mirror the normal iterator api.
The high level goal here is to improve the tests for the version parser.
Namely, we now check not just that version strings parse successfully,
but that they parse to the expected result.
We also do a few other cleanups. Most notably, `Version` is now an
opaque type so that we can more easily change its representation going
forward.
Reviewing commit-by-commit is suggested. :-)
The test creates a cache from multiple sources and injects faults (once
using invalid data and once by making the files unreadable on the fs
level), then resolves again.
I didn't test git because it has its own locking and correctness logic.
The main drawback is that this test is slow (2.5s for me), we could
`#[ignore]` it.
This is a pure refactor to follow-up #690, to separate the metadata that
we know upfront about distributions (like the version, for
registry-based distributions) vs. the metadata that requires building
(like the version, for URL-based distributions).
We now show the fully-resolved URL, rather than the URL as given by the
user, _everywhere_ except for the output resolution file (which should
retain relative paths, unexpanded environment variables, etc.).
Closes https://github.com/astral-sh/puffin/issues/687.
With `Option<T>` and `.unwrap_or_default()` later, the default of `T`
isn't shown in the help output.
Old:
```
--link-mode <LINK_MODE>
The method to use when installing packages from the global cache
Possible values:
- clone: Clone (i.e., copy-on-write) packages from the wheel into the site packages
- copy: Copy packages from the wheel into the site packages
- hardlink: Hard link packages from the wheel into the site packages
-q, --quiet
Do not print any output
--resolution <RESOLUTION>
Possible values:
- highest: Resolve the highest compatible version of each package
- lowest: Resolve the lowest compatible version of each package
- lowest-direct: Resolve the lowest compatible version of any direct dependencies, and the highest compatible version of any transitive dependencies
--prerelease <PRERELEASE>
Possible values:
- disallow: Disallow all pre-release versions
- allow: Allow all pre-release versions
- if-necessary: Allow pre-release versions if all versions of a package are pre-release
- explicit: Allow pre-release versions for first-party packages with explicit pre-release markers in their version requirements
- if-necessary-or-explicit: Allow pre-release versions if all versions of a package are pre-release, or if the package has an explicit pre-release marker in its version requirements
```

New:
```
--link-mode <LINK_MODE>
The method to use when installing packages from the global cache
[default: hardlink]
Possible values:
- clone: Clone (i.e., copy-on-write) packages from the wheel into the site packages
- copy: Copy packages from the wheel into the site packages
- hardlink: Hard link packages from the wheel into the site packages
-q, --quiet
Do not print any output
--resolution <RESOLUTION>
[default: highest]
Possible values:
- highest: Resolve the highest compatible version of each package
- lowest: Resolve the lowest compatible version of each package
- lowest-direct: Resolve the lowest compatible version of any direct dependencies, and the highest compatible version of any transitive dependencies
--prerelease <PRERELEASE>
[default: if-necessary-or-explicit]
Possible values:
- disallow: Disallow all pre-release versions
- allow: Allow all pre-release versions
- if-necessary: Allow pre-release versions if all versions of a package are pre-release
- explicit: Allow pre-release versions for first-party packages with explicit pre-release markers in their version requirements
- if-necessary-or-explicit: Allow pre-release versions if all versions of a package are pre-release, or if the package has an explicit pre-release marker in its version requirements
```

This PR uses borrowed data in `BuildDispatch` which makes creating a
`BuildDispatch` extremely cheap (only one allocation, for the Python
executable). I can be talked out of this, it will have no measurable
impact.