Commit graph

791 commits

Author SHA1 Message Date
konsti
0dde84dd27
Fix main (#635)
Seems to be a PR timing error
2023-12-13 13:55:06 +01:00
Charlie Marsh
ea920e22d1
Validate environment after pip-sync (#629)
Not 100% sure that we actually want to do this, but it seems reasonable.

Closes https://github.com/astral-sh/puffin/issues/410.
2023-12-13 09:13:43 +01:00
Charlie Marsh
cbfd39093e
Clean up some function signatures (#633)
2023-12-13 06:21:47 +00:00
Charlie Marsh
920e10fc8f
Use FxHash consistently (#632)
2023-12-13 05:36:03 +00:00
Charlie Marsh
edd741bf13
Add a diagnostic to detect invalid Python versions (#630)
Related to: https://github.com/astral-sh/puffin/issues/410.
2023-12-13 03:45:02 +00:00
Charlie Marsh
a24eb57e93
Make warnings user-facing (#628)
## Summary

Now, `puffin_warnings::warn_once` and `puffin_warnings::warn` will go to
`stderr`, as long as the user isn't running under `--quiet`. Previously,
these went through `tracing`, and so were only visible when running
under `--verbose`.
2023-12-12 21:24:38 -05:00
Zanie Blue
490fb55ac5
Use available versions to simplify unsat error reports (#547)
Uses https://github.com/pubgrub-rs/pubgrub/pull/156 to consolidate
version ranges in error reports using the actual available versions for
each package.

Alternative to https://github.com/zanieb/pubgrub/pull/8 which implements
this behavior as a method in the `Reporter` — here it's implemented in
our custom report formatter (#521) instead which requires no upstream
changes.

Requires https://github.com/zanieb/pubgrub/pull/11 to only retrieve the
versions for packages that will be used in the report.

This is a work in progress. Some things to do:
- ~We may want to allow lazy retrieval of the version maps from the
formatter~
- [x] We should probably create a separate error type for no solution
instead of mixing them with other resolve errors
- ~We can probably do something smarter than creating vectors to hold
the versions~
- [x] This degrades error messages when a single version is not
available, we'll need to special case that
- [x] It seems safer to coerce the error type in `resolve` instead of
`solve` if feasible
2023-12-12 23:25:16 +00:00
Charlie Marsh
a8512d7d51
Remove one string clone (#626)
2023-12-12 20:56:15 +00:00
konsti
a24a681db9
Towards using prepare_metadata_for_build_wheel in the resolver (#616)
Make `prepare_metadata_for_build_wheel` accessible across the puffin
codebase by splitting the build call into a setup, a metadata, and a
wheel call. This does not actually use the hook yet, but it's the
required refactoring for it.

Part of #599.
2023-12-12 20:45:37 +00:00
Charlie Marsh
85c37b2b9c
Add extra to debug logging (#625)
2023-12-12 20:09:09 +00:00
Charlie Marsh
f459e1ee50
Use a non-async Mutex in OnceMap (#624)
I don't know why, but this seems to resolve
https://github.com/astral-sh/puffin/issues/619. The Tokio docs also say
that using Tokio's Mutex is _not_ recommended unless you need to hold
the Mutex across an `.await`, which we don't.

Since this is a non-deterministic failure, I just ran it a bunch of
times and ensured it didn't hang (whereas it did hang occasionally prior
to this PR).

Closes https://github.com/astral-sh/puffin/issues/619
2023-12-12 14:59:45 -05:00
Charlie Marsh
4fb2e0955e
Add a fast-path to skip resolution when installation is complete (#613)
For a very large resolution (a few hundred packages), I see 13ms vs.
400ms for a no-op. It's worth optimizing this case, in my opinion.
2023-12-12 17:43:12 +00:00
Charlie Marsh
3aaab32a9d
Omit extra in resolver progress (#623)
Closes #621.
2023-12-12 12:41:18 -05:00
Charlie Marsh
6c7f5cb846
Validate installed packages in virtual environment (#611)
## Summary

Now, after running `pip-install`, we validate that the set of installed
packages is consistent -- that is, that we don't have any packages that
are missing dependencies, or incompatible versions of installed
dependencies.
2023-12-12 17:33:38 +00:00
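The consistency check described in #611 above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Requirement`, `Installed`, `Diagnostic`, and `validate` are hypothetical names, versions are plain tuples, and requirements are reduced to a single `>=` bound rather than full PEP 440 specifiers.

```rust
use std::collections::BTreeMap;

type Version = (u64, u64, u64);

/// Simplified requirement: only a `>=` lower bound (an assumption for brevity).
struct Requirement {
    name: String,
    minimum: Version,
}

/// One installed distribution and the requirements it declares.
struct Installed {
    version: Version,
    requires: Vec<Requirement>,
}

#[derive(Debug, PartialEq)]
enum Diagnostic {
    MissingDependency { package: String, dependency: String },
    IncompatibleVersion { package: String, dependency: String, installed: Version },
}

/// Walks every installed package and reports dependencies that are either
/// absent from the environment or installed at a version below the bound.
fn validate(site_packages: &BTreeMap<String, Installed>) -> Vec<Diagnostic> {
    let mut diagnostics = Vec::new();
    for (package, dist) in site_packages {
        for requirement in &dist.requires {
            match site_packages.get(&requirement.name) {
                None => diagnostics.push(Diagnostic::MissingDependency {
                    package: package.clone(),
                    dependency: requirement.name.clone(),
                }),
                Some(dependency) if dependency.version < requirement.minimum => {
                    diagnostics.push(Diagnostic::IncompatibleVersion {
                        package: package.clone(),
                        dependency: requirement.name.clone(),
                        installed: dependency.version,
                    })
                }
                Some(_) => {}
            }
        }
    }
    diagnostics
}
```

An empty result means the environment is consistent in the sense the commit describes; anything else could be surfaced as a user-facing warning.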
Charlie Marsh
c764155988
Avoid double-resolving during pip-install (#610)
## Summary

At present, when performing a `pip-install`, we first do a resolution,
then take the set of requirements and basically run them through our
`pip-sync`, which itself includes re-resolving the dependencies to get a
specific `Dist` for each package. (E.g., the set of requirements might
say `flask==3.0.0`, but the installer needs a specific _wheel_ or source
distribution to install.)

This PR removes this second resolution by exposing the set of pinned
packages from the resolution. The main challenge here is that we have an
optimization in the resolver such that we let the resolver read metadata
from an incompatible wheel as long as a source distribution exists for a
given package. This lets us avoid building source distributions in the
resolver under the assumption that we'll be able to install the package
later on, if needed. As such, the resolver now needs to track the
resolution and installation filenames separately.
2023-12-12 17:29:09 +00:00
Charlie Marsh
a0b3815d84
Respect existing versions when pip-installing (#608)
## Summary

When running `puffin pip-install`, we should respect versions that are
already installed in the environment. For example, if you run `puffin
pip-install flask==2.0.0` and then `puffin pip-install flask`, we should
avoid upgrading Flask. The most natural way to model this is to mark
them as "preferences".

(It's not enough to just filter those requirements out prior to
resolving, since we may not have the _dependencies_ of those packages
installed. We _could_ recursively verify this across the
`site-packages`, but that would be a larger PR.)
2023-12-12 17:22:47 +00:00
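To make the "preferences" idea from #608 above concrete, here is a minimal sketch. It assumes a hypothetical `select_version` helper, tuple versions, and a caller-supplied predicate standing in for the real specifier check; the actual resolver is considerably more involved.

```rust
type Version = (u64, u64, u64);

/// Picks a version from `available` that passes `is_allowed`.
/// The preferred (already installed) version wins if it is still acceptable;
/// otherwise the newest acceptable version is chosen.
fn select_version(
    available: &[Version],
    preference: Option<Version>,
    is_allowed: impl Fn(&Version) -> bool,
) -> Option<Version> {
    if let Some(preferred) = preference {
        if available.contains(&preferred) && is_allowed(&preferred) {
            return Some(preferred);
        }
    }
    available.iter().copied().filter(|version| is_allowed(version)).max()
}
```

With `flask` 2.0.0 installed and no constraint on the command line, the preference is kept; once the request excludes 2.0.0, the preference is ignored and the newest allowed version is selected instead.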
Charlie Marsh
974cb4cc15
Add a pip-install subcommand (#607)
## Summary

This PR adds a `pip-install` command that operates like, well, `pip
install`. In short, it resolves the provided dependencies, then makes sure
they're all installed in the environment. The primary differences with
`pip-sync` are that (1) `pip-sync` ignores dependencies, and assumes
that the packages represent a complete set; and (2) `pip-sync`
uninstalls any unlisted packages.

There are a bunch of TODOs that I'll resolve in subsequent PRs.

Closes https://github.com/astral-sh/puffin/issues/129.
2023-12-12 12:16:00 -05:00
Charlie Marsh
3e837da5b8
Avoid unicode decoding in name normalization (#617)
2023-12-12 10:01:02 -05:00
konsti
5ae4023cda
Activate venv before source dist build (#567)
Fixes #552
2023-12-12 15:46:37 +01:00
konsti
7c1dd71f66
Implement editable installs in dev command (#566)
First step, sufficient to run
```shell
cargo run --bin puffin-dev -- build --editable -w target/editables/ scripts/editable-installs/poetry_editable/
```
and check the wheel to confirm it's working. Tests will be added with the
pip-sync integration.
2023-12-12 15:45:55 +01:00
Charlie Marsh
c25d5240f1
Remove regular expressions for package name normalization (#614)
Very random but the hand-written version is about 3-4x faster
(benchmarked in a standalone repo).
2023-12-12 05:50:48 +00:00
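For readers curious what a hand-written, regex-free normalization like the one in #614 above might look like, here is a sketch of the PEP 503 rule (lowercase, with every run of `-`, `_`, or `.` collapsed to a single `-`). The function name `normalize_name` is hypothetical and this is not necessarily how puffin implements it.

```rust
/// Normalizes a package name following the PEP 503 rule: lowercase ASCII and
/// collapse every run of `-`, `_`, or `.` into a single `-`.
/// Hypothetical helper for illustration only.
fn normalize_name(name: &str) -> String {
    let mut normalized = String::with_capacity(name.len());
    let mut last_was_separator = false;
    for ch in name.chars() {
        match ch {
            '-' | '_' | '.' => {
                // Emit one `-` per run of separators.
                if !last_was_separator {
                    normalized.push('-');
                    last_was_separator = true;
                }
            }
            _ => {
                normalized.push(ch.to_ascii_lowercase());
                last_was_separator = false;
            }
        }
    }
    normalized
}
```

A single pass with one output allocation avoids compiling or executing a regular expression, which is one plausible source of the speedup the commit reports.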
Charlie Marsh
edcb71b1be
Remove some unused fields from SimpleJson (#612)
2023-12-11 23:01:37 -05:00
Charlie Marsh
1181288078
Download, build, and install in a single pipeline phase (#605)
## Summary

At present, we have two separate phases within the installation pipeline
related to populating wheels into the cache. The first phase downloads
the distribution, and then builds any source distributions into wheels;
the second phase unzips all the built wheels into the cache.

This PR merges those two phases into one, such that we seamlessly
download, build, and unzip wheels in one pass. This is more efficient,
since we can start unzipping while we build. It also ensures that if the
install _fails_ partway through, we don't end up with a bunch of
downloaded wheels that we never had a chance to unzip. The code is also
much simpler.

The main downside is that the user-facing feedback isn't as granular,
since we only have one phase and one progress bar for what was
originally three distinct phases.

Closes https://github.com/astral-sh/puffin/issues/571.

## Test Plan

I ran the benchmark script on two separate requirements files, and saw a
7% and 31% speedup respectively:

```text
+ TARGET=./scripts/benchmarks/requirements.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache
  Time (mean ± σ):     269.4 ms ±  33.0 ms    [User: 42.4 ms, System: 117.5 ms]
  Range (min … max):   221.7 ms … 446.7 ms    100 runs

Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache
  Time (mean ± σ):     250.6 ms ±  28.3 ms    [User: 41.5 ms, System: 127.4 ms]
  Range (min … max):   207.6 ms … 336.4 ms    100 runs

Summary
  './target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache' ran
    1.07 ± 0.18 times faster than './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
```

```text
+ TARGET=./scripts/benchmarks/requirements-large.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      5.053 s ±  0.354 s    [User: 1.413 s, System: 6.710 s]
  Range (min … max):    4.584 s …  6.333 s    100 runs

Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      3.845 s ±  0.225 s    [User: 1.364 s, System: 6.970 s]
  Range (min … max):    3.482 s …  4.715 s    100 runs

Summary
  './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' ran
```
2023-12-11 15:42:29 +00:00
konsti
b84fbb86b2
Impl Version debug as display (#606)
Currently, `dbg!` is hard to read because versions are verbose, showing
all optional fields, and we have a lot of versions. Changing debug
formatting to displaying the version number (which can be losslessly
converted to the struct and back) makes this more readable.

See e.g.
https://gist.github.com/konstin/38c0f32b109dffa73b3aa0ab86b9662b

**Before**

```text
version: Version {
    epoch: 0,
    release: [
        1,
        2,
        3,
    ],
    pre: None,
    post: None,
    dev: None,
    local: None,
},
```

**After**

```text
version: "1.2.3",
```
2023-12-11 16:38:14 +01:00
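The change in #606 above amounts to implementing `Debug` by delegating to `Display`. A minimal sketch, assuming a pared-down `Version` with only `epoch` and `release` fields (the real struct has more, as the "Before" output shows):

```rust
use std::fmt;

/// Simplified stand-in for the real version type (assumption: only two fields).
struct Version {
    epoch: u64,
    release: Vec<u64>,
}

impl fmt::Display for Version {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // PEP 440 writes a non-zero epoch as an `N!` prefix.
        if self.epoch != 0 {
            write!(f, "{}!", self.epoch)?;
        }
        let release: Vec<String> = self.release.iter().map(|part| part.to_string()).collect();
        write!(f, "{}", release.join("."))
    }
}

impl fmt::Debug for Version {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Quote the display form so it reads like a string in `dbg!` output.
        write!(f, "\"{}\"", self)
    }
}
```

Because the display form can be parsed back into the struct, no information is lost by printing it in place of the field-by-field dump.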
Charlie Marsh
00f1703111
Avoid storing partial wheels in the cache (#604)
Closes https://github.com/astral-sh/puffin/issues/603.
2023-12-09 19:11:30 -05:00
Charlie Marsh
32f54a5947
Use async Command for wheel build operations (#601)
Incredibly, this speeds up the install on a large project from 2m6s to
50s.
2023-12-09 16:20:52 +00:00
Charlie Marsh
f1c05dcd66
Buffer streamed file writes (#602)
2023-12-09 16:20:31 +00:00
Charlie Marsh
0499fe0613
Fix incorrect unknown size marker in traces (#600)
It said `(unknown size)` for _all_ disk-based wheels.
2023-12-09 04:46:01 +00:00
Charlie Marsh
24d81912cf
Use consistent change event order (#598)
Closes #591.
2023-12-09 04:12:40 +00:00
Charlie Marsh
714a64549b
Use a progress bar for the build phase (#597)
I think this might've been an oversight when copying over the build
reporting during the source distribution refactor.
2023-12-09 04:05:13 +00:00
Charlie Marsh
5878f8dde7
Misc. tweaks to puffin-lib's lib.rs (#596)
2023-12-09 03:37:47 +00:00
Charlie Marsh
600c5d072f
Avoid walrus operator in PEP 517 scripts (#595)
I believe this unnecessarily puts a Python 3.8+ requirement on these
scripts.
2023-12-09 01:25:22 +00:00
Charlie Marsh
a24534b0ce
Use rustc-hash instead of fxhash crate (#594)
`fxhash` is the old, less maintained version of this crate
(`rustc-hash`). We use the latter in Ruff.
2023-12-08 20:27:49 +00:00
konsti
6005d7a552
Keep track of in flight unzips using OnceMap (#544)
I saw warnings when we were e.g. unzipping wheel and setuptools in two
tasks at the same time. We now keep track of in flight unzips.

This introduces a `OnceMap` abstraction which we also use in the
resolver.
2023-12-08 20:18:11 +00:00
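The in-flight tracking described in #544 above can be approximated with a small map that records which keys are being worked on. The sketch below is a simplified, thread-based stand-in built only on the standard library (`Mutex` plus `Condvar`); the method names `register`, `done`, and `wait` are assumptions, and the real `OnceMap` serves async tasks rather than blocking threads.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Condvar, Mutex};

/// Hypothetical, simplified stand-in for the `OnceMap` named in the commit.
/// `None` marks a key whose work is in flight; `Some(value)` marks a finished one.
pub struct OnceMap<K, V> {
    entries: Mutex<HashMap<K, Option<V>>>,
    finished: Condvar,
}

impl<K: Eq + Hash, V: Clone> OnceMap<K, V> {
    pub fn new() -> Self {
        Self {
            entries: Mutex::new(HashMap::new()),
            finished: Condvar::new(),
        }
    }

    /// Returns `true` if the caller is the first to claim `key` and should do the work.
    pub fn register(&self, key: K) -> bool {
        let mut entries = self.entries.lock().unwrap();
        if entries.contains_key(&key) {
            false
        } else {
            entries.insert(key, None);
            true
        }
    }

    /// Stores the result for `key` and wakes every waiting thread.
    pub fn done(&self, key: K, value: V) {
        self.entries.lock().unwrap().insert(key, Some(value));
        self.finished.notify_all();
    }

    /// Blocks while `key` is in flight, then returns its result.
    /// Returns `None` if the key was never registered.
    pub fn wait(&self, key: &K) -> Option<V> {
        let entries = self.entries.lock().unwrap();
        let entries = self
            .finished
            .wait_while(entries, |map| matches!(map.get(key), Some(None)))
            .unwrap();
        entries.get(key).cloned().flatten()
    }
}
```

A task that gets `false` from `register` skips the unzip and calls `wait` instead, so the same wheel is never unpacked twice concurrently.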
Charlie Marsh
ffb8480087
Add --reinstall flag to pip-sync (#590)
## Summary

This PR adds two flags to `pip-sync`: `--reinstall`, and
`--reinstall-package [PACKAGE]`. The former reinstalls all packages in
the requirements, while the latter can be repeated and reinstalls all
specified packages.

For our purposes, a reinstall includes (1) purging the cache, and (2)
marking any already-installed versions as extraneous.

Closes #572.

Closes https://github.com/astral-sh/puffin/issues/271.
2023-12-08 19:58:42 +00:00
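One way to model the two flags from #590 above is a three-state enum. The sketch below is illustrative only: the `Reinstall` type, `from_flags`, and `applies_to` are hypothetical names, and package names are assumed to be normalized already.

```rust
/// Hypothetical model of the `--reinstall` and `--reinstall-package` flags.
#[derive(Debug, PartialEq)]
enum Reinstall {
    /// Neither flag was passed.
    None,
    /// `--reinstall`: every package in the requirements is reinstalled.
    All,
    /// `--reinstall-package` (repeatable): only the listed packages.
    Packages(Vec<String>),
}

impl Reinstall {
    /// Builds the setting from the parsed flag values; `--reinstall` takes precedence.
    fn from_flags(reinstall: bool, reinstall_package: Vec<String>) -> Self {
        if reinstall {
            Reinstall::All
        } else if reinstall_package.is_empty() {
            Reinstall::None
        } else {
            Reinstall::Packages(reinstall_package)
        }
    }

    /// Whether `package` should have its cache purged and its installed
    /// version treated as extraneous.
    fn applies_to(&self, package: &str) -> bool {
        match self {
            Reinstall::None => false,
            Reinstall::All => true,
            Reinstall::Packages(names) => names.iter().any(|name| name.as_str() == package),
        }
    }
}
```

The install planner could then consult `applies_to` once per requirement to decide between reusing and replacing an existing installation.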
Charlie Marsh
4b8642c6f7
Enable selective cache purging in puffin clean (#589)
## Summary

This PR enables `puffin clean` to accept package names as command line
arguments, and selectively purge entries from the cache tied to the
given package.

Relates to #572.

## Test Plan

Modified all the caching tests to run an additional step to (1) purge
the cache, and (2) re-install the package.
2023-12-08 19:51:32 +00:00
Charlie Marsh
cbe1cb4229
Avoid race when unpacking wheels (#593)
## Summary

If someone else beats us to the unzip, we should let them win.

We already have a check for this at the top of the unzip method, but
it's also possible that two source distributions get built in parallel
that both try to unpack the same build dependency.
2023-12-08 17:46:19 +00:00
Charlie Marsh
5ae3a8b1cb
Restructure Git cache to include package name (#588)
## Summary

This PR modifies the Git wheel cache to: (1) use a shorter version of
the SHA, to save space; and (2) include the package name, for
consistency with all other buckets.

I considered removing the URL hash entirely, and _just_ using the SHA,
which would be even _more_ consistent with other buckets. But if we
remove the URL, then we won't have separate directories for
subdirectories (which are part of the URL).

Before:

[Screenshot: "Screen Shot 2023-12-07 at 7 23 42 PM"]

After:

[Screenshot: "Screen Shot 2023-12-07 at 8 09 23 PM"]
2023-12-07 20:17:41 -05:00
Zanie Blue
ef7be9103c
Parse SimpleJson into categorized data in the client (#522)
Extends #517 with a suggestion from @konstin to parse the `SimpleJson`
into an intermediate type `SimpleMetadata(BTreeMap<Version,
VersionFiles>)` before converting to a `VersionMap`. This reduces the
number of times we need to parse the response. Additionally, we cache
the parsed response now instead of `SimpleJson`.

`VersionFiles` stores two vectors with
`WheelFilename`/`SourceDistFilename` and `File` tuples. These can be
iterated over together or separately. A new enum `DistFilename` was
added to capture the `SourceDistFilename` and `WheelFilename` variants
allowing iteration over both vectors.
2023-12-07 11:04:47 -06:00
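The shape of the intermediate types described in #522 above might look roughly like this. The type names come from the commit message, but every field, the placeholder newtypes, and the `all` method are assumptions made for the sake of a self-contained sketch.

```rust
use std::collections::BTreeMap;

// Simplified placeholders: the real types carry parsed fields.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Version(Vec<u64>);
#[derive(Debug, Clone, PartialEq)]
struct WheelFilename(String);
#[derive(Debug, Clone, PartialEq)]
struct SourceDistFilename(String);
#[derive(Debug, Clone, PartialEq)]
struct File {
    url: String,
}

/// Either kind of distribution filename, so both vectors can be iterated as one.
#[derive(Debug, Clone, PartialEq)]
enum DistFilename {
    SourceDistFilename(SourceDistFilename),
    WheelFilename(WheelFilename),
}

/// All files published for one version, split by distribution kind.
#[derive(Debug, Default)]
struct VersionFiles {
    wheels: Vec<(WheelFilename, File)>,
    source_dists: Vec<(SourceDistFilename, File)>,
}

impl VersionFiles {
    /// Iterates over wheels first, then source distributions.
    fn all(&self) -> impl Iterator<Item = (DistFilename, &File)> + '_ {
        self.wheels
            .iter()
            .map(|(name, file)| (DistFilename::WheelFilename(name.clone()), file))
            .chain(
                self.source_dists
                    .iter()
                    .map(|(name, file)| (DistFilename::SourceDistFilename(name.clone()), file)),
            )
    }
}

/// Parsed index response: versions in sorted order, each with its files.
struct SimpleMetadata(BTreeMap<Version, VersionFiles>);
```

Keying by `Version` in a `BTreeMap` keeps versions ordered, and the two vectors remain directly accessible when only wheels or only source distributions are needed.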
Charlie Marsh
5d3ce963b2
Raise an error when pip-sync manifest contains duplicates (#584)
Also ensures that we filter out any incompatible requirements when
building the install plan. In general, we assume that requirements were
generated by `pip-compile`, in which case all requirements should be
compatible and there should be no duplicates; but we should handle this
case gracefully.

Closes https://github.com/astral-sh/puffin/issues/582.
2023-12-07 05:26:42 +00:00
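A duplicate check like the one #584 above describes can be as small as a single pass with a set. This sketch assumes requirements are already reduced to normalized `(name, version)` pins; `check_no_duplicates` and `DuplicateRequirement` are hypothetical names.

```rust
use std::collections::HashSet;

/// Error carrying the name of the package that was listed more than once.
#[derive(Debug, PartialEq)]
struct DuplicateRequirement(String);

/// Each requirement is a `(name, version)` pin, e.g. `("flask", "3.0.0")`.
/// Returns an error for the first package name that appears twice.
fn check_no_duplicates(requirements: &[(&str, &str)]) -> Result<(), DuplicateRequirement> {
    let mut seen = HashSet::new();
    for (name, _version) in requirements {
        // `insert` returns `false` when the name was already present.
        if !seen.insert(*name) {
            return Err(DuplicateRequirement((*name).to_string()));
        }
    }
    Ok(())
}
```

Failing early with the offending name gives the user something actionable, rather than silently installing whichever pin happens to come last.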
Charlie Marsh
a825b2db06
Shard the registry cache by package (#583)
## Summary

This PR modifies the cache structure in a few ways. Most notably, we now
shard the set of registry wheels by package, and index them lazily when
computing the install plan.

This applies both to built wheels:

[Screenshot: "Screen Shot 2023-12-06 at 4 42 19 PM"]

And remote wheels:

[Screenshot: "Screen Shot 2023-12-06 at 4 42 30 PM"]

For other distributions, we now consistently cache using the package
name, which is really just for clarity and debuggability (we could
consider omitting these):

[Screenshot: "Screen Shot 2023-12-06 at 4 58 30 PM"]

Obliquely closes https://github.com/astral-sh/puffin/issues/575.
2023-12-07 05:02:46 +00:00
Charlie Marsh
aa065f5c97
Modify install plan to support all distribution types (#581)
This PR adds caching support for built wheels in the installer.
Specifically, the `RegistryWheelIndex` now indexes both downloaded and
built wheels (from registries), and we have a new `BuiltWheelIndex` that
takes a subdirectory and returns the "best-matching" compatible wheel.

Closes #570.
2023-12-07 04:43:34 +00:00
Charlie Marsh
edaeb9b0e8
Add tests for repeated installs with source distributions (#580)
Adds a few more tests for re-installs with various kinds of source
distributions, and changes the tests to use packages that we can safely
import (via `check_command`) for extra validation.

Once we properly respect cached built wheels, we should expect these
snapshots to change, since we'll no longer download and re-build
unnecessarily.
2023-12-06 20:02:32 +00:00
Zanie Blue
2bb04771ce
Allow switching out the resolver's IO (#517)
I'm working off of @konstin's commit here to implement arbitrary unsat
test cases for the resolver.

The entirety of the resolver's IO is two functions: get the version map
for a package (PEP 440 version -> distribution) and get the metadata for
a distribution. A new trait `ResolverProvider` abstracts these two away and
allows replacing the real network requests e.g. with stored responses
(https://github.com/pradyunsg/pip-resolver-benchmarks/blob/main/scenarios/pyrax_198.json).

---------

Co-authored-by: konsti <konstin@mailbox.org>
2023-12-06 11:53:16 -06:00
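The abstraction described in #517 above can be pictured as a trait with two methods plus an implementation backed by stored responses. The trait name comes from the commit message; the method names, the synchronous signatures, the type aliases, and `StoredProvider` are all assumptions chosen to keep the sketch self-contained.

```rust
use std::collections::{BTreeMap, HashMap};

type PackageName = String;
type Version = (u64, u64, u64);
type DistFilename = String;

/// Simplified metadata: just the declared dependencies.
#[derive(Debug, Clone, PartialEq)]
struct Metadata {
    requires_dist: Vec<String>,
}

/// The two IO operations the commit message names.
/// Synchronous here for simplicity (an assumption).
trait ResolverProvider {
    /// Version -> distribution for a package, or `None` if the package is unknown.
    fn version_map(&self, package: &str) -> Option<BTreeMap<Version, DistFilename>>;
    /// Metadata for a single distribution.
    fn metadata(&self, dist: &str) -> Option<Metadata>;
}

/// Serves stored responses instead of making network requests.
#[derive(Default)]
struct StoredProvider {
    versions: HashMap<PackageName, BTreeMap<Version, DistFilename>>,
    metadata: HashMap<DistFilename, Metadata>,
}

impl ResolverProvider for StoredProvider {
    fn version_map(&self, package: &str) -> Option<BTreeMap<Version, DistFilename>> {
        self.versions.get(package).cloned()
    }

    fn metadata(&self, dist: &str) -> Option<Metadata> {
        self.metadata.get(dist).cloned()
    }
}

/// Example consumer: the newest version a provider knows for a package.
fn latest_version(provider: &impl ResolverProvider, package: &str) -> Option<Version> {
    provider.version_map(package)?.keys().next_back().copied()
}
```

Code written against the trait, such as `latest_version`, runs unchanged whether the provider talks to a registry or replays a recorded scenario, which is what makes arbitrary unsat test cases practical.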
konsti
366c389385
Parse editable installs (#564)
Parse `-e` for editable installs in `requirements.txt`.

Unlike all the other requirements, editable installs don't have the name
of the package specified.
2023-12-06 18:21:15 +01:00
konsti
3f4d7b7826
Improve path source dist caching (#578)
Path distribution cache reading errors are no longer fatal.

We now invalidate a path file source dist if its modification
timestamp changed, and invalidate a path dir source dist if
`pyproject.toml` or alternatively `setup.py` changed. These seem like good
choices, since changing `pyproject.toml` should trigger a rebuild and the
user can `touch` the file as part of their workflow.

`CachedByTimestamp` is now a shared util. It doesn't have methods as I
don't think it's worth it yet for two users.

Closes #478

TODO(konstin): Write a test. This is probably twice as much work as that
fix itself, so I made that PR without one for now.
2023-12-06 11:47:01 -05:00
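The timestamp-based invalidation from #578 above boils down to storing a modification time next to the cached value and comparing it on read. In this sketch the struct name follows the commit message, while its fields and the free functions `modified_time` and `fresh` are assumptions.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// A cached value tagged with the modification time of the file it was built from.
#[derive(Debug, Clone, PartialEq)]
struct CachedByTimestamp<T> {
    timestamp: SystemTime,
    data: T,
}

/// Reads the current modification time of `path` (for example a source
/// distribution archive, or a directory's `pyproject.toml`).
fn modified_time(path: &Path) -> io::Result<SystemTime> {
    fs::metadata(path)?.modified()
}

/// Returns the cached data only if the recorded timestamp still matches the
/// file's current modification time; otherwise the entry is considered stale.
fn fresh<T>(cached: &CachedByTimestamp<T>, modified: SystemTime) -> Option<&T> {
    if cached.timestamp == modified {
        Some(&cached.data)
    } else {
        None
    }
}
```

Comparing for equality rather than "newer than" also invalidates the entry when a file is replaced by an older copy, and it lets a plain `touch` force a rebuild.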
konsti
1bf754556f
Add test for cache source dist installing (#545)
The code changes are outdated; this PR now only adds a test.
2023-12-06 11:37:55 +00:00
Charlie Marsh
218894375a
Avoid removing existing directories when unzipping and building (#577)
Now that we don't store zipped and unzipped wheels at the same location,
we can avoid these safeguards that entail removing existing directories
when writing. This supersedes
https://github.com/astral-sh/puffin/pull/545.

Closes https://github.com/astral-sh/puffin/issues/554.
2023-12-06 02:36:12 +00:00
Charlie Marsh
5fec63bff5
Add caching for path source distributions (#576)
Follows the strategy that we use for other source distributions.

Closes https://github.com/astral-sh/puffin/issues/557.
2023-12-06 01:33:52 +00:00
Charlie Marsh
5370484307
Remove .whl extension for cached, unzipped wheels (#574)
## Summary

This PR uses the wheel stem (e.g., `foo-1.2.3-py3-none-any`) instead of
the wheel name (e.g., `foo-1.2.3-py3-none-any.whl`) when storing
unzipped wheels in the cache, which removes a class of confusing issues
around overwrites and directory-vs.-file collisions.

For now, we retain _both_ the zipped and unzipped wheels in the cache,
though we can easily change this by storing the zipped wheels in a
temporary directory.

Closes https://github.com/astral-sh/puffin/issues/573.

## Test Plan

Some examples from my local cache:

[Screenshot: "Screen Shot 2023-12-05 at 4 09 55 PM"]
[Screenshot: "Screen Shot 2023-12-05 at 4 12 14 PM"]
[Screenshot: "Screen Shot 2023-12-05 at 4 09 50 PM"]
2023-12-05 22:41:22 +00:00
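The stem-versus-name distinction in #574 above can be illustrated with a tiny helper. `unzipped_wheel_dir` is a hypothetical function, and the flat `cache_root` layout is an assumption rather than puffin's actual cache structure.

```rust
use std::path::{Path, PathBuf};

/// Returns the directory for an unzipped wheel: the cache root joined with the
/// wheel stem (the filename without its `.whl` extension).
/// Returns `None` if the filename does not end in `.whl`.
fn unzipped_wheel_dir(cache_root: &Path, wheel_filename: &str) -> Option<PathBuf> {
    let stem = wheel_filename.strip_suffix(".whl")?;
    Some(cache_root.join(stem))
}
```

Because the directory no longer shares a name with the `.whl` archive, the zipped file and the unzipped tree can sit side by side without colliding.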