Commit graph

124 commits

Author SHA1 Message Date
konsti
4adaa9a700
Wheel filename distribution package name (#278)
The normalized name abstractions were not consistently, this PR uses
them where they were previously missing:
* `WheelFilename::distribution`
* `Requirement::name`
* `Requirement::extras`
* `Metadata21::name`
* `Metadata21::provides_dist`

With `puffin-package` depending on `pep508_rs` this would be cyclical
crate dependency, so `puffin-normalize` gets split out from
`puffin-package`.

`DistInfoName` has the same task and semantics as `PackageName`, so it's
merged into the latter.

`PackageName` and `ExtraName` documentation is moved onto the type and
their constructors are called `new` instead of `normalize`. We now use
these constructors rarely enough the implicit allocation by
`to_string()` shouldn't matter anymore, while more actual cloning
becomes visible.
2023-11-02 11:15:27 +00:00
konsti
c6aa1cd7a3
Only fall back to copy when the first hard linking failed (#268)
Hard linking might not be supported but we (afaik) can't detect this
ahead of time, so we'll try hard linking the first file, if this
succeeds we'll know later hard linking errors are not due to lack of
os/fs support, if it fails we'll switch to copying for the rest of the
install. Follow up to
https://github.com/astral-sh/puffin/pull/237#discussion_r1376705137
2023-11-01 18:35:52 +01:00
Charlie Marsh
3312ce30f5
Upgrade crates and remove unused dependencies (#256) 2023-10-31 13:16:58 -04:00
konsti
35d6bd761b
Fallback to copy if hardlinking failed (#237) 2023-10-30 19:10:01 +00:00
Charlie Marsh
ffbf6b6c16
Avoid symlinking RECORD file (#227)
This is the one file that gets modified during installation. Hardlinking
it is bad!
2023-10-30 03:10:38 +00:00
konsti
889f6173cc
Unify python interpreter abstractions (#178)
Previously, we had two python interpreter metadata structs, one in
gourgeist and one in puffin. Both would spawn a subprocess to query
overlapping metadata and both would appear in the cli crate, if you
weren't careful you could even have to different base interpreters at
once. This change unifies this to one set of metadata, queried and
cached once.

Another effect of this crate is proper separation of python interpreter
and venv. A base interpreter (such as `/usr/bin/python/`, but also pyenv
and conda installed python) has a set of metadata. A venv has a root and
inherits the base python metadata except for `sys.prefix`, which unlike
`sys.base_prefix`, gets set to the venv root. From the root and the
interpreter info we can compute the paths inside the venv. We can reuse
the interpreter info of the base interpreter when creating a venv
without having to query the newly created `python`.
2023-10-25 20:11:36 +00:00
konsti
1fbe328257
Build source distributions in the resolver (#138)
This is isn't ready, but it can resolve
`meine_stadt_transparent==0.2.14`.

The source distributions are currently being built serially one after
the other, i don't know if that is incidentally due to the resolution
order, because sdist building is blocking or because of something in the
resolver that could be improved.

It's a bit annoying that the thing that was supposed to do http requests
now suddenly also has to a whole download/unpack/resolve/install/build
routine, it messes up the type hierarchy. The much bigger problem though
is avoid recursive crate dependencies, it's the reason for the callback
and for splitting the builder into two crates (badly named atm)
2023-10-25 20:05:13 +00:00
konstin
815c2117c8 Clippy 2023-10-23 13:54:31 +02:00
Charlie Marsh
49a27ff33c
Add support for parameterized link modes (#164)
Allows the user to select between clone, hardlink, and copy semantics
for installs. (The pnpm documentation has a decent description of what
these mean: https://pnpm.io/npmrc#package-import-method.)

Closes #159.
2023-10-22 04:35:50 +00:00
konsti
ae9d1f7572
Add source distribution filename abstraction (#154)
The need for this became clear when working on the source distribution
integration into the resolver.

While at it i also switch the `WheelFilename` version to the parsed
`pep440_rs` version now that we have this crate.
2023-10-20 17:45:57 +02:00
Charlie Marsh
4645f79237
Use FxHash (#151) 2023-10-20 05:26:06 +00:00
Charlie Marsh
7ef6c0315c
Unify site-packages into distribution enum (#136)
Gets rid of the custom `DistInfo` struct in the site-packages
abstraction in favor of a new kind of distribution
(`InstalledDistribution`). No change in behavior.
2023-10-19 04:37:52 +00:00
Charlie Marsh
573f5832a3
Allow uninstall to take multiple packages and files (#125)
Moves the command to `puffin pip-uninstall` for now to separate from the
managed interface, and redoes the command output.
2023-10-18 22:30:11 -04:00
Charlie Marsh
7e8ffeb2df
Use fs-err in more crates (#100)
Closes https://github.com/astral-sh/puffin/issues/88.
2023-10-16 13:37:58 +00:00
Charlie Marsh
85162d1111
Parallelize wheel installations with Rayon (#84)
It looks like using _either_ async Rust with a `JoinSet` _or_
parallelizing a fixed threadpool with Rayon provide about a ~5% speed-up
over our current serial approach:

```console
❯ hyperfine --runs 30 --warmup 5 --prepare "./target/release/puffin venv .venv" \
  "./target/release/rayon sync ./scripts/benchmarks/requirements-large.txt" \
  "./target/release/async sync ./scripts/benchmarks/requirements-large.txt" \
  "./target/release/main sync ./scripts/benchmarks/requirements-large.txt"
Benchmark 1: ./target/release/rayon sync ./scripts/benchmarks/requirements-large.txt
  Time (mean ± σ):     295.7 ms ±  16.9 ms    [User: 28.6 ms, System: 263.3 ms]
  Range (min … max):   249.2 ms … 315.9 ms    30 runs

Benchmark 2: ./target/release/async sync ./scripts/benchmarks/requirements-large.txt
  Time (mean ± σ):     296.2 ms ±  20.2 ms    [User: 36.1 ms, System: 340.1 ms]
  Range (min … max):   258.0 ms … 359.4 ms    30 runs

Benchmark 3: ./target/release/main sync ./scripts/benchmarks/requirements-large.txt
  Time (mean ± σ):     306.6 ms ±  19.5 ms    [User: 25.3 ms, System: 220.5 ms]
  Range (min … max):   269.6 ms … 332.2 ms    30 runs

Summary
  './target/release/rayon sync ./scripts/benchmarks/requirements-large.txt' ran
    1.00 ± 0.09 times faster than './target/release/async sync ./scripts/benchmarks/requirements-large.txt'
    1.04 ± 0.09 times faster than './target/release/main sync ./scripts/benchmarks/requirements-large.txt'
```

It's much easier to just parallelize with Rayon and avoid async in the
underlying wheel code, so this PR takes that approach for now.
2023-10-10 23:46:30 -04:00
Charlie Marsh
a0294a510c
Rework puffin sync output to summarize (#81)
This also moves away from using `tracing` for user-facing logging,
instead introducing a new `Printer` abstraction.

Closes #66.
2023-10-10 03:29:09 +00:00
Charlie Marsh
ba2b200fce
Enable release builds via cargo-dist (#79) 2023-10-09 20:48:55 +00:00
Charlie Marsh
b90140e1bc
Add support for wheel uninstalls (#77)
Closes #36.
2023-10-09 14:14:33 -04:00
Charlie Marsh
ba72950546
Avoid passing cached wheels to the resolver step (#70)
When we go to install a locked `requirements.txt`, if a wheel is already
available in the local cache, and matches the version specifiers, we can
just use it directly without fetching the package metadata. This speeds
up the no-op case by about 33%.

Closes https://github.com/astral-sh/puffin/issues/48.
2023-10-08 22:17:19 -04:00
Charlie Marsh
5b71cfdd0b
Remove Monotrail-specific code from install-wheel-rs (#68)
I think this isn't necessary to support in this generic crate. If we
choose to adopt Monotrail-style concepts, we'll likely need to rework
them anyway.
2023-10-08 18:28:57 -04:00
Charlie Marsh
adbee4fb32
Use recursive clonefile calls on macOS (#67)
It turns out that on macOS, you can pass `clonefile` a directory to
recursively copy an entire directory. This speeds up wheel installation
dramatically, by about 3x.
2023-10-08 21:44:02 +00:00
Charlie Marsh
2a846e76b7
Store unzipped wheels in a cache (#49)
This PR massively speeds up the case in which you need to install wheels
that already exist in the global cache.

The new strategy is as follows:

- Download the wheel into the content-addressed cache.
- Unzip the wheel into the cache, but ignore content-addressing. It
turns out that writing to `cacache` for every file in the zip added a
ton of overhead, and I don't see any actual advantages to doing so.
Instead, we just unzip the contents into a directory at, e.g.,
`~/.cache/puffin/django-4.1.5`.
- (The unzip itself is now parallelized with Rayon.)
- When installing the wheel, we now support unzipping from a directory
instead of a zip archive. This required duplicating and tweaking a few
functions.
- When installing the wheel, we now use reflinks (or copy-on-write
links). These have a few fantastic properties: (1) they're extremely
cheap to create (on macOS, they are allegedly faster than hard links);
(2) they minimize disk space, since we avoid copying files entirely in
the vast majority of cases; and (3) if the user then edits a file
locally, the cache doesn't get polluted. Orogene, Bun, and soon pnpm all
use reflinks.

Puffin is now ~15x faster than `pip` for the common case of installing
cached data into a fresh environment.

Closes https://github.com/astral-sh/puffin/issues/21.

Closes https://github.com/astral-sh/puffin/issues/39.
2023-10-08 04:04:48 +00:00
Charlie Marsh
ae28552b3a
Use local copy of install-wheel-rs (#34)
This PR modifies the `install-wheel-rs` (and a few other crates) to get
everything playing nicely. Specifically, CI should pass, and all these
crates now use workspace dependencies between one another.

As part of this change, I split out the wheel name parsing into its own
`wheel-filename` crate, and the compatibility tag parsing into its own
`platform-tags` crate.
2023-10-07 01:43:55 +00:00
Charlie Marsh
e824fe6d2b
Copy over install-wheel-rs crate (#33)
This PR copies over the `install-wheel-rs` crate at commit
`10730ea1a84c58af6b35fb74c89ed0578ab042b6` with no modifications.

It won't pass CI, but modifications will intentionally be confined to
later PRs.
2023-10-06 21:38:38 -04:00