This PR builds on #780 by making both version parsing faster, and
perhaps more importantly, making version comparisons much faster.
Overall, these changes result in a considerable improvement for the
`boto3.in` workload. Here's the status quo:
```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 34.56s
real 34.579
user 34.004
sys 0.413
maxmem 2867 MB
faults 0
```
And now with this PR:
```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 9.20s
real 9.218
user 8.919
sys 0.165
maxmem 463 MB
faults 0
```
This particular workload gets stuck in pubgrub doing resolution, and
thus benefits mightily from a faster `Version::cmp` routine. With that
said, this change does also help a fair bit with "normal" runs:
```
$ hyperfine -w10 \
"puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in" \
"puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in"
Benchmark 1: puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
Time (mean ± σ): 337.5 ms ± 3.9 ms [User: 310.5 ms, System: 73.2 ms]
Range (min … max): 333.6 ms … 343.4 ms 10 runs
Benchmark 2: puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
Time (mean ± σ): 189.8 ms ± 3.0 ms [User: 168.1 ms, System: 78.4 ms]
Range (min … max): 185.0 ms … 196.2 ms 15 runs
Summary
puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in ran
1.78 ± 0.03 times faster than puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
```
There is perhaps some future work here (detailed in the commit
messages), but I suspect it would be more fruitful to explore ways of
making resolution itself and/or deserialization faster.
Fixes#373, Closes#396
This PR does a bit of refactoring to the pep440 crate, and in
particular around the erorr types. This PR is meant to be a precursor
to another PR that does some surgery (both in parsing and in `Version`
representation) that benefits somewhat from this refactoring.
As usual, please review commit-by-commit.
The high level goal here is to improve the tests for the version parser.
Namely, we now check not just that version strings parse successfully,
but that they parse to the expected result.
We also do a few other cleanups. Most notably, `Version` is now an
opaque type so that we can more easily change its representation going
forward.
Reviewing commit-by-commit is suggested. :-)
Two low-hanging fruits as optimizations for version parsing: A fast path
for release only versions and removing the regex from version specifiers
(still calling into version's parsing regex if required). This enables
optimizing the serde format since we now see the serde part instead of
only PEP 440 parsing. I intentionally didn't rewrite the full PEP 440 at
this step.
```console
$ hyperfine --warmup 5 --runs 50 "target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in" "target/profiling/main pip-compile scripts/requirements/transformers-extras.in"
Benchmark 1: target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in
Time (mean ± σ): 217.1 ms ± 3.2 ms [User: 194.0 ms, System: 55.1 ms]
Range (min … max): 211.0 ms … 228.1 ms 50 runs
Benchmark 2: target/profiling/main pip-compile scripts/requirements/transformers-extras.in
Time (mean ± σ): 276.7 ms ± 5.7 ms [User: 252.4 ms, System: 54.6 ms]
Range (min … max): 268.9 ms … 303.5 ms 50 runs
Summary
target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in ran
1.27 ± 0.03 times faster than target/profiling/main pip-compile scripts/requirements/transformers-extras.in
```
---------
Co-authored-by: Andrew Gallant <andrew@astral.sh>
Currently, `dbg!` is hard to read because versions are verbose, showing
all optional fields, and we have a lot of versions. Changing debug
formatting to displaying the version number (which can be losslessly
converted to the struct and back) makes this more readable.
See e.g.
https://gist.github.com/konstin/38c0f32b109dffa73b3aa0ab86b9662b
**Before**
```text
version: Version {
epoch: 0,
release: [
1,
2,
3,
],
pre: None,
post: None,
dev: None,
local: None,
},
```
**After**
```text
version: "1.2.3",
```
It turns out that it's not uncommon to use timestamps as patch versions
(e.g., `20230628214621`). I believe this is the ISO 8601 "basic format".
These can't be represented by a `u32`, so I think it makes sense to just
bump to `u64` to remove this limitation.
Filter out source dists and wheels whose `requires-python` from the
simple api is incompatible with the current python version.
This change showed an important problem: When we use a fake python
version for resolving, building source distributions breaks down because
we can only build with versions we actually have.
This change became surprisingly big. The tests now require python 3.7 to
be installed, but changing that would mean an even bigger change.
Fixes#388
This copies the allocator configuration used in the Ruff project. In
particular, this gives us an instant 10% win when resolving the top 1K
PyPI packages:
$ hyperfine \
"./target/profiling/puffin-dev-main resolve-many --cache-dir
cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2>
/dev/null" \
"./target/profiling/puffin-dev resolve-many --cache-dir
cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2>
/dev/null"
Benchmark 1: ./target/profiling/puffin-dev-main resolve-many --cache-dir
cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2>
/dev/null
Time (mean ± σ): 974.2 ms ± 26.4 ms [User: 17503.3 ms, System: 2205.3
ms]
Range (min … max): 943.5 ms … 1015.9 ms 10 runs
Benchmark 2: ./target/profiling/puffin-dev resolve-many --cache-dir
cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2>
/dev/null
Time (mean ± σ): 883.1 ms ± 23.3 ms [User: 14626.1 ms, System: 2542.2
ms]
Range (min … max): 849.5 ms … 916.9 ms 10 runs
Summary
'./target/profiling/puffin-dev resolve-many --cache-dir
cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2>
/dev/null' ran
1.10 ± 0.04 times faster than './target/profiling/puffin-dev-main
resolve-many --cache-dir cache-docker-no-build --no-build
pypi_top_8k_flat.txt --limit 1000 2> /dev/null'
I was moved to do this because I noticed `malloc`/`free` taking up a
fairly sizeable percentage of time during light profiling.
As is becoming a pattern, it will be easier to review this
commit-by-commit.
Ref #396 (wouldn't call this issue fixed)
-----
I did also try adding a `smallvec` optimization to the
`Version::release` field, but it didn't bare any fruit. I still think
there is more to explore since the results I observed don't quite line
up with what I expect. (So probably either my mental model is off or my
measurement process is flawed.) You can see that attempt with a little
more explanation here:
f9528b4ecd
In the course of adding the `smallvec` optimization, I also shrunk the
`Version` fields from a `usize` to a `u32`. They should at least be a
fixed size integer since version numbers aren't used to index memory,
and I shrunk it to `u32` since it seems reasonable to assume that all
version numbers will be smaller than `2^32`.
Ran `cargo upgrade --incompatible`, seems there are no changes required.
From cacache 0.12.0:
> BREAKING CHANGE: some signatures for copy have changed, and copy no
longer automatically reflinks
`which` 5.0.0 seems to have only error message changes.
We now accept a pre-release if (1) all versions are pre-releases, or (2)
there was a pre-release marker in the dependency specifiers for a direct
dependency.
The code is written such that we can support a variety of pre-release
strategies.
Closes https://github.com/astral-sh/puffin/issues/191.
Previously, we had two python interpreter metadata structs, one in
gourgeist and one in puffin. Both would spawn a subprocess to query
overlapping metadata and both would appear in the cli crate, if you
weren't careful you could even have to different base interpreters at
once. This change unifies this to one set of metadata, queried and
cached once.
Another effect of this crate is proper separation of python interpreter
and venv. A base interpreter (such as `/usr/bin/python/`, but also pyenv
and conda installed python) has a set of metadata. A venv has a root and
inherits the base python metadata except for `sys.prefix`, which unlike
`sys.base_prefix`, gets set to the venv root. From the root and the
interpreter info we can compute the paths inside the venv. We can reuse
the interpreter info of the base interpreter when creating a venv
without having to query the newly created `python`.
This is isn't ready, but it can resolve
`meine_stadt_transparent==0.2.14`.
The source distributions are currently being built serially one after
the other, i don't know if that is incidentally due to the resolution
order, because sdist building is blocking or because of something in the
resolver that could be improved.
It's a bit annoying that the thing that was supposed to do http requests
now suddenly also has to a whole download/unpack/resolve/install/build
routine, it messes up the type hierarchy. The much bigger problem though
is avoid recursive crate dependencies, it's the reason for the callback
and for splitting the builder into two crates (badly named atm)
This PR modifies the PEP 440 and PEP 508 crates to pass CI, primarily by
fixing all lint violations.
We're also now using these crates in the workspace via `path`.
(Previously, we were still fetching them from Cargo.)
This PR copies over the `pep440-rs` crate at commit
`a8303b01ffef6fccfdce562a887f6b110d482ef3` with no modifications.
It won't pass CI, but modifications will intentionally be confined to
later PRs.