## Summary
There are a few ideas at play here:
1. pip always strips versions to the release when evaluating against a
`Requires-Python`, so we now do the same. That means, e.g., using
`3.13.0b0` will be accepted by a project with `Requires-Python: >=
3.13`, which does _not_ adhere to PEP 440 semantics but is somewhat
intuitive.
2. Because we know we'll only be evaluating against release-only
versions, we can use different semantics in PubGrub that let us collapse
ranges. For example, `python_version >= '3.10' or python_version <
'3.10'` can be collapsed to the truthy marker.
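A rough sketch of idea (1), using illustrative types rather than uv's actual ones:
```rust
/// Illustrative only: a version reduced to its release segments for
/// `Requires-Python` checks, mirroring pip's behavior.
struct Version {
    release: Vec<u64>,
    pre: Option<String>, // e.g. Some("b0".to_string()) for `3.13.0b0`
}

impl Version {
    /// Drop pre/post/dev/local segments: `3.13.0b0` is treated as `3.13.0`.
    fn only_release(&self) -> &[u64] {
        &self.release
    }
}

fn main() {
    let beta = Version {
        release: vec![3, 13, 0],
        pre: Some("b0".to_string()),
    };
    // The pre-release segment is simply ignored for this check:
    assert!(beta.pre.is_some());
    // Stripped to `3.13.0`, the beta satisfies `Requires-Python: >=3.13`,
    // even though strict PEP 440 semantics would reject it.
    assert!(beta.only_release() >= &[3, 13][..]);
}
```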
Closes https://github.com/astral-sh/uv/issues/4714.
Closes https://github.com/astral-sh/uv/issues/4272.
Closes https://github.com/astral-sh/uv/issues/4719.
## Summary
Given:
```text
numpy >=1.26 ; python_version >= '3.9'
numpy <1.26 ; python_version < '3.9'
```
When resolving for Python 3.8, we need to narrow the `requires-python`
requirement in the top branch of the fork, because all versions matching
`numpy >=1.26` require Python 3.9 or later -- but we know (in that
branch) that we only need to _solve_ for Python 3.9 or later.
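A hedged sketch of the narrowing step, collapsing `requires-python` and the fork's marker to simple `(major, minor)` lower bounds (uv's real representation is richer):
```rust
/// Illustrative only: the fork solves under the tighter of the project's
/// `requires-python` bound and the bound implied by the fork's marker.
fn narrow(requires_python: (u64, u64), fork_bound: (u64, u64)) -> (u64, u64) {
    std::cmp::max(requires_python, fork_bound)
}

fn main() {
    // In a `>=3.8` project, the `numpy >=1.26` branch of the fork
    // (guarded by `python_version >= '3.9'`) only needs to solve for
    // Python 3.9 and later.
    assert_eq!(narrow((3, 8), (3, 9)), (3, 9));
}
```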
Closes https://github.com/astral-sh/uv/issues/4669.
## Summary
Simplify and normalize marker expressions in the lockfile. Right now
this does a simple analysis by only looking at related operators at the
same level of precedence. I think anything more complex would be out of
scope.
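As a toy example of the kind of same-precedence rewrite this performs (not the lockfile code itself):
```rust
/// Illustrative only: `python_version < '3.9' and python_version < '3.8'`
/// normalizes to the single tighter bound, `python_version < '3.8'`.
fn simplify_upper_bounds(bounds: &[(u64, u64)]) -> Option<(u64, u64)> {
    bounds.iter().copied().min()
}

fn main() {
    assert_eq!(simplify_upper_bounds(&[(3, 9), (3, 8)]), Some((3, 8)));
}
```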
Resolves https://github.com/astral-sh/uv/issues/4002.
Follow-up to #4016.
This exposes `Range` and `PubGrubSpecifier` outside the resolver in
order to use pubgrub's union, creating a dependency edge we don't really
want.
## Summary
If `Requires-Python` is omitted in `uv lock` or `uv run`, we now warn
and default to `>=` the current minor version.
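A minimal sketch of the fallback (function and message are hypothetical, not the exact output):
```rust
/// Illustrative only: warn and synthesize `>=major.minor` from the
/// current interpreter when the project omits `requires-python`.
fn default_requires_python(interpreter: (u64, u64)) -> String {
    let (major, minor) = interpreter;
    eprintln!("warning: no `requires-python` value found; defaulting to `>={major}.{minor}`");
    format!(">={major}.{minor}")
}

fn main() {
    assert_eq!(default_requires_python((3, 12)), ">=3.12");
}
```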
Closes https://github.com/astral-sh/uv/issues/4050.
This bumps the versions of pep508 and pep440 to coincide with the
crates.io versions. While not strictly the same, the new types in uv use
an `Inner` struct. Practically, I've found I'm still able to use the
patched versions, as can be seen from the open PR here:
https://github.com/prefix-dev/pixi/pull/1436.
Would be great if this bump can be done so we can keep combining the
types :)
Since we're adding an `Option<MarkerTree>` to `PubGrubPackage`, and since
we just made `PubGrubPackage` implement `Ord`, it follows that we want
`MarkerTree` to also implement `Ord`.
## Summary
Just fix typos.
While `alpha-numeric` is not really a misspelling:
- it is missing from mainstream curated dictionaries, all of them
suggest `alphanumeric`;
- it is less used than `alphanumeric` (more than 10× less) according to
the Google [Ngram
Viewer](https://books.google.com/ngrams/graph?content=alpha-numeric%2Calphanumeric&year_start=1900&year_end=2019&corpus=en-2019);
- it is [missing from
SCOWL](http://app.aspell.net/lookup?dict=en_US-large;words=alpha-numeric).
## Test Plan
CI jobs.
We would previously show the parsed version when erroring due to
trailing content after a valid version, which can look different from
the input. E.g., when encountering `0.1-bulbasaur`, we would display:
```
after parsing '0.1b0', found 'ulbasaur', which is not part of a valid version
```
By storing the input string instead of the parsed version, we now show:
```
after parsing '0.1-b', found 'ulbasaur', which is not part of a valid version
```
## Introduction
PEP 621 is limited. Specifically, it lacks
* Relative path support
* Editable support
* Workspace support
* Index pinning or any sort of index specification
The semantics of URLs are a custom extension: PEP 440 does not specify
how to use git references or subdirectories; instead, pip has a custom
stringly-typed format. We need to somehow support these while still
staying compatible with PEP 621.
## `tool.uv.sources`
Drawing inspiration from cargo, poetry and rye, we add `tool.uv.sources`
or (for now stub only) `tool.uv.workspace`:
```toml
[project]
name = "albatross"
version = "0.1.0"
dependencies = [
  "tqdm >=4.66.2,<5",
  "torch ==2.2.2",
  "transformers[torch] >=4.39.3,<5",
  "importlib_metadata >=7.1.0,<8; python_version < '3.10'",
  "mollymawk ==0.1.0"
]

[tool.uv.sources]
tqdm = { git = "https://github.com/tqdm/tqdm", rev = "cc372d09dcd5a5eabdc6ed4cf365bdb0be004d44" }
importlib_metadata = { url = "https://github.com/python/importlib_metadata/archive/refs/tags/v7.1.0.zip" }
torch = { index = "torch-cu118" }
mollymawk = { workspace = true }

[tool.uv.workspace]
include = [
  "packages/mollymawk"
]

[tool.uv.indexes]
torch-cu118 = "https://download.pytorch.org/whl/cu118"
```
See `docs/specifying_dependencies.md` for a detailed explanation of the
format. The basic gist is that `project.dependencies` is what ends up on
PyPI, while `tool.uv.sources` are your non-published additions. We do
support the full range of PEP 508; we just hide it in the docs and
prefer the exploded table for easier readability and less confusion with
actual URL parts.
This format should eventually be able to subsume requirements.txt's
current use cases. While we will continue to support the legacy `uv pip`
interface, this is a piece of uv's own top-level interface. Together
with `uv run` and a lockfile format, you should only need to write
`pyproject.toml` and run `uv run`, which generates/uses/updates your
lockfile behind the scenes, with no more pip-style requirements
involved. It also lays the groundwork for implementing index pinning.
## Changes
This PR implements:
* Reading and lowering `project.dependencies`,
`project.optional-dependencies` and `tool.uv.sources` into a new
requirements format, including:
* Git dependencies
* Url dependencies
* Path dependencies, including relative and editable
* `pip install` integration
* Error reporting for invalid `tool.uv.sources`
* Json schema integration (works in pycharm, see below)
* Draft user-level docs (see `docs/specifying_dependencies.md`)
It does not implement:
* `pip compile` testing (deprioritized in favor of our own lockfile)
* Index pinning (stub definitions only)
* Development dependencies
* Workspace support (stub definitions only)
* Overrides in pyproject.toml
* Patching/replacing dependencies
One technically breaking change is that we now require a user-provided
pyproject.toml to be valid with respect to PEP 621. Included files still
fall back to PEP 517. That means passing a `pyproject.toml` directly
requires it to be valid, while `pip install -r requirements.txt` with
`-e .` as content falls back to PEP 517 as before.
## Implementation
The `pep508` requirement is replaced by a new `UvRequirement` (name up
for bikeshedding; I'm not particularly attached to the uv prefix). The
still-existing `pep508_rs::Requirement` type uses a url format copied
from pip's requirements.txt that doesn't appropriately capture all the
features we want/need to support. The bulk of the diff is changing the
requirement type throughout the codebase.
We still use `VerbatimUrl` in many places, where we would expect a
parsed/decomposed url type, specifically:
* Reading core metadata (except top-level pyproject.toml files): we fail
a step later instead if the URL isn't supported.
* Allowed `Urls`.
* `PackageId` with a custom `CanonicalUrl` comparison, instead of
canonicalizing urls eagerly.
* `PubGrubPackage`: We eventually convert the `VerbatimUrl` back to a
`Dist` (`Dist::from_url`), instead of remembering the url.
* Source dist types: We use `VerbatimUrl` even though we know and
require that these are supported urls that we can parse and have parsed.
I tried to improve the situation by replacing `VerbatimUrl`, but that
would require massive, invasive changes (see e.g.
https://github.com/astral-sh/uv/pull/3253). A main problem is the
`VersionOrUrl` ref and applying overrides, which assume the same
requirement/url type everywhere. In its current form, this PR increases
this tech debt.
I've tried to split off PRs and commits, but the main refactoring is
still a single monolith commit to make it compile and the tests pass.
## Demo
Adding
d1ae3b85d5/pyproject.json
as json schema (v7) to pycharm for `pyproject.toml`, you can try the IDE
support already:

[dove.webm](c293c272-c80b-459d-8c95-8c46a8d198a1)
In *some* places in our crates, `serde` (and `rkyv`) are optional
dependencies. I believe this was done out of reasons of "good sense,"
that is, it follows a Rust ecosystem pattern where serde integration
tends to be an opt-in crate feature. (And similarly for `rkyv`.)
However, ultimately, `uv` itself requires `serde` and `rkyv` to
function. Since our crates are strictly internal, there are limited
consumers for our crates without `serde` (and `rkyv`) enabled. I think
one possibility is that optional `serde` (and `rkyv`) integration means
that someone can do this:
cargo test -p pep440_rs
And this will run tests _without_ `serde` or `rkyv` enabled. That in
turn could lead to faster iteration time by reducing compile times. But,
I'm not sure this is worth supporting. The iterative compilation times
of individual crates are probably fast enough in debug mode, even with
`serde` and `rkyv` enabled. Namely, `serde` and `rkyv` themselves
shouldn't need to be re-compiled in most cases. On `main`:
```
from-scratch: `cargo test -p pep440_rs --lib` 0.685s
incremental: `cargo test -p pep440_rs --lib` 0.278s
from-scratch: `cargo test -p pep440_rs --features serde,rkyv --lib` 3.948s
incremental: `cargo test -p pep440_rs --features serde,rkyv --lib` 0.321s
```
So while a from-scratch build does take significantly longer, an
incremental build is about the same.
The benefit of doing this change is two-fold:
1. It brings our crates into alignment with "reality." In particular,
some crates were _implicitly_ relying on `serde` being enabled
without explicitly declaring it. This technically means that our
`Cargo.toml`s were wrong in some cases, but it is hard to observe it
because of feature unification in a Cargo workspace.
2. We no longer need to deal with the cognitive burden of writing
`#[cfg_attr(feature = "serde", ...)]` everywhere.
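Concretely, point (2) trades the first form below for the second (types illustrative):
```rust
// Before: serde integration is opt-in, so every type carries the guard.
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct GuardedVersion {
    pub epoch: u64,
}

// After: serde is a required dependency, so we derive unconditionally.
#[derive(serde::Serialize, serde::Deserialize)]
pub struct PlainVersion {
    pub epoch: u64,
}
```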
With https://github.com/pubgrub-rs/pubgrub/pull/190, pubgrub attaches
all types to a dependency provider to reduce the number of generics. We
need a dummy dependency provider now to emulate this. On the plus side,
pep440_rs drops its pubgrub dependency.
Add a single job for fast lint tools: Rustfmt for Rust formatting; Ruff
for Python formatting and linting; Prettier to avoid inconsistent
formatter changes between PyCharm and VS Code.
## Summary
This PR adds limited support for PEP 440-compatible local version
testing. Our behavior is _not_ comprehensively in line with the spec.
However, it does fix by _far_ the biggest practical limitation, and
resolves all the issues that've been raised on uv related to local
versions without introducing much complexity into the resolver, so it
feels like a good tradeoff to me.
I'll summarize the change here, but for more context, see [Andrew's
write-up](https://github.com/astral-sh/uv/issues/1855#issuecomment-1967024866)
in the linked issue.
Local version identifiers are really tricky because of asymmetry.
`==1.2.3` should allow `1.2.3+foo`, but `==1.2.3+foo` should not allow
`1.2.3`. It's very hard to map them to PubGrub, because PubGrub doesn't
think of things in terms of individual specifiers (unlike the PEP 440
spec) -- it only thinks in terms of ranges.
Right now, resolving PyTorch and friends fails, because...
- The user provides requirements like `torch==2.0.0+cu118` and
`torchvision==0.15.1+cu118`.
- We then match those exact versions.
- We then look at the requirements of `torchvision==0.15.1+cu118`, which
includes `torch==2.0.0`.
- Under PEP 440, this is fine, because `torch @ 2.0.0+cu118` should be
compatible with `torch==2.0.0`.
- In our model, though, it's not, because these are different versions.
If we change our comparison logic in various places to allow this, we
risk breaking some fundamental assumptions of PubGrub around version
continuity.
- Thus, we fail to resolve, because we can't accept both `torch @ 2.0.0`
and `torch @ 2.0.0+cu118`.
As compared to the solutions we explored in
https://github.com/astral-sh/uv/issues/1855#issuecomment-1967024866, at
a high level, this approach differs in that we lie about the
_dependencies_ of packages that rely on our local-version-using package,
rather than lying about the versions that exist, or the version we're
returning, etc.
In short:
- When users specify local versions upfront, we keep track of them. So,
above, we'd take note of `torch` and `torchvision`.
- When we convert the dependencies of a package to PubGrub ranges, we
check if the requirement matches `torch` or `torchvision`. If it's
an `==`, we check whether it matches (in the above example)
`torch==2.0.0`. If so, we _change_ the requirement to
`torch==2.0.0+cu118`. (If it's `==` some other version, we return an
incompatibility.)
In other words, we selectively override the declared dependencies by
making them _more specific_ if a compatible local version was specified
upfront.
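A hedged sketch of that rewrite, with hypothetical names and strings standing in for uv's real requirement and version types:
```rust
use std::collections::HashMap;

/// Illustrative only: rewrite an `==` pin to the user-specified local
/// version when the releases match; report an incompatibility otherwise.
fn override_pin<'a>(
    name: &str,
    pinned: &'a str,                 // e.g. "2.0.0" from `torch==2.0.0`
    locals: &HashMap<&str, &'a str>, // e.g. {"torch": "2.0.0+cu118"}
) -> Option<&'a str> {
    match locals.get(name).copied() {
        // `torch==2.0.0` becomes `torch==2.0.0+cu118`.
        Some(local) if local.split('+').next() == Some(pinned) => Some(local),
        // An `==` pin on some other version is an incompatibility.
        Some(_) => None,
        // No local version was specified upfront: keep the pin as-is.
        None => Some(pinned),
    }
}

fn main() {
    let locals = HashMap::from([("torch", "2.0.0+cu118")]);
    assert_eq!(override_pin("torch", "2.0.0", &locals), Some("2.0.0+cu118"));
    assert_eq!(override_pin("torch", "1.9.0", &locals), None);
}
```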
The net effect here is that the motivating PyTorch resolutions all work.
And, in general, transitive local versions work as expected.
The thing that still _doesn't_ work is: imagine if there were _only_
local versions of `torch` available. Like, `torch @ 2.0.0` didn't exist,
but `torch @ 2.0.0+cpu` did, and `torch @ 2.0.0+gpu` did, and so on.
`pip install torch==2.0.0` would arbitrarily choose one of `2.0.0+cpu`
or `2.0.0+gpu`, and that's correct as per PEP 440 (local version
segments should be completely ignored when evaluating `torch==2.0.0`).
However, uv
would fail to identify a compatible version. I'd _probably_ prefer to
fix this, although candidly I think our behavior is _ok_ in practice,
and it's never been reported as an issue.
Closes https://github.com/astral-sh/uv/issues/1855.
Closes https://github.com/astral-sh/uv/issues/2080.
Closes https://github.com/astral-sh/uv/issues/2328.
## Summary
This PR attempts to use a similar trick to that we added in
https://github.com/astral-sh/uv/pull/1878, but for post-releases.
In https://github.com/astral-sh/uv/pull/1878, we added a fake "minimum"
version to enable us to treat `< 1.0.0` as _excluding_ pre-releases of
1.0.0.
Today, on `main`, we accept post-releases and local versions in `>
1.0.0`. But per PEP 440, that should _exclude_ post-releases and local
versions, unless the specifier is itself a post-release, in which case,
post-releases are allowed (e.g., `> 1.0.0.post0` should allow
`1.0.0.post1`).
To support this, we add a fake "maximum" version that's greater than all
the post and local releases for a given version. This leverages our last
remaining free bit in the compact representation.
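A sketch of the resulting order (the real encoding packs this into the version's bit representation):
```rust
/// Illustrative only: a suffix rank in which `Max` sorts above every
/// post and local release of the same release number, just as `Min`
/// (from the earlier pre-release trick) sorts below every pre-release.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Suffix {
    Min,
    Dev(u64),
    Pre(u64),
    Release,
    Post(u64),
    Max,
}

fn main() {
    // `> 1.0.0` maps to `> (1.0.0, Max)`, so `1.0.0.post1`, which sorts
    // below `Max`, is correctly excluded...
    assert!(Suffix::Post(1) < Suffix::Max);
    // ...while `> 1.0.0.post0` maps to a plain post bound and still
    // admits `1.0.0.post1`.
    assert!(Suffix::Post(0) < Suffix::Post(1));
}
```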
In this context, we already know (as the comment says) that `self` does
not have a local segment, so we don't need to strip it.
This change isn't motivated by anything other than keeping the code and
the comment in sync. For example, when I first looked at it, I wondered
whether the extra stripping was somehow necessary. But it isn't.
Address a few pedantic lints. The lints are separated into separate
commits so they can be reviewed individually.
I've not added enforcement for any of these lints, but that could be
added if desirable.
## Summary
Even when pre-releases are "allowed", per PEP 440, `pydantic<2.0.0`
should _not_ include pre-releases. This PR modifies the specifier
translation to treat `pydantic<2.0.0` as `pydantic<2.0.0.min0`, where
`min` is an internal-only version segment that's invisible to users.
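A sketch of the translation, with a tuple standing in for the packed representation:
```rust
use std::ops::Bound;

/// Illustrative only: suffix ranks where `Min` sorts below dev and pre
/// releases of the same version.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Seg {
    Min(u64),
    Dev(u64),
    Pre(u64),
    Release,
}

/// `<2.0.0` becomes an exclusive upper bound at `2.0.0.min0`.
fn less_than(release: (u64, u64, u64)) -> Bound<((u64, u64, u64), Seg)> {
    Bound::Excluded((release, Seg::Min(0)))
}

fn main() {
    let Bound::Excluded(upper) = less_than((2, 0, 0)) else {
        unreachable!()
    };
    // `2.0.0a1` sorts above `2.0.0.min0`, so it falls outside `<2.0.0`,
    // even when pre-releases are otherwise allowed.
    assert!(((2, 0, 0), Seg::Pre(1)) > upper);
}
```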
Closes https://github.com/astral-sh/uv/issues/1641.
A couple moons ago, I introduced an optimization for version comparisons
by devising a format where *most* versions would be represented by a
single `u64`. This in turn meant most comparisons (of which many are
done during resolution) would be extremely cheap.
Unfortunately, when I did that, I screwed up the preservation of
ordering as defined by the [Version Specifiers spec]. I think I messed
it up because I had originally devised the representation so that we
could pack things like `1.2.3.dev1.post5`, but later realized it would
be better to limit ourselves to a single suffix. However, I never
updated the binary encoding to better match "up to 4 release versions
and up to precisely 1 suffix." Because of that, there were cases where
versions weren't ordered correctly. For example, this fixes a bug where
`1.0a2 < 1.0dev2`, even though all dev releases should order before
pre-releases.
We also update a test so that it catches these kinds of bugs in the
future. (By testing all pairs of versions in a sequence instead of just
the adjacent versions.)
[Version Specifiers spec]:
https://packaging.python.org/en/latest/specifications/version-specifiers/#summary-of-permitted-suffixes-and-relative-ordering
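A sketch of the strengthened check (a hypothetical helper, not the actual test code):
```rust
/// Illustrative only: assert that every pair in an expected-sorted
/// sequence is ordered, not just adjacent elements, so bugs between
/// distant suffix kinds (like dev vs. pre) can't hide.
fn assert_strictly_increasing<T: PartialOrd + std::fmt::Debug>(sorted: &[T]) {
    for (i, a) in sorted.iter().enumerate() {
        for b in &sorted[i + 1..] {
            assert!(a < b, "expected {a:?} to sort before {b:?}");
        }
    }
}

fn main() {
    // With parsed versions for e.g. ["1.0.dev2", "1.0a2", "1.0", "1.0.post1"],
    // the dev-versus-pre bug above would now trip the assertion.
    assert_strictly_increasing(&[1, 2, 3, 5, 8]);
}
```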
This PR adds initial support for [rkyv] to puffin. In particular,
the main aim here is to make puffin-client's `SimpleMetadata` type
possible to deserialize from a `&[u8]` without doing any copies. This
PR **stops short of actually doing that zero-copy deserialization**.
Instead, this PR is about adding the necessary trait impls to a variety
of types, along with a smattering of small refactorings to make rkyv
possible to use.
For those unfamiliar, rkyv works via the interplay of three traits:
`Archive`, `Serialize` and `Deserialize`. The usual flow of things is
this:
* Make a type `T` implement `Archive`, `Serialize` and `Deserialize`.
rkyv helpfully provides `derive` macros to make this pretty painless in
most cases.
* The process of implementing `Archive` for `T` *usually* creates an
entirely new distinct type within the same namespace. One can refer to
this type without naming it explicitly via `Archived<T>` (where
`Archived` is a clever type alias defined by rkyv).
* Serialization happens from `T` to (conceptually) a `Vec<u8>`. The
serialization format is specifically designed to reflect the in-memory
layout of `Archived<T>`. Notably, *not* `T`. But `Archived<T>`.
* One can then get an `Archived<T>` with no copying (albeit, we will
likely need to incur some cost for validation) from the previously
created `&[u8]`. This is quite literally [implemented as a pointer
cast][rkyv-ptr-cast].
* The problem with an `Archived<T>` is that it isn't your `T`. It's
something else. And while there is limited interoperability between a
`T` and an `Archived<T>`, the main issue is that the surrounding code
generally demands a `T` and not an `Archived<T>`. **This is at the heart
of the tension for introducing zero-copy deserialization, and this is
mostly an intrinsic problem to the technique and not an rkyv-specific
issue.** For this reason, given an `Archived<T>`, one can get a `T` back
via an explicit deserialization step. This step is like any other kind
of deserialization, although generally faster since no real "parsing" is
required. But it will allocate and create all necessary objects.
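Putting the flow together, here is a minimal sketch, assuming rkyv 0.7's API and an illustrative type rather than the real `SimpleMetadata`:
```rust
use rkyv::{Archive, Deserialize, Serialize};

#[derive(Archive, Serialize, Deserialize, Debug, PartialEq)]
struct Metadata {
    name: String,
    versions: Vec<String>,
}

fn main() {
    let meta = Metadata {
        name: "numpy".to_string(),
        versions: vec!["1.26.0".to_string()],
    };

    // Serialization: bytes laid out like `Archived<Metadata>`, not `Metadata`.
    let bytes = rkyv::to_bytes::<_, 256>(&meta).unwrap();

    // Zero-copy access: quite literally a pointer cast into the buffer.
    // (Validated, safe access is available behind rkyv's `validation` feature.)
    let archived = unsafe { rkyv::archived_root::<Metadata>(&bytes[..]) };
    assert_eq!(archived.name.as_str(), "numpy");

    // The explicit, allocating deserialization step back to a real `Metadata`.
    let roundtrip: Metadata = archived.deserialize(&mut rkyv::Infallible).unwrap();
    assert_eq!(roundtrip, meta);
}
```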
This PR largely proceeds by deriving the three aforementioned traits
for `SimpleMetadata`. And, of course, all of its type dependencies. But
we stop there for now.
The main issue with carrying this work forward so that rkyv is actually
used to deserialize a `SimpleMetadata` is figuring out how to deal
with `DataWithCachePolicy` inside of the cached client. Ideally, this
type would itself have rkyv support, but adding it is difficult. The
main difficulty lies in the fact that its `CachePolicy` type is opaque,
not easily constructable, and is internally the tip of the iceberg of
a rat's nest of types found in other crates such as `http`. While one
"dumb"-but-annoying approach would be to fork both of those crates
and add rkyv trait impls to all necessary types, it is my belief that
this is the wrong approach. What we'd *like* to do is not just use
rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to
get an `Archived<DataWithCachePolicy>` and make actual decisions using
the archived type directly. Doing that will require some work to make
`Archived<DataWithCachePolicy>` directly useful.
My suspicion is that, after doing the above, we may want to mush
forward with a similar approach for `SimpleMetadata`. That is, we want
`Archived<SimpleMetadata>` to be as useful as possible. But right
now, the structure of the code demands an eager conversion (and thus
deserialization) into a `SimpleMetadata` and then into a `VersionMap`.
Getting rid of that eagerness is, I think, the next step after dealing
with `DataWithCachePolicy` to unlock bigger wins here.
There are many commits in this PR, but most are tiny. I still encourage
review to happen commit-by-commit.
[rkyv]: https://rkyv.org/
[rkyv-ptr-cast]:
https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68
## Summary
This PR adds a release workflow powered by `cargo-dist`. It's similar to
the version that's PR'd in Ruff
(https://github.com/astral-sh/ruff/pull/9559), with the exception that
it doesn't include the Docker build or the "update dependents" step for
pre-commit.
## Summary
I'm running into some annoyances converting `&Version` to
`&PubGrubVersion` (which is just a wrapper type around `Version`), and I
realized... We don't even need `PubGrubVersion`?
The reason we "need" it today is due to the orphan trait rule: `Version`
is defined in `pep440_rs`, but we want to `impl
pubgrub::version::Version for Version` in the resolver crate.
Instead of introducing a new type here, which leads to a lot of
awkwardness around conversion and API isolation, what if we instead just
implement `pubgrub::version::Version` in `pep440_rs` via a feature? That
way, we can just use `Version` everywhere without any confusion and
conversion for the wrapper type.
Looking at the profile for tf-models-nightly after #789,
`compare_release` is the single biggest item. Adding a fast path, we
avoid paying the cost of padding releases with 0s when they are the
same length, resulting in a 16% speedup for this pathological case. Note
that
this mainly happens because tf-models-nightly is almost all large dev
releases that hit the slow path.
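The fast path looks roughly like this (a sketch over raw slices, not the actual packed representation):
```rust
use std::cmp::Ordering;

/// Illustrative only: compare releases directly when the segment counts
/// match, skipping the zero-padding that PEP 440 otherwise requires.
fn compare_release(a: &[u64], b: &[u64]) -> Ordering {
    if a.len() == b.len() {
        // Fast path: plain lexicographic slice comparison.
        return a.cmp(b);
    }
    // Slow path: treat missing trailing segments as zeros.
    let get = |xs: &[u64], i: usize| xs.get(i).copied().unwrap_or(0);
    (0..a.len().max(b.len()))
        .map(|i| get(a, i).cmp(&get(b, i)))
        .find(|ord| ord.is_ne())
        .unwrap_or(Ordering::Equal)
}

fn main() {
    assert_eq!(compare_release(&[1, 2, 3], &[1, 2, 4]), Ordering::Less);
    assert_eq!(compare_release(&[1, 2], &[1, 2, 0]), Ordering::Equal);
}
```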
**Before** / **After**: profiling flamegraphs (images not included).
```
$ hyperfine --warmup 1 --runs 3 \
    "target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt" \
    "target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt"
Benchmark 1: target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt
Time (mean ± σ): 11.963 s ± 0.225 s [User: 11.478 s, System: 0.451 s]
Range (min … max): 11.747 s … 12.196 s 3 runs
Benchmark 2: target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt
Time (mean ± σ): 10.317 s ± 0.720 s [User: 9.885 s, System: 0.404 s]
Range (min … max): 9.501 s … 10.860 s 3 runs
Summary
target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt ran
1.16 ± 0.08 times faster than target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt
```
This PR builds on #780 by making both version parsing faster, and
perhaps more importantly, making version comparisons much faster.
Overall, these changes result in a considerable improvement for the
`boto3.in` workload. Here's the status quo:
```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 34.56s
real 34.579
user 34.004
sys 0.413
maxmem 2867 MB
faults 0
```
And now with this PR:
```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 9.20s
real 9.218
user 8.919
sys 0.165
maxmem 463 MB
faults 0
```
This particular workload gets stuck in pubgrub doing resolution, and
thus benefits mightily from a faster `Version::cmp` routine. With that
said, this change does also help a fair bit with "normal" runs:
```
$ hyperfine -w10 \
"puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in" \
"puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in"
Benchmark 1: puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
Time (mean ± σ): 337.5 ms ± 3.9 ms [User: 310.5 ms, System: 73.2 ms]
Range (min … max): 333.6 ms … 343.4 ms 10 runs
Benchmark 2: puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
Time (mean ± σ): 189.8 ms ± 3.0 ms [User: 168.1 ms, System: 78.4 ms]
Range (min … max): 185.0 ms … 196.2 ms 15 runs
Summary
puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in ran
1.78 ± 0.03 times faster than puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
```
There is perhaps some future work here (detailed in the commit
messages), but I suspect it would be more fruitful to explore ways of
making resolution itself and/or deserialization faster.
Fixes #373, Closes #396
This PR does a bit of refactoring to the pep440 crate, and in
particular around the error types. This PR is meant to be a precursor
to another PR that does some surgery (both in parsing and in `Version`
representation) that benefits somewhat from this refactoring.
As usual, please review commit-by-commit.
The high level goal here is to improve the tests for the version parser.
Namely, we now check not just that version strings parse successfully,
but that they parse to the expected result.
We also do a few other cleanups. Most notably, `Version` is now an
opaque type so that we can more easily change its representation going
forward.
Reviewing commit-by-commit is suggested. :-)
Two low-hanging fruits as optimizations for version parsing: a fast path
for release-only versions, and removing the regex from version
specifiers (still calling into version's parsing regex if required).
This also enables optimizing the serde format, since profiles now show
the serde part instead of only PEP 440 parsing. I intentionally didn't
rewrite the full PEP 440 parser at this step.
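The release-only fast path is roughly this shape (a sketch; the real parser feeds into the packed version representation):
```rust
/// Illustrative only: accept `1.2.3`-style release-only versions on a
/// cheap path; anything else falls back to the full PEP 440 parser.
fn parse_release_only(s: &str) -> Option<Vec<u64>> {
    s.split('.')
        .map(|segment| {
            (!segment.is_empty() && segment.bytes().all(|b| b.is_ascii_digit()))
                .then(|| segment.parse::<u64>().ok())
                .flatten()
        })
        .collect()
}

fn main() {
    assert_eq!(parse_release_only("1.26.0"), Some(vec![1, 26, 0]));
    assert_eq!(parse_release_only("1.0b2"), None); // fall back to the full parser
}
```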
```console
$ hyperfine --warmup 5 --runs 50 "target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in" "target/profiling/main pip-compile scripts/requirements/transformers-extras.in"
Benchmark 1: target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in
Time (mean ± σ): 217.1 ms ± 3.2 ms [User: 194.0 ms, System: 55.1 ms]
Range (min … max): 211.0 ms … 228.1 ms 50 runs
Benchmark 2: target/profiling/main pip-compile scripts/requirements/transformers-extras.in
Time (mean ± σ): 276.7 ms ± 5.7 ms [User: 252.4 ms, System: 54.6 ms]
Range (min … max): 268.9 ms … 303.5 ms 50 runs
Summary
target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in ran
1.27 ± 0.03 times faster than target/profiling/main pip-compile scripts/requirements/transformers-extras.in
```
---------
Co-authored-by: Andrew Gallant <andrew@astral.sh>
Currently, `dbg!` is hard to read because versions are verbose, showing
all optional fields, and we have a lot of versions. Changing the debug
formatting to display the version number (which can be losslessly
converted to the struct and back) makes this more readable.
See e.g.
https://gist.github.com/konstin/38c0f32b109dffa73b3aa0ab86b9662b
**Before**
```text
version: Version {
    epoch: 0,
    release: [
        1,
        2,
        3,
    ],
    pre: None,
    post: None,
    dev: None,
    local: None,
},
```
**After**
```text
version: "1.2.3",
```
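The mechanism is just delegating `Debug` to the existing `Display` impl; a minimal sketch with a stand-in type:
```rust
use std::fmt;

/// Illustrative only: a stand-in for the real `Version`.
struct Version {
    release: Vec<u64>,
}

impl fmt::Display for Version {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let segments: Vec<String> = self.release.iter().map(u64::to_string).collect();
        f.write_str(&segments.join("."))
    }
}

// Since the string form converts losslessly back to the struct, `Debug`
// can simply quote `Display`.
impl fmt::Debug for Version {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "\"{self}\"")
    }
}

fn main() {
    let version = Version { release: vec![1, 2, 3] };
    assert_eq!(format!("{version:?}"), "\"1.2.3\"");
}
```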
It turns out that it's not uncommon to use timestamps as patch versions
(e.g., `20230628214621`). I believe this is the ISO 8601 "basic format".
These can't be represented by a `u32`, so I think it makes sense to just
bump to `u64` to remove this limitation.
Filter out source dists and wheels whose `requires-python` from the
simple API is incompatible with the current Python version.
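A hedged sketch of the filter, reducing `requires-python` to a lower bound (the real check evaluates full specifier sets):
```rust
/// Illustrative only: drop distributions whose `requires-python` (as
/// reported by the simple API) excludes the interpreter in use.
struct File {
    filename: String,
    requires_python: Option<(u64, u64)>, // stand-in for a specifier set
}

fn compatible_files(files: Vec<File>, python: (u64, u64)) -> Vec<File> {
    files
        .into_iter()
        .filter(|file| file.requires_python.map_or(true, |min| python >= min))
        .collect()
}

fn main() {
    let files = vec![
        File {
            filename: "pkg-2.0-py3-none-any.whl".to_string(),
            requires_python: Some((3, 8)),
        },
        File {
            filename: "pkg-1.0.tar.gz".to_string(),
            requires_python: None,
        },
    ];
    // On Python 3.7, only the unconstrained sdist survives.
    let kept = compatible_files(files, (3, 7));
    assert_eq!(kept.len(), 1);
    assert_eq!(kept[0].filename, "pkg-1.0.tar.gz");
}
```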
This change showed an important problem: When we use a fake python
version for resolving, building source distributions breaks down because
we can only build with versions we actually have.
This change became surprisingly big. The tests now require python 3.7 to
be installed, but changing that would mean an even bigger change.
Fixes #388
This copies the allocator configuration used in the Ruff project. In
particular, this gives us an instant 10% win when resolving the top 1K
PyPI packages:
$ hyperfine \
    "./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null" \
    "./target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null"

Benchmark 1: ./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
  Time (mean ± σ): 974.2 ms ± 26.4 ms [User: 17503.3 ms, System: 2205.3 ms]
  Range (min … max): 943.5 ms … 1015.9 ms 10 runs

Benchmark 2: ./target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
  Time (mean ± σ): 883.1 ms ± 23.3 ms [User: 14626.1 ms, System: 2542.2 ms]
  Range (min … max): 849.5 ms … 916.9 ms 10 runs

Summary
  './target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null' ran
  1.10 ± 0.04 times faster than './target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null'
I was moved to do this because I noticed `malloc`/`free` taking up a
fairly sizeable percentage of time during light profiling.
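The pattern itself is small; a sketch with `mimalloc` (the copied configuration gates allocators per platform and may differ in detail):
```rust
// A hedged sketch: install a faster global allocator for the binary.
#[cfg(not(target_env = "msvc"))]
#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

fn main() {
    // Every heap allocation below now goes through the custom allocator.
    let versions: Vec<String> = (0..3).map(|i| format!("1.0.{i}")).collect();
    assert_eq!(versions.len(), 3);
}
```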
As is becoming a pattern, it will be easier to review this
commit-by-commit.
Ref #396 (wouldn't call this issue fixed)
-----
I did also try adding a `smallvec` optimization to the
`Version::release` field, but it didn't bear any fruit. I still think
there is more to explore since the results I observed don't quite line
up with what I expect. (So probably either my mental model is off or my
measurement process is flawed.) You can see that attempt with a little
more explanation here:
f9528b4ecd
In the course of adding the `smallvec` optimization, I also shrunk the
`Version` fields from a `usize` to a `u32`. They should at least be a
fixed size integer since version numbers aren't used to index memory,
and I shrunk it to `u32` since it seems reasonable to assume that all
version numbers will be smaller than `2^32`.
Ran `cargo upgrade --incompatible`; it seems there are no changes required.
From cacache 0.12.0:
> BREAKING CHANGE: some signatures for copy have changed, and copy no
longer automatically reflinks
`which` 5.0.0 seems to have only error message changes.
We now accept a pre-release if (1) all versions are pre-releases, or (2)
there was a pre-release marker in the dependency specifiers for a direct
dependency.
The code is written such that we can support a variety of pre-release
strategies.
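Sketched as an enum (names illustrative, not necessarily the ones in the code):
```rust
/// Illustrative only: the accepted rule, (1) or (2), as one strategy
/// among several the resolver could support.
enum PreReleaseStrategy {
    Disallow,              // never select pre-releases
    Allow,                 // always consider them
    IfNecessary,           // (1) only when all versions are pre-releases
    Explicit,              // (2) only with a pre-release marker in a direct
                           //     dependency's specifiers
    IfNecessaryOrExplicit, // (1) or (2): the behavior described above
}

fn main() {
    let _default = PreReleaseStrategy::IfNecessaryOrExplicit;
}
```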
Closes https://github.com/astral-sh/puffin/issues/191.
Previously, we had two Python interpreter metadata structs, one in
gourgeist and one in puffin. Both would spawn a subprocess to query
overlapping metadata, and both would appear in the CLI crate; if you
weren't careful, you could even have two different base interpreters at
once. This change unifies this to one set of metadata, queried and
cached once.
Another effect of this change is a proper separation of Python
interpreter and venv. A base interpreter (such as `/usr/bin/python`, but
also pyenv- and conda-installed Pythons) has a set of metadata. A venv
has a root and inherits the base Python metadata, except for
`sys.prefix`, which, unlike `sys.base_prefix`, gets set to the venv
root. From the root and the interpreter info, we can compute the paths
inside the venv, and we can reuse the interpreter info of the base
interpreter when creating a venv, without having to query the newly
created `python`.
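A sketch of that reuse (fields illustrative):
```rust
use std::path::{Path, PathBuf};

/// Illustrative only: the unified interpreter metadata.
#[derive(Clone)]
struct InterpreterInfo {
    prefix: PathBuf,      // sys.prefix
    base_prefix: PathBuf, // sys.base_prefix
    version: (u64, u64),
}

/// A venv inherits the base interpreter's metadata with only `prefix`
/// rebased onto the venv root, so no subprocess query of the freshly
/// created `python` is needed.
fn venv_interpreter(base: &InterpreterInfo, venv_root: &Path) -> InterpreterInfo {
    InterpreterInfo {
        prefix: venv_root.to_path_buf(),
        ..base.clone()
    }
}

fn main() {
    let base = InterpreterInfo {
        prefix: PathBuf::from("/usr"),
        base_prefix: PathBuf::from("/usr"),
        version: (3, 12),
    };
    let venv = venv_interpreter(&base, Path::new("/home/ferris/.venv"));
    assert_eq!(venv.prefix, Path::new("/home/ferris/.venv"));
    assert_eq!(venv.base_prefix, base.base_prefix);
    assert_eq!(venv.version, (3, 12));
}
```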
This isn't ready, but it can resolve
`meine_stadt_transparent==0.2.14`.
The source distributions are currently being built serially, one after
the other; I don't know whether that is incidental, due to the
resolution order, because sdist building is blocking, or because of
something in the resolver that could be improved.
It's a bit annoying that the thing that was supposed to do HTTP requests
now suddenly also has to do a whole download/unpack/resolve/install/build
routine; it messes up the type hierarchy. The much bigger problem,
though, is avoiding recursive crate dependencies, which is the reason
for the callback and for splitting the builder into two crates (badly
named atm).