mirror of
https://github.com/astral-sh/uv.git
synced 2025-11-03 13:14:41 +00:00
This PR adds initial support for [rkyv] to puffin. In particular, the main aim here is to make puffin-client's `SimpleMetadata` type possible to deserialize from a `&[u8]` without doing any copies. This PR **stops short of actuallying doing that zero-copy deserialization**. Instead, this PR is about adding the necessary trait impls to a variety of types, along with a smattering of small refactorings to make rkyv possible to use. For those unfamiliar, rkyv works via the interplay of three traits: `Archive`, `Serialize` and `Deserialize`. The usual flow of things is this: * Make a type `T` implement `Archive`, `Serialize` and `Deserialize`. rkyv helpfully provides `derive` macros to make this pretty painless in most cases. * The process of implementing `Archive` for `T` *usually* creates an entirely new distinct type within the same namespace. One can refer to this type without naming it explicitly via `Archived<T>` (where `Archived` is a clever type alias defined by rkyv). * Serialization happens from `T` to (conceptually) a `Vec<u8>`. The serialization format is specifically designed to reflect the in-memory layout of `Archived<T>`. Notably, *not* `T`. But `Archived<T>`. * One can then get an `Archived<T>` with no copying (albeit, we will likely need to incur some cost for validation) from the previously created `&[u8]`. This is quite literally [implemented as a pointer cast][rkyv-ptr-cast]. * The problem with an `Archived<T>` is that it isn't your `T`. It's something else. And while there is limited interoperability between a `T` and an `Archived<T>`, the main issue is that the surrounding code generally demands a `T` and not an `Archived<T>`. **This is at the heart of the tension for introducing zero-copy deserialization, and this is mostly an intrinsic problem to the technique and not an rkyv-specific issue.** For this reason, given an `Archived<T>`, one can get a `T` back via an explicit deserialization step. This step is like any other kind of deserialization, although generally faster since no real "parsing" is required. But it will allocate and create all necessary objects. This PR largely proceeds by deriving the three aforementioned traits for `SimpleMetadata`. And, of course, all of its type dependencies. But we stop there for now. The main issue with carrying this work forward so that rkyv is actually used to deserialize a `SimpleMetadata` is figuring out how to deal with `DataWithCachePolicy` inside of the cached client. Ideally, this type would itself have rkyv support, but adding it is difficult. The main difficulty lay in the fact that its `CachePolicy` type is opaque, not easily constructable and is internally the tip of the iceberg of a rat's nest of types found in more crates such as `http`. While one "dumb"-but-annoying approach would be to fork both of those crates and add rkyv trait impls to all necessary types, it is my belief that this is the wrong approach. What we'd *like* to do is not just use rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to get an `Archived<DataWithCachePolicy>` and make actual decisions used the archived type directly. Doing that will require some work to make `Archived<DataWithCachePolicy>` directly useful. My suspicion is that, after doing the above, we may want to mush forward with a similar approach for `SimpleMetadata`. That is, we want `Archived<SimpleMetadata>` to be as useful as possible. But right now, the structure of the code demands an eager conversion (and thus deserialization) into a `SimpleMetadata` and then into a `VersionMap`. Getting rid of that eagerness is, I think, the next step after dealing with `DataWithCachePolicy` to unlock bigger wins here. There are many commits in this PR, but most are tiny. I still encourage review to happen commit-by-commit. [rkyv]: https://rkyv.org/ [rkyv-ptr-cast]: https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68 |
||
|---|---|---|
| .. | ||
| python | ||
| src | ||
| test | ||
| Cargo.lock | ||
| Cargo.toml | ||
| CHANGELOG.md | ||
| License-Apache | ||
| License-BSD | ||
| Readme.md | ||
PEP440 in rust
A library for python version numbers and specifiers, implementing PEP 440. See Reimplementing PEP 440 for some background.
Higher level bindings to the requirements syntax are available in pep508_rs.
use std::str::FromStr;
use pep440_rs::{parse_version_specifiers, Version, VersionSpecifier};
let version = Version::from_str("1.19").unwrap();
let version_specifier = VersionSpecifier::from_str("==1.*").unwrap();
assert!(version_specifier.contains(&version));
let version_specifiers = parse_version_specifiers(">=1.16, <2.0").unwrap();
assert!(version_specifiers.iter().all(|specifier| specifier.contains(&version)));
In python (pip install pep440_rs):
from pep440_rs import Version, VersionSpecifier
assert Version("1.1a1").any_prerelease()
assert Version("1.1.dev2").any_prerelease()
assert not Version("1.1").any_prerelease()
assert VersionSpecifier(">=1.0").contains(Version("1.1a1"))
assert not VersionSpecifier(">=1.1").contains(Version("1.1a1"))
# Note that python comparisons are the version ordering, not the version specifiers operators
assert Version("1.1") >= Version("1.1a1")
assert Version("2.0") in VersionSpecifier("==2")
PEP 440 has a lot of unintuitive features, including:
- An epoch that you can prefix the version which, e.g.
1!1.2.3. Lower epoch always means lower version (1.0 <=2!0.1) - post versions, which can be attached to both stable releases and prereleases
- dev versions, which can be attached to sbpth table releases and prereleases. When attached to a prerelease the dev version is ordered just below the normal prerelease, however when attached to a stable version, the dev version is sorted before a prereleases
- prerelease handling is a mess: "Pre-releases of any kind, including developmental releases, are implicitly excluded from all version specifiers, unless they are already present on the system, explicitly requested by the user, or if the only available version that satisfies the version specifier is a pre-release.". This means that we can't say whether a specifier matches without also looking at the environment
- prelease vs. prerelease incl. dev is fuzzy
- local versions on top of all the others, which are added with a + and have implicitly typed string and number segments
- no semver-caret (
^), but a pseudo-semver tilde (~=) - ordering contradicts matching: We have e.g.
1.0+local > 1.0when sorting, but==1.0matches1.0+local. While the ordering of versions itself is a total order the version matching needs to catch all sorts of special cases