uv/crates/pep440-rs
Andrew Gallant 5219d37250
add initial rkyv support (#1135)
This PR adds initial support for [rkyv] to puffin. In particular,
the main aim here is to make puffin-client's `SimpleMetadata` type
possible to deserialize from a `&[u8]` without doing any copies. This
PR **stops short of actuallying doing that zero-copy deserialization**.
Instead, this PR is about adding the necessary trait impls to a variety
of types, along with a smattering of small refactorings to make rkyv
possible to use.

For those unfamiliar, rkyv works via the interplay of three traits:
`Archive`, `Serialize` and `Deserialize`. The usual flow of things is
this:

* Make a type `T` implement `Archive`, `Serialize` and `Deserialize`.
rkyv
helpfully provides `derive` macros to make this pretty painless in most
  cases.
* The process of implementing `Archive` for `T` *usually* creates an
entirely
new distinct type within the same namespace. One can refer to this type
without naming it explicitly via `Archived<T>` (where `Archived` is a
clever
  type alias defined by rkyv).
* Serialization happens from `T` to (conceptually) a `Vec<u8>`. The
serialization format is specifically designed to reflect the in-memory
layout
  of `Archived<T>`. Notably, *not* `T`. But `Archived<T>`.
* One can then get an `Archived<T>` with no copying (albeit, we will
likely
need to incur some cost for validation) from the previously created
`&[u8]`.
This is quite literally [implemented as a pointer cast][rkyv-ptr-cast].
* The problem with an `Archived<T>` is that it isn't your `T`. It's
something
  else. And while there is limited interoperability between a `T` and an
`Archived<T>`, the main issue is that the surrounding code generally
demands
a `T` and not an `Archived<T>`. **This is at the heart of the tension
for
  introducing zero-copy deserialization, and this is mostly an intrinsic
problem to the technique and not an rkyv-specific issue.** For this
reason,
  given an `Archived<T>`, one can get a `T` back via an explicit
deserialization step. This step is like any other kind of
deserialization,
although generally faster since no real "parsing" is required. But it
will
  allocate and create all necessary objects.

This PR largely proceeds by deriving the three aforementioned traits
for `SimpleMetadata`. And, of course, all of its type dependencies. But
we stop there for now.

The main issue with carrying this work forward so that rkyv is actually
used to deserialize a `SimpleMetadata` is figuring out how to deal
with `DataWithCachePolicy` inside of the cached client. Ideally, this
type would itself have rkyv support, but adding it is difficult. The
main difficulty lay in the fact that its `CachePolicy` type is opaque,
not easily constructable and is internally the tip of the iceberg of
a rat's nest of types found in more crates such as `http`. While one
"dumb"-but-annoying approach would be to fork both of those crates
and add rkyv trait impls to all necessary types, it is my belief that
this is the wrong approach. What we'd *like* to do is not just use
rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to
get an `Archived<DataWithCachePolicy>` and make actual decisions used
the archived type directly. Doing that will require some work to make
`Archived<DataWithCachePolicy>` directly useful.

My suspicion is that, after doing the above, we may want to mush
forward with a similar approach for `SimpleMetadata`. That is, we want
`Archived<SimpleMetadata>` to be as useful as possible. But right
now, the structure of the code demands an eager conversion (and thus
deserialization) into a `SimpleMetadata` and then into a `VersionMap`.
Getting rid of that eagerness is, I think, the next step after dealing
with `DataWithCachePolicy` to unlock bigger wins here.

There are many commits in this PR, but most are tiny. I still encourage
review to happen commit-by-commit.

[rkyv]: https://rkyv.org/
[rkyv-ptr-cast]:
https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68
2024-01-28 12:14:59 -05:00
..
python Unify python interpreter abstractions (#178) 2023-10-25 20:11:36 +00:00
src add initial rkyv support (#1135) 2024-01-28 12:14:59 -05:00
test Copy over pep440-rs crate (#30) 2023-10-06 20:11:52 -04:00
Cargo.lock Copy over pep440-rs crate (#30) 2023-10-06 20:11:52 -04:00
Cargo.toml add initial rkyv support (#1135) 2024-01-28 12:14:59 -05:00
CHANGELOG.md Enable release builds via cargo-dist (#79) 2023-10-09 20:48:55 +00:00
License-Apache Copy over pep440-rs crate (#30) 2023-10-06 20:11:52 -04:00
License-BSD Copy over pep440-rs crate (#30) 2023-10-06 20:11:52 -04:00
Readme.md Rename pep440-rs to Readme.md (#1014) 2024-01-19 15:16:12 -05:00

PEP440 in rust

Crates.io PyPI

A library for python version numbers and specifiers, implementing PEP 440. See Reimplementing PEP 440 for some background.

Higher level bindings to the requirements syntax are available in pep508_rs.

use std::str::FromStr;
use pep440_rs::{parse_version_specifiers, Version, VersionSpecifier};

let version = Version::from_str("1.19").unwrap();
let version_specifier = VersionSpecifier::from_str("==1.*").unwrap();
assert!(version_specifier.contains(&version));
let version_specifiers = parse_version_specifiers(">=1.16, <2.0").unwrap();
assert!(version_specifiers.iter().all(|specifier| specifier.contains(&version)));

In python (pip install pep440_rs):

from pep440_rs import Version, VersionSpecifier

assert Version("1.1a1").any_prerelease()
assert Version("1.1.dev2").any_prerelease()
assert not Version("1.1").any_prerelease()
assert VersionSpecifier(">=1.0").contains(Version("1.1a1"))
assert not VersionSpecifier(">=1.1").contains(Version("1.1a1"))
# Note that python comparisons are the version ordering, not the version specifiers operators
assert Version("1.1") >= Version("1.1a1")
assert Version("2.0") in VersionSpecifier("==2")

PEP 440 has a lot of unintuitive features, including:

  • An epoch that you can prefix the version which, e.g. 1!1.2.3. Lower epoch always means lower version (1.0 <=2!0.1)
  • post versions, which can be attached to both stable releases and prereleases
  • dev versions, which can be attached to sbpth table releases and prereleases. When attached to a prerelease the dev version is ordered just below the normal prerelease, however when attached to a stable version, the dev version is sorted before a prereleases
  • prerelease handling is a mess: "Pre-releases of any kind, including developmental releases, are implicitly excluded from all version specifiers, unless they are already present on the system, explicitly requested by the user, or if the only available version that satisfies the version specifier is a pre-release.". This means that we can't say whether a specifier matches without also looking at the environment
  • prelease vs. prerelease incl. dev is fuzzy
  • local versions on top of all the others, which are added with a + and have implicitly typed string and number segments
  • no semver-caret (^), but a pseudo-semver tilde (~=)
  • ordering contradicts matching: We have e.g. 1.0+local > 1.0 when sorting, but ==1.0 matches 1.0+local. While the ordering of versions itself is a total order the version matching needs to catch all sorts of special cases