![]() This PR adds initial support for [rkyv] to puffin. In particular, the main aim here is to make puffin-client's `SimpleMetadata` type possible to deserialize from a `&[u8]` without doing any copies. This PR **stops short of actuallying doing that zero-copy deserialization**. Instead, this PR is about adding the necessary trait impls to a variety of types, along with a smattering of small refactorings to make rkyv possible to use. For those unfamiliar, rkyv works via the interplay of three traits: `Archive`, `Serialize` and `Deserialize`. The usual flow of things is this: * Make a type `T` implement `Archive`, `Serialize` and `Deserialize`. rkyv helpfully provides `derive` macros to make this pretty painless in most cases. * The process of implementing `Archive` for `T` *usually* creates an entirely new distinct type within the same namespace. One can refer to this type without naming it explicitly via `Archived<T>` (where `Archived` is a clever type alias defined by rkyv). * Serialization happens from `T` to (conceptually) a `Vec<u8>`. The serialization format is specifically designed to reflect the in-memory layout of `Archived<T>`. Notably, *not* `T`. But `Archived<T>`. * One can then get an `Archived<T>` with no copying (albeit, we will likely need to incur some cost for validation) from the previously created `&[u8]`. This is quite literally [implemented as a pointer cast][rkyv-ptr-cast]. * The problem with an `Archived<T>` is that it isn't your `T`. It's something else. And while there is limited interoperability between a `T` and an `Archived<T>`, the main issue is that the surrounding code generally demands a `T` and not an `Archived<T>`. **This is at the heart of the tension for introducing zero-copy deserialization, and this is mostly an intrinsic problem to the technique and not an rkyv-specific issue.** For this reason, given an `Archived<T>`, one can get a `T` back via an explicit deserialization step. This step is like any other kind of deserialization, although generally faster since no real "parsing" is required. But it will allocate and create all necessary objects. This PR largely proceeds by deriving the three aforementioned traits for `SimpleMetadata`. And, of course, all of its type dependencies. But we stop there for now. The main issue with carrying this work forward so that rkyv is actually used to deserialize a `SimpleMetadata` is figuring out how to deal with `DataWithCachePolicy` inside of the cached client. Ideally, this type would itself have rkyv support, but adding it is difficult. The main difficulty lay in the fact that its `CachePolicy` type is opaque, not easily constructable and is internally the tip of the iceberg of a rat's nest of types found in more crates such as `http`. While one "dumb"-but-annoying approach would be to fork both of those crates and add rkyv trait impls to all necessary types, it is my belief that this is the wrong approach. What we'd *like* to do is not just use rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to get an `Archived<DataWithCachePolicy>` and make actual decisions used the archived type directly. Doing that will require some work to make `Archived<DataWithCachePolicy>` directly useful. My suspicion is that, after doing the above, we may want to mush forward with a similar approach for `SimpleMetadata`. That is, we want `Archived<SimpleMetadata>` to be as useful as possible. But right now, the structure of the code demands an eager conversion (and thus deserialization) into a `SimpleMetadata` and then into a `VersionMap`. Getting rid of that eagerness is, I think, the next step after dealing with `DataWithCachePolicy` to unlock bigger wins here. There are many commits in this PR, but most are tiny. I still encourage review to happen commit-by-commit. [rkyv]: https://rkyv.org/ [rkyv-ptr-cast]: https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68 |
||
---|---|---|
.. | ||
bench | ||
cache-key | ||
distribution-filename | ||
distribution-types | ||
gourgeist | ||
install-wheel-rs | ||
once-map | ||
pep440-rs | ||
pep508-rs | ||
platform-host | ||
platform-tags | ||
puffin | ||
puffin-build | ||
puffin-cache | ||
puffin-client | ||
puffin-dev | ||
puffin-dispatch | ||
puffin-distribution | ||
puffin-extract | ||
puffin-fs | ||
puffin-git | ||
puffin-installer | ||
puffin-interpreter | ||
puffin-normalize | ||
puffin-resolver | ||
puffin-traits | ||
puffin-trampoline | ||
puffin-warnings | ||
puffin-workspace | ||
pypi-types | ||
requirements-txt | ||
README.md |
Crates
bench
Functionality for benchmarking Puffin.
cache-key
Generic functionality for caching paths, URLs, and other resources across platforms.
distribution-filename
Parse built distribution (wheel) and source distribution (sdist) filenames to extract structured metadata.
distribution-types
Abstractions for representing built distributions (wheels) and source distributions (sdists), and the sources from which they can be downloaded.
gourgeist
A venv
replacement to create virtual environments in Rust.
install-wheel-rs
Install built distributions (wheels) into a virtual environment.]
once-map
A waitmap
-like concurrent hash map for executing tasks
exactly once.
pep440-rs
Utilities for interacting with Python version numbers and specifiers.
pep508-rs
Utilities for interacting with PEP 508 dependency specifiers.
platform-host
Functionality for detecting the current platform (operating system, architecture, etc.).
platform-tags
Functionality for parsing and inferring Python platform tags as per PEP 425.
puffin
Command-line interface for the Puffin package manager.
puffin-build
A PEP 517-compatible build frontend for Puffin.
puffin-cache
Functionality for caching Python packages and associated metadata.
puffin-client
Client for interacting with PyPI-compatible HTTP APIs.
puffin-dev
Development utilities for Puffin.
puffin-dispatch
A centralized struct
for resolving and building source distributions in isolated environments.
Implements the traits defined in puffin-traits
.
puffin-distribution
Client for interacting with built distributions (wheels) and source distributions (sdists). Capable of fetching metadata, distribution contents, etc.
puffin-extract
Utilities for extracting files from archives.
puffin-fs
Utilities for interacting with the filesystem.
puffin-git
Functionality for interacting with Git repositories.
puffin-installer
Functionality for installing Python packages into a virtual environment.
puffin-interpreter
Functionality for detecting and leveraging the current Python interpreter.
puffin-normalize
Normalize package and extra names as per Python specifications.
puffin-package
Types and functionality for working with Python packages, e.g., parsing wheel files.
puffin-resolver
Functionality for resolving Python packages and their dependencies.
puffin-traits
Shared traits for Puffin, to avoid circular dependencies.
pypi-types
General-purpose type definitions for types used in PyPI-compatible APIs.
puffin-warnings
User-facing warnings for Puffin.
requirements-txt
Functionality for parsing requirements.txt
files.