add initial rkyv support (#1135)

This PR adds initial support for [rkyv] to puffin. In particular,
the main aim here is to make puffin-client's `SimpleMetadata` type
possible to deserialize from a `&[u8]` without doing any copies. This
PR **stops short of actually doing that zero-copy deserialization**.
Instead, this PR is about adding the necessary trait impls to a variety
of types, along with a smattering of small refactorings to make rkyv
possible to use.

For those unfamiliar, rkyv works via the interplay of three traits:
`Archive`, `Serialize` and `Deserialize`. The usual flow of things is
this:

* Make a type `T` implement `Archive`, `Serialize` and `Deserialize`. rkyv
  helpfully provides `derive` macros to make this pretty painless in most
  cases.
* The process of implementing `Archive` for `T` *usually* creates an entirely
  new distinct type within the same namespace. One can refer to this type
  without naming it explicitly via `Archived<T>` (where `Archived` is a clever
  type alias defined by rkyv).
* Serialization happens from `T` to (conceptually) a `Vec<u8>`. The
  serialization format is specifically designed to reflect the in-memory layout
  of `Archived<T>`. Notably, *not* `T`. But `Archived<T>`.
* One can then get an `Archived<T>` with no copying (albeit, we will likely
  need to incur some cost for validation) from the previously created `&[u8]`.
  This is quite literally [implemented as a pointer cast][rkyv-ptr-cast].
* The problem with an `Archived<T>` is that it isn't your `T`. It's something
  else. And while there is limited interoperability between a `T` and an
  `Archived<T>`, the main issue is that the surrounding code generally demands
  a `T` and not an `Archived<T>`. **This is at the heart of the tension for
  introducing zero-copy deserialization, and this is mostly an intrinsic
  problem to the technique and not an rkyv-specific issue.** For this reason,
  given an `Archived<T>`, one can get a `T` back via an explicit
  deserialization step. This step is like any other kind of deserialization,
  although generally faster since no real "parsing" is required. But it will
  allocate and create all necessary objects.
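
To make the flow above concrete, here is a minimal std-only sketch of the same idea. It is an analogy and not rkyv's API: `Meta`, `MetaView`, `serialize` and the byte layout are hypothetical names invented for illustration. `MetaView` plays the role of `Archived<T>` (a validated, borrowed view over the bytes), while `deserialize` is the explicit, allocating step that gets you back an owned value. Unlike rkyv, the view here reads fields through accessors rather than through a pointer cast.

```rust
/// The owned type that surrounding code wants to work with (the `T`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Meta {
    pub size: u64,
    pub name: String,
}

/// A borrowed view over serialized bytes, standing in for `Archived<T>`.
/// Hypothetical layout: 8 bytes of little-endian `size`, then UTF-8 `name`.
#[derive(Debug, Clone, Copy)]
pub struct MetaView<'a> {
    bytes: &'a [u8],
}

/// Serialize a `Meta` into the layout that `MetaView` reads.
pub fn serialize(meta: &Meta) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + meta.name.len());
    out.extend_from_slice(&meta.size.to_le_bytes());
    out.extend_from_slice(meta.name.as_bytes());
    out
}

impl<'a> MetaView<'a> {
    /// Validate the bytes once, then hand out a view without copying them.
    pub fn check(bytes: &'a [u8]) -> Option<MetaView<'a>> {
        if bytes.len() < 8 {
            return None;
        }
        std::str::from_utf8(&bytes[8..]).ok()?;
        Some(MetaView { bytes })
    }

    /// Read `size` straight out of the underlying buffer.
    pub fn size(&self) -> u64 {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&self.bytes[..8]);
        u64::from_le_bytes(buf)
    }

    /// Borrow `name` from the underlying buffer; no allocation.
    pub fn name(&self) -> &'a str {
        // UTF-8 validity was established in `check`.
        std::str::from_utf8(&self.bytes[8..]).unwrap()
    }

    /// The explicit deserialization step: allocates to build an owned `Meta`.
    pub fn deserialize(&self) -> Meta {
        Meta {
            size: self.size(),
            name: self.name().to_string(),
        }
    }
}
```

Code that can work with `MetaView` directly never pays for the `String` allocation. Code that insists on a `Meta` has to call `deserialize`, which is the tension described in the last bullet.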

This PR largely proceeds by deriving the three aforementioned traits
for `SimpleMetadata`. And, of course, all of its type dependencies. But
we stop there for now.

The main issue with carrying this work forward so that rkyv is actually
used to deserialize a `SimpleMetadata` is figuring out how to deal
with `DataWithCachePolicy` inside of the cached client. Ideally, this
type would itself have rkyv support, but adding it is difficult. The
main difficulty lies in the fact that its `CachePolicy` type is opaque,
not easily constructable and is internally the tip of the iceberg of
a rat's nest of types found in more crates such as `http`. While one
"dumb"-but-annoying approach would be to fork both of those crates
and add rkyv trait impls to all necessary types, it is my belief that
this is the wrong approach. What we'd *like* to do is not just use
rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to
get an `Archived<DataWithCachePolicy>` and make actual decisions using
the archived type directly. Doing that will require some work to make
`Archived<DataWithCachePolicy>` directly useful.

My suspicion is that, after doing the above, we may want to mush
forward with a similar approach for `SimpleMetadata`. That is, we want
`Archived<SimpleMetadata>` to be as useful as possible. But right
now, the structure of the code demands an eager conversion (and thus
deserialization) into a `SimpleMetadata` and then into a `VersionMap`.
Getting rid of that eagerness is, I think, the next step after dealing
with `DataWithCachePolicy` to unlock bigger wins here.

There are many commits in this PR, but most are tiny. I still encourage
review to happen commit-by-commit.

[rkyv]: https://rkyv.org/
[rkyv-ptr-cast]: https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68
Andrew Gallant 2024-01-28 12:14:59 -05:00 committed by GitHub
parent c0e7668dfa
commit 5219d37250
GPG key ID: B5690EEEBB952194
33 changed files with 782 additions and 204 deletions

155
Cargo.lock generated

@ -26,6 +26,17 @@ dependencies = [
"const-random", "const-random",
] ]
[[package]]
name = "ahash"
version = "0.7.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5a824f2aa7e75a0c98c5a504fceb80649e9c35265d44525b5f94de4771a395cd"
dependencies = [
"getrandom",
"once_cell",
"version_check",
]
[[package]] [[package]]
name = "aho-corasick" name = "aho-corasick"
version = "1.1.2" version = "1.1.2"
@ -294,6 +305,18 @@ version = "2.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ed570934406eb16438a4e976b1b4500774099c13b8cb96eec99f620f05090ddf" checksum = "ed570934406eb16438a4e976b1b4500774099c13b8cb96eec99f620f05090ddf"
[[package]]
name = "bitvec"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1bc2832c24239b0141d5674bb9174f9d68a8b5b3f2753311927c172ca46f7e9c"
dependencies = [
"funty",
"radium",
"tap",
"wyz",
]
[[package]] [[package]]
name = "block-buffer" name = "block-buffer"
version = "0.10.4" version = "0.10.4"
@ -341,6 +364,28 @@ version = "3.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7f30e7476521f6f8af1a1c4c0b8cc94f0bee37d91763d0ca2665f299b6cd8aec" checksum = "7f30e7476521f6f8af1a1c4c0b8cc94f0bee37d91763d0ca2665f299b6cd8aec"
[[package]]
name = "bytecheck"
version = "0.6.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8b6372023ac861f6e6dc89c8344a8f398fb42aaba2b5dbc649ca0c0e9dbcb627"
dependencies = [
"bytecheck_derive",
"ptr_meta",
"simdutf8",
]
[[package]]
name = "bytecheck_derive"
version = "0.6.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a7ec4c6f261935ad534c0c22dbef2201b45918860eb1c574b972bd213a76af61"
dependencies = [
"proc-macro2",
"quote",
"syn 1.0.109",
]
[[package]] [[package]]
name = "byteorder" name = "byteorder"
version = "1.5.0" version = "1.5.0"
@ -712,7 +757,7 @@ version = "3.11.10"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0f260e2fc850179ef410018660006951c1b55b79e8087e87111a2c388994b9b5" checksum = "0f260e2fc850179ef410018660006951c1b55b79e8087e87111a2c388994b9b5"
dependencies = [ dependencies = [
"ahash", "ahash 0.3.8",
"cfg-if 0.1.10", "cfg-if 0.1.10",
"num_cpus", "num_cpus",
] ]
@ -802,6 +847,7 @@ dependencies = [
"pep440_rs 0.3.12", "pep440_rs 0.3.12",
"platform-tags", "platform-tags",
"puffin-normalize", "puffin-normalize",
"rkyv",
"serde", "serde",
"thiserror", "thiserror",
"url", "url",
@ -813,7 +859,6 @@ version = "0.0.1"
dependencies = [ dependencies = [
"anyhow", "anyhow",
"cache-key", "cache-key",
"chrono",
"data-encoding", "data-encoding",
"distribution-filename", "distribution-filename",
"fs-err", "fs-err",
@ -825,6 +870,7 @@ dependencies = [
"puffin-git", "puffin-git",
"puffin-normalize", "puffin-normalize",
"pypi-types", "pypi-types",
"rkyv",
"rustc-hash", "rustc-hash",
"serde", "serde",
"serde_json", "serde_json",
@ -961,6 +1007,12 @@ dependencies = [
"winapi", "winapi",
] ]
[[package]]
name = "funty"
version = "2.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e6d5a32815ae3f33302d95fdcb2ce17862f8c65363dcfd29360480ba1001fc9c"
[[package]] [[package]]
name = "futures" name = "futures"
version = "0.3.30" version = "0.3.30"
@ -1199,6 +1251,9 @@ name = "hashbrown"
version = "0.12.3" version = "0.12.3"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8a9ee70c43aaf417c914396645a0fa852624801b24ebb7ae78fe8272889ac888" checksum = "8a9ee70c43aaf417c914396645a0fa852624801b24ebb7ae78fe8272889ac888"
dependencies = [
"ahash 0.7.7",
]
[[package]] [[package]]
name = "hashbrown" name = "hashbrown"
@ -2054,6 +2109,7 @@ dependencies = [
"once_cell", "once_cell",
"pubgrub", "pubgrub",
"pyo3", "pyo3",
"rkyv",
"serde", "serde",
"tracing", "tracing",
"unicode-width", "unicode-width",
@ -2085,6 +2141,7 @@ dependencies = [
"pyo3", "pyo3",
"pyo3-log", "pyo3-log",
"regex", "regex",
"rkyv",
"serde", "serde",
"serde_json", "serde_json",
"testing_logger", "testing_logger",
@ -2270,6 +2327,26 @@ dependencies = [
"unicode-ident", "unicode-ident",
] ]
[[package]]
name = "ptr_meta"
version = "0.1.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0738ccf7ea06b608c10564b31debd4f5bc5e197fc8bfe088f68ae5ce81e7a4f1"
dependencies = [
"ptr_meta_derive",
]
[[package]]
name = "ptr_meta_derive"
version = "0.1.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "16b845dbfca988fa33db069c0e230574d15a3088f147a87b64c7589eb662c9ac"
dependencies = [
"proc-macro2",
"quote",
"syn 1.0.109",
]
[[package]] [[package]]
name = "pubgrub" name = "pubgrub"
version = "0.2.1" version = "0.2.1"
@ -2427,6 +2504,7 @@ dependencies = [
"reqwest", "reqwest",
"reqwest-middleware", "reqwest-middleware",
"reqwest-retry", "reqwest-retry",
"rkyv",
"rmp-serde", "rmp-serde",
"rustc-hash", "rustc-hash",
"serde", "serde",
@ -2672,6 +2750,7 @@ dependencies = [
name = "puffin-normalize" name = "puffin-normalize"
version = "0.0.1" version = "0.0.1"
dependencies = [ dependencies = [
"rkyv",
"serde", "serde",
] ]
@ -2851,6 +2930,7 @@ dependencies = [
"pep508_rs", "pep508_rs",
"puffin-normalize", "puffin-normalize",
"regex", "regex",
"rkyv",
"serde", "serde",
"serde_json", "serde_json",
"tempfile", "tempfile",
@ -2897,6 +2977,12 @@ version = "0.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "79ec282e887b434b68c18fe5c121d38e72a5cf35119b59e54ec5b992ea9c8eb0" checksum = "79ec282e887b434b68c18fe5c121d38e72a5cf35119b59e54ec5b992ea9c8eb0"
[[package]]
name = "radium"
version = "0.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dc33ff2d4973d518d823d61aa239014831e521c75da58e3df4840d3f47749d09"
[[package]] [[package]]
name = "rand" name = "rand"
version = "0.8.5" version = "0.8.5"
@ -3031,6 +3117,15 @@ version = "0.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c08c74e62047bb2de4ff487b251e4a92e24f48745648451635cec7d591162d9f" checksum = "c08c74e62047bb2de4ff487b251e4a92e24f48745648451635cec7d591162d9f"
[[package]]
name = "rend"
version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a2571463863a6bd50c32f94402933f03457a3fbaf697a707c5be741e459f08fd"
dependencies = [
"bytecheck",
]
[[package]] [[package]]
name = "requirements-txt" name = "requirements-txt"
version = "0.0.1" version = "0.0.1"
@ -3164,6 +3259,35 @@ dependencies = [
"windows-sys 0.48.0", "windows-sys 0.48.0",
] ]
[[package]]
name = "rkyv"
version = "0.7.43"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "527a97cdfef66f65998b5f3b637c26f5a5ec09cc52a3f9932313ac645f4190f5"
dependencies = [
"bitvec",
"bytecheck",
"bytes",
"hashbrown 0.12.3",
"ptr_meta",
"rend",
"rkyv_derive",
"seahash",
"tinyvec",
"uuid",
]
[[package]]
name = "rkyv_derive"
version = "0.7.43"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b5c462a1328c8e67e4d6dbad1eb0355dd43e8ab432c6e227a43657f16ade5033"
dependencies = [
"proc-macro2",
"quote",
"syn 1.0.109",
]
[[package]] [[package]]
name = "rmp" name = "rmp"
version = "0.8.12" version = "0.8.12"
@ -3403,6 +3527,12 @@ dependencies = [
"libc", "libc",
] ]
[[package]]
name = "simdutf8"
version = "0.1.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f27f6278552951f1f2b8cf9da965d10969b2efdea95a6ec47987ab46edfe263a"
[[package]] [[package]]
name = "similar" name = "similar"
version = "2.4.0" version = "2.4.0"
@ -3535,6 +3665,12 @@ dependencies = [
"libc", "libc",
] ]
[[package]]
name = "tap"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "55937e1799185b12863d447f42597ed69d9928686b8d88a1df17376a097d8369"
[[package]] [[package]]
name = "tar" name = "tar"
version = "0.4.40" version = "0.4.40"
@ -4068,6 +4204,12 @@ version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "711b9620af191e0cdc7468a8d14e709c3dcdb115b36f838e601583af800a370a" checksum = "711b9620af191e0cdc7468a8d14e709c3dcdb115b36f838e601583af800a370a"
[[package]]
name = "uuid"
version = "1.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f00cc9702ca12d3c81455259621e676d0f7251cec66a21e98fe2e9a37db93b2a"
[[package]] [[package]]
name = "valuable" name = "valuable"
version = "0.1.0" version = "0.1.0"
@ -4487,6 +4629,15 @@ dependencies = [
"windows-sys 0.48.0", "windows-sys 0.48.0",
] ]
[[package]]
name = "wyz"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "05f360fc0b24296329c78fda852a1e9ae82de9cf7b27dae4b7f62f118f77b9ed"
dependencies = [
"tap",
]
[[package]] [[package]]
name = "xattr" name = "xattr"
version = "1.3.1" version = "1.3.1"


@ -73,6 +73,7 @@ regex = { version = "1.10.2" }
reqwest = { version = "0.11.23", default-features = false, features = ["json", "gzip", "brotli", "stream", "rustls-tls"] } reqwest = { version = "0.11.23", default-features = false, features = ["json", "gzip", "brotli", "stream", "rustls-tls"] }
reqwest-middleware = { version = "0.2.4" } reqwest-middleware = { version = "0.2.4" }
reqwest-retry = { version = "0.3.0" } reqwest-retry = { version = "0.3.0" }
rkyv = { version = "0.7.43", features = ["strict", "validation"] }
rmp-serde = { version = "1.1.2" } rmp-serde = { version = "1.1.2" }
rustc-hash = { version = "1.1.0" } rustc-hash = { version = "1.1.0" }
same-file = { version = "1.0.6" } same-file = { version = "1.0.6" }


@ -12,11 +12,15 @@ license = { workspace = true }
[lints] [lints]
workspace = true workspace = true
[features]
rkyv = ["dep:rkyv", "pep440_rs/rkyv"]
[dependencies] [dependencies]
pep440_rs = { path = "../pep440-rs" } pep440_rs = { path = "../pep440-rs" }
platform-tags = { path = "../platform-tags" } platform-tags = { path = "../platform-tags" }
puffin-normalize = { path = "../puffin-normalize" } puffin-normalize = { path = "../puffin-normalize" }
rkyv = { workspace = true, features = ["strict", "validation"], optional = true }
serde = { workspace = true, optional = true } serde = { workspace = true, optional = true }
thiserror = { workspace = true } thiserror = { workspace = true }
url = { workspace = true } url = { workspace = true }


@ -10,6 +10,12 @@ use puffin_normalize::{InvalidNameError, PackageName};
#[derive(Clone, Debug, PartialEq, Eq)] #[derive(Clone, Debug, PartialEq, Eq)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))] #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(
feature = "rkyv",
derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
)]
#[cfg_attr(feature = "rkyv", archive(check_bytes))]
#[cfg_attr(feature = "rkyv", archive_attr(derive(Debug)))]
pub enum SourceDistExtension { pub enum SourceDistExtension {
Zip, Zip,
TarGz, TarGz,
@ -52,6 +58,12 @@ impl SourceDistExtension {
/// need the latter. /// need the latter.
#[derive(Clone, Debug, PartialEq, Eq)] #[derive(Clone, Debug, PartialEq, Eq)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))] #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(
feature = "rkyv",
derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
)]
#[cfg_attr(feature = "rkyv", archive(check_bytes))]
#[cfg_attr(feature = "rkyv", archive_attr(derive(Debug)))]
pub struct SourceDistFilename { pub struct SourceDistFilename {
pub name: PackageName, pub name: PackageName,
pub version: Version, pub version: Version,


@ -11,6 +11,12 @@ use platform_tags::{TagPriority, Tags};
use puffin_normalize::{InvalidNameError, PackageName}; use puffin_normalize::{InvalidNameError, PackageName};
#[derive(Debug, Clone, Eq, PartialEq, Hash)] #[derive(Debug, Clone, Eq, PartialEq, Hash)]
#[cfg_attr(
feature = "rkyv",
derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
)]
#[cfg_attr(feature = "rkyv", archive(check_bytes))]
#[cfg_attr(feature = "rkyv", archive_attr(derive(Debug)))]
pub struct WheelFilename { pub struct WheelFilename {
pub name: PackageName, pub name: PackageName,
pub version: Version, pub version: Version,


@ -24,10 +24,10 @@ puffin-normalize = { path = "../puffin-normalize" }
pypi-types = { path = "../pypi-types" } pypi-types = { path = "../pypi-types" }
anyhow = { workspace = true } anyhow = { workspace = true }
chrono = { workspace = true, features = ["serde"] }
data-encoding = { workspace = true } data-encoding = { workspace = true }
fs-err = { workspace = true } fs-err = { workspace = true }
once_cell = { workspace = true } once_cell = { workspace = true }
rkyv = { workspace = true, features = ["strict", "validation"] }
rustc-hash = { workspace = true } rustc-hash = { workspace = true }
serde = { workspace = true, features = ["derive"] } serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true } serde_json = { workspace = true }


@@ -1,12 +1,11 @@
 use std::fmt::{Display, Formatter};
 use std::path::PathBuf;

-use chrono::{DateTime, Utc};
 use serde::{Deserialize, Serialize};
 use thiserror::Error;

 use pep440_rs::{VersionSpecifiers, VersionSpecifiersParseError};
-use pypi_types::{BaseUrl, DistInfoMetadata, Hashes, Yanked};
+use pypi_types::{DistInfoMetadata, Hashes, Yanked};

 /// Error converting [`pypi_types::File`] to [`distribution_type::File`].
 #[derive(Debug, Error)]
@@ -18,32 +17,40 @@ pub enum FileConversionError {
 }

 /// Internal analog to [`pypi_types::File`].
-#[derive(Debug, Clone, Serialize, Deserialize)]
+#[derive(
+    Debug, Clone, Serialize, Deserialize, rkyv::Archive, rkyv::Deserialize, rkyv::Serialize,
+)]
+#[archive(check_bytes)]
+#[archive_attr(derive(Debug))]
 pub struct File {
     pub dist_info_metadata: Option<DistInfoMetadata>,
     pub filename: String,
     pub hashes: Hashes,
     pub requires_python: Option<VersionSpecifiers>,
     pub size: Option<u64>,
-    pub upload_time: Option<DateTime<Utc>>,
+    // N.B. We don't use a chrono DateTime<Utc> here because it's a little
+    // annoying to do so with rkyv. Since we only use this field for doing
+    // comparisons in testing, we just store it as a UTC timestamp in
+    // milliseconds.
+    pub upload_time_utc_ms: Option<i64>,
     pub url: FileLocation,
     pub yanked: Option<Yanked>,
 }

 impl File {
     /// `TryFrom` instead of `From` to filter out files with invalid requires python version specifiers
-    pub fn try_from(file: pypi_types::File, base: &BaseUrl) -> Result<Self, FileConversionError> {
+    pub fn try_from(file: pypi_types::File, base: &str) -> Result<Self, FileConversionError> {
         Ok(Self {
             dist_info_metadata: file.dist_info_metadata,
             filename: file.filename,
             hashes: file.hashes,
             requires_python: file.requires_python.transpose()?,
             size: file.size,
-            upload_time: file.upload_time,
+            upload_time_utc_ms: file.upload_time.map(|dt| dt.timestamp_millis()),
             url: if file.url.contains("://") {
                 FileLocation::AbsoluteUrl(file.url)
             } else {
-                FileLocation::RelativeUrl(base.clone(), file.url)
+                FileLocation::RelativeUrl(base.to_string(), file.url)
             },
             yanked: file.yanked,
         })
@@ -51,14 +58,18 @@ impl File {
 }

 /// While a registry file is generally a remote URL, it can also be a file if it comes from a directory flat indexes.
-#[derive(Debug, Clone, Serialize, Deserialize)]
+#[derive(
+    Debug, Clone, Serialize, Deserialize, rkyv::Archive, rkyv::Deserialize, rkyv::Serialize,
+)]
+#[archive(check_bytes)]
+#[archive_attr(derive(Debug))]
 pub enum FileLocation {
     /// URL relative to the base URL.
-    RelativeUrl(BaseUrl, String),
+    RelativeUrl(String, String),
     /// Absolute URL.
     AbsoluteUrl(String),
     /// Absolute path to a file.
-    Path(PathBuf),
+    Path(#[with(rkyv::with::AsString)] PathBuf),
 }

 impl Display for FileLocation {
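
The `upload_time_utc_ms` change above swaps a rich date-time type for a plain `i64`, which rkyv can archive as-is. As a rough sketch of why that is enough when the field is only compared, the snippet below does the same kind of conversion using only the standard library. `SystemTime` is a stand-in for chrono's `DateTime<Utc>`, and `to_utc_ms` and `from_utc_ms` are hypothetical helper names that do not appear in the PR.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Convert a point in time to whole milliseconds since the Unix epoch (UTC).
/// Times before the epoch become negative; sub-millisecond precision is dropped.
pub fn to_utc_ms(t: SystemTime) -> i64 {
    match t.duration_since(UNIX_EPOCH) {
        Ok(after) => after.as_millis() as i64,
        Err(before) => -(before.duration().as_millis() as i64),
    }
}

/// Inverse of `to_utc_ms`, up to the dropped sub-millisecond precision.
pub fn from_utc_ms(ms: i64) -> SystemTime {
    if ms >= 0 {
        UNIX_EPOCH + Duration::from_millis(ms as u64)
    } else {
        UNIX_EPOCH - Duration::from_millis(ms.unsigned_abs())
    }
}
```

Because `Option<i64>` already implements `Ord`, comparisons on the stored field behave the same way as comparisons on the original timestamps, with `None` sorting before any `Some`.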


@@ -705,6 +705,16 @@ impl Identifier for &str {
     }
 }

+impl Identifier for (&str, &str) {
+    fn distribution_id(&self) -> DistributionId {
+        DistributionId::new(cache_key::digest(&self))
+    }
+
+    fn resource_id(&self) -> ResourceId {
+        ResourceId::new(cache_key::digest(&self))
+    }
+}
+
 impl Identifier for (&Url, &str) {
     fn distribution_id(&self) -> DistributionId {
         DistributionId::new(cache_key::digest(&self))
@@ -718,7 +728,7 @@ impl Identifier for (&Url, &str) {
 impl Identifier for FileLocation {
     fn distribution_id(&self) -> DistributionId {
         match self {
-            FileLocation::RelativeUrl(base, url) => (base.as_url(), url.as_str()).distribution_id(),
+            FileLocation::RelativeUrl(base, url) => (base.as_str(), url.as_str()).distribution_id(),
             FileLocation::AbsoluteUrl(url) => url.distribution_id(),
             FileLocation::Path(path) => path.distribution_id(),
         }
@@ -726,7 +736,7 @@ impl Identifier for FileLocation {

     fn resource_id(&self) -> ResourceId {
         match self {
-            FileLocation::RelativeUrl(base, url) => (base.as_url(), url.as_str()).resource_id(),
+            FileLocation::RelativeUrl(base, url) => (base.as_str(), url.as_str()).resource_id(),
             FileLocation::AbsoluteUrl(url) => url.resource_id(),
             FileLocation::Path(path) => path.resource_id(),
         }
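
Since `FileLocation::RelativeUrl` now holds two plain strings, its identifier is computed from a `(&str, &str)` pair. The sketch below shows the shape of that delegation in a self-contained form. It is deliberately simplified and makes several assumptions: `digest` is a hypothetical stand-in for `cache_key::digest` built on the standard library's `DefaultHasher`, the trait returns a `String` instead of the real `DistributionId`/`ResourceId` newtypes, and only two of the enum's variants are modeled.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for `cache_key::digest`; any deterministic hash
/// of a `Hash` value is enough for this illustration.
fn digest<T: Hash>(value: &T) -> String {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Simplified form of the trait: the real one returns newtype IDs.
pub trait Identifier {
    fn distribution_id(&self) -> String;
}

impl Identifier for &str {
    fn distribution_id(&self) -> String {
        digest(self)
    }
}

/// The impl added in this PR, in sketch form: hash the pair as a unit.
impl Identifier for (&str, &str) {
    fn distribution_id(&self) -> String {
        digest(self)
    }
}

pub enum FileLocation {
    RelativeUrl(String, String),
    AbsoluteUrl(String),
}

impl Identifier for FileLocation {
    fn distribution_id(&self) -> String {
        match self {
            // Borrow both strings and delegate to the tuple impl.
            FileLocation::RelativeUrl(base, url) => {
                (base.as_str(), url.as_str()).distribution_id()
            }
            FileLocation::AbsoluteUrl(url) => url.as_str().distribution_id(),
        }
    }
}
```

Hashing the pair as a tuple, rather than a concatenation, keeps `("ab", "c")` and `("a", "bc")` distinct, because each `str` contributes its own terminator to the hash stream.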


@ -21,6 +21,7 @@ once_cell = { workspace = true }
pubgrub = { workspace = true, optional = true } pubgrub = { workspace = true, optional = true }
pyo3 = { workspace = true, optional = true, features = ["extension-module", "abi3-py37"] } pyo3 = { workspace = true, optional = true, features = ["extension-module", "abi3-py37"] }
serde = { workspace = true, features = ["derive"], optional = true } serde = { workspace = true, features = ["derive"], optional = true }
rkyv = { workspace = true, features = ["strict", "validation"], optional = true }
tracing = { workspace = true, optional = true } tracing = { workspace = true, optional = true }
unicode-width = { workspace = true } unicode-width = { workspace = true }
unscanny = { workspace = true } unscanny = { workspace = true }


@@ -38,8 +38,8 @@
 pub use version::PyVersion;
 pub use {
     version::{
-        LocalSegment, Operator, OperatorParseError, PreRelease, Version, VersionParseError,
-        VersionPattern, VersionPatternParseError, MIN_VERSION,
+        LocalSegment, Operator, OperatorParseError, PreRelease, PreReleaseKind, Version,
+        VersionParseError, VersionPattern, VersionPatternParseError, MIN_VERSION,
     },
     version_specifier::{
         parse_version_specifiers, VersionSpecifier, VersionSpecifiers, VersionSpecifiersParseError,


@ -16,6 +16,15 @@ use serde::{de, Deserialize, Deserializer, Serialize, Serializer};
/// One of `~=` `==` `!=` `<=` `>=` `<` `>` `===` /// One of `~=` `==` `!=` `<=` `>=` `<` `>` `===`
#[derive(Eq, PartialEq, Debug, Hash, Clone, Copy)] #[derive(Eq, PartialEq, Debug, Hash, Clone, Copy)]
#[cfg_attr(
feature = "rkyv",
derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
)]
#[cfg_attr(feature = "rkyv", archive(check_bytes))]
#[cfg_attr(
feature = "rkyv",
archive_attr(derive(Debug, Eq, PartialEq, PartialOrd, Ord))
)]
#[cfg_attr(feature = "pyo3", pyclass)] #[cfg_attr(feature = "pyo3", pyclass)]
pub enum Operator { pub enum Operator {
/// `== 1.2.3` /// `== 1.2.3`
@ -240,11 +249,29 @@ impl std::fmt::Display for OperatorParseError {
/// let version = Version::from_str("1.19").unwrap(); /// let version = Version::from_str("1.19").unwrap();
/// ``` /// ```
#[derive(Clone)] #[derive(Clone)]
#[cfg_attr(
feature = "rkyv",
derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
)]
#[cfg_attr(feature = "rkyv", archive(check_bytes))]
#[cfg_attr(
feature = "rkyv",
archive_attr(derive(Debug, Eq, PartialEq, PartialOrd, Ord))
)]
pub struct Version { pub struct Version {
inner: Arc<VersionInner>, inner: Arc<VersionInner>,
} }
#[derive(Clone, Debug)] #[derive(Clone, Debug)]
#[cfg_attr(
feature = "rkyv",
derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
)]
#[cfg_attr(feature = "rkyv", archive(check_bytes))]
#[cfg_attr(
feature = "rkyv",
archive_attr(derive(Debug, Eq, PartialEq, PartialOrd, Ord))
)]
enum VersionInner { enum VersionInner {
Small { small: VersionSmall }, Small { small: VersionSmall },
Full { full: VersionFull }, Full { full: VersionFull },
@@ -324,7 +351,7 @@ impl Version {

     /// Returns the pre-relase part of this version, if it exists.
     #[inline]
-    pub fn pre(&self) -> Option<(PreRelease, u64)> {
+    pub fn pre(&self) -> Option<PreRelease> {
         match *self.inner {
             VersionInner::Small { ref small } => small.pre(),
             VersionInner::Full { ref full } => full.pre,
@@ -425,7 +452,7 @@ impl Version {

     /// Set the pre-release component and return the updated version.
     #[inline]
-    pub fn with_pre(mut self, value: Option<(PreRelease, u64)>) -> Version {
+    pub fn with_pre(mut self, value: Option<PreRelease>) -> Version {
         if let VersionInner::Small { ref mut small } = Arc::make_mut(&mut self.inner) {
             if small.set_pre(value) {
                 return self;
@@ -581,7 +608,7 @@ impl std::fmt::Display for Version {
         let pre = self
             .pre()
             .as_ref()
-            .map(|(pre_kind, pre_version)| format!("{pre_kind}{pre_version}"))
+            .map(|PreRelease { kind, number }| format!("{kind}{number}"))
             .unwrap_or_default();
         let post = self
             .post()
@ -746,6 +773,15 @@ impl FromStr for Version {
/// incredibly rare. Virtually all versions have zero or one pre, dev or post /// incredibly rare. Virtually all versions have zero or one pre, dev or post
/// release components. /// release components.
#[derive(Clone, Debug)] #[derive(Clone, Debug)]
#[cfg_attr(
feature = "rkyv",
derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
)]
#[cfg_attr(feature = "rkyv", archive(check_bytes))]
#[cfg_attr(
feature = "rkyv",
archive_attr(derive(Debug, Eq, PartialEq, PartialOrd, Ord))
)]
struct VersionSmall { struct VersionSmall {
/// The representation discussed above. /// The representation discussed above.
repr: u64, repr: u64,
@@ -869,23 +905,23 @@ impl VersionSmall {
     }

     #[inline]
-    fn pre(&self) -> Option<(PreRelease, u64)> {
+    fn pre(&self) -> Option<PreRelease> {
         let v = (self.repr >> 8) & 0xFF;
         if v == 0xFF {
             return None;
         }
         let number = v & 0b0011_1111;
         let kind = match v >> 6 {
-            0 => PreRelease::Alpha,
-            1 => PreRelease::Beta,
-            2 => PreRelease::Rc,
+            0 => PreReleaseKind::Alpha,
+            1 => PreReleaseKind::Beta,
+            2 => PreReleaseKind::Rc,
             _ => unreachable!(),
         };
-        Some((kind, number))
+        Some(PreRelease { kind, number })
     }

     #[inline]
-    fn set_pre(&mut self, value: Option<(PreRelease, u64)>) -> bool {
+    fn set_pre(&mut self, value: Option<PreRelease>) -> bool {
         if value.is_some() && (self.post().is_some() || self.dev().is_some()) {
             return false;
         }
@@ -893,14 +929,14 @@ impl VersionSmall {
             None => {
                 self.repr |= 0xFF << 8;
             }
-            Some((kind, number)) => {
+            Some(PreRelease { kind, number }) => {
                 if number > 0b0011_1111 {
                     return false;
                 }
                 let kind = match kind {
-                    PreRelease::Alpha => 0,
-                    PreRelease::Beta => 1,
-                    PreRelease::Rc => 2,
+                    PreReleaseKind::Alpha => 0,
+                    PreReleaseKind::Beta => 1,
+                    PreReleaseKind::Rc => 2,
                 };
                 self.repr &= !(0xFF << 8);
                 self.repr |= ((kind << 6) | number) << 8;
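
The bit layout that `pre` and `set_pre` rely on can be hard to read from the hunks alone, so here is a standalone sketch of it. The pre-release occupies bits 8..16 of `repr`: the top two bits of that byte hold the kind, the low six bits hold the number, and `0xFF` means "no pre-release". The free functions `pack_pre` and `unpack_pre` are hypothetical names introduced for this illustration; the real code uses methods on `VersionSmall` and also refuses to set a pre-release when a post or dev component is present, which is omitted here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PreReleaseKind {
    Alpha,
    Beta,
    Rc,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PreRelease {
    pub kind: PreReleaseKind,
    pub number: u64,
}

/// Read the pre-release out of bits 8..16 of `repr`.
pub fn unpack_pre(repr: u64) -> Option<PreRelease> {
    let v = (repr >> 8) & 0xFF;
    if v == 0xFF {
        // Sentinel: every bit set means the component is absent.
        return None;
    }
    let number = v & 0b0011_1111;
    let kind = match v >> 6 {
        0 => PreReleaseKind::Alpha,
        1 => PreReleaseKind::Beta,
        2 => PreReleaseKind::Rc,
        // `pack_pre` never writes kind 3 except as part of the sentinel.
        _ => unreachable!(),
    };
    Some(PreRelease { kind, number })
}

/// Write `value` into bits 8..16 of `repr`, returning the new `repr`.
/// Returns `None` when the number does not fit in six bits.
pub fn pack_pre(repr: u64, value: Option<PreRelease>) -> Option<u64> {
    match value {
        None => Some(repr | (0xFF << 8)),
        Some(PreRelease { kind, number }) => {
            if number > 0b0011_1111 {
                return None;
            }
            let kind: u64 = match kind {
                PreReleaseKind::Alpha => 0,
                PreReleaseKind::Beta => 1,
                PreReleaseKind::Rc => 2,
            };
            // Clear the byte, then store kind and number together.
            Some((repr & !(0xFF << 8)) | (((kind << 6) | number) << 8))
        }
    }
}
```

For example, `rc3` packs to the byte `0b10_000011` (`0x83`), so starting from a zero `repr` the result is `0x8300`.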
@ -956,6 +992,15 @@ impl VersionSmall {
/// In general, the "full" representation is rarely used in practice since most /// In general, the "full" representation is rarely used in practice since most
/// versions will fit into the "small" representation. /// versions will fit into the "small" representation.
#[derive(Clone, Debug)] #[derive(Clone, Debug)]
#[cfg_attr(
feature = "rkyv",
derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
)]
#[cfg_attr(feature = "rkyv", archive(check_bytes))]
#[cfg_attr(
feature = "rkyv",
archive_attr(derive(Debug, Eq, PartialEq, PartialOrd, Ord))
)]
struct VersionFull { struct VersionFull {
/// The [versioning /// The [versioning
/// epoch](https://peps.python.org/pep-0440/#version-epochs). Normally /// epoch](https://peps.python.org/pep-0440/#version-epochs). Normally
@@ -973,7 +1018,7 @@ struct VersionFull {
     ///
     /// Note that whether this is Some influences the version range
     /// matching since normally we exclude all prerelease versions
-    pre: Option<(PreRelease, u64)>,
+    pre: Option<PreRelease>,
     /// The [Post release
     /// version](https://peps.python.org/pep-0440/#post-releases), higher
     /// post version are preferred over lower post or none-post versions
@@ -1066,12 +1111,40 @@ impl FromStr for VersionPattern {
     }
 }

+/// An optional pre-release modifier and number applied to a version.
+#[derive(PartialEq, Eq, Debug, Hash, Clone, Copy, Ord, PartialOrd)]
+#[cfg_attr(feature = "pyo3", pyclass)]
+#[cfg_attr(
+    feature = "rkyv",
+    derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
+)]
+#[cfg_attr(feature = "rkyv", archive(check_bytes))]
+#[cfg_attr(
+    feature = "rkyv",
+    archive_attr(derive(Debug, Eq, PartialEq, PartialOrd, Ord))
+)]
+pub struct PreRelease {
+    /// The kind of pre-release.
+    pub kind: PreReleaseKind,
+    /// The number associated with the pre-release.
+    pub number: u64,
+}
+
 /// Optional prerelease modifier (alpha, beta or release candidate) appended to version
 ///
 /// <https://peps.python.org/pep-0440/#pre-releases>
 #[derive(PartialEq, Eq, Debug, Hash, Clone, Copy, Ord, PartialOrd)]
 #[cfg_attr(feature = "pyo3", pyclass)]
-pub enum PreRelease {
+#[cfg_attr(
+    feature = "rkyv",
+    derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
+)]
+#[cfg_attr(feature = "rkyv", archive(check_bytes))]
+#[cfg_attr(
+    feature = "rkyv",
+    archive_attr(derive(Debug, Eq, PartialEq, PartialOrd, Ord))
+)]
+pub enum PreReleaseKind {
     /// alpha prerelease
     Alpha,
     /// beta prerelease
@@ -1080,7 +1153,7 @@ pub enum PreRelease {
     Rc,
 }

-impl std::fmt::Display for PreRelease {
+impl std::fmt::Display for PreReleaseKind {
     fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
         match self {
             Self::Alpha => write!(f, "a"),
@ -1106,6 +1179,15 @@ impl std::fmt::Display for PreRelease {
/// ///
/// Luckily the default `Ord` implementation for `Vec<LocalSegment>` matches the PEP 440 rules. /// Luckily the default `Ord` implementation for `Vec<LocalSegment>` matches the PEP 440 rules.
#[derive(Eq, PartialEq, Debug, Clone, Hash)] #[derive(Eq, PartialEq, Debug, Clone, Hash)]
#[cfg_attr(
feature = "rkyv",
derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
)]
#[cfg_attr(feature = "rkyv", archive(check_bytes))]
#[cfg_attr(
feature = "rkyv",
archive_attr(derive(Debug, Eq, PartialEq, PartialOrd, Ord))
)]
pub enum LocalSegment { pub enum LocalSegment {
/// Not-parseable as integer segment of local version /// Not-parseable as integer segment of local version
String(String), String(String),
@ -1160,7 +1242,7 @@ struct Parser<'a> {
/// The release numbers extracted from the version. /// The release numbers extracted from the version.
release: ReleaseNumbers, release: ReleaseNumbers,
/// The pre-release version, if any. /// The pre-release version, if any.
pre: Option<(PreRelease, u64)>, pre: Option<PreRelease>,
/// The post-release version, if any. /// The post-release version, if any.
post: Option<u64>, post: Option<u64>,
/// The dev release, if any. /// The dev release, if any.
@ -1384,15 +1466,15 @@ impl<'a> Parser<'a> {
// since the strings are matched in order. // since the strings are matched in order.
const SPELLINGS: StringSet = const SPELLINGS: StringSet =
StringSet::new(&["alpha", "beta", "preview", "pre", "rc", "a", "b", "c"]); StringSet::new(&["alpha", "beta", "preview", "pre", "rc", "a", "b", "c"]);
const MAP: &[PreRelease] = &[ const MAP: &[PreReleaseKind] = &[
PreRelease::Alpha, PreReleaseKind::Alpha,
PreRelease::Beta, PreReleaseKind::Beta,
PreRelease::Rc, PreReleaseKind::Rc,
PreRelease::Rc, PreReleaseKind::Rc,
PreRelease::Rc, PreReleaseKind::Rc,
PreRelease::Alpha, PreReleaseKind::Alpha,
PreRelease::Beta, PreReleaseKind::Beta,
PreRelease::Rc, PreReleaseKind::Rc,
]; ];
let oldpos = self.i; let oldpos = self.i;
@ -1410,7 +1492,7 @@ impl<'a> Parser<'a> {
// Under the normalization rules, a pre-release without an // Under the normalization rules, a pre-release without an
// explicit number defaults to `0`. // explicit number defaults to `0`.
let number = self.parse_number()?.unwrap_or(0); let number = self.parse_number()?.unwrap_or(0);
self.pre = Some((kind, number)); self.pre = Some(PreRelease { kind, number });
Ok(()) Ok(())
} }
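
The hunk above keeps `SPELLINGS` and `MAP` as parallel tables and relies on the spellings being tried in order, so longer spellings such as `preview` must come before their prefixes such as `pre`. A minimal std-only sketch of that idea follows. The `strip_pre_kind` helper and the plain arrays are assumptions made for illustration; the real parser uses its own `StringSet` type, whose API is not shown in this diff.

```rust
/// Stand-in for the `PreReleaseKind` enum introduced in this diff.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PreReleaseKind {
    Alpha,
    Beta,
    Rc,
}

/// Same spellings, in the same order, as the `SPELLINGS` table in the hunk.
const SPELLINGS: [&str; 8] = ["alpha", "beta", "preview", "pre", "rc", "a", "b", "c"];

/// Parallel table: `MAP[i]` is the kind for `SPELLINGS[i]`.
const MAP: [PreReleaseKind; 8] = [
    PreReleaseKind::Alpha,
    PreReleaseKind::Beta,
    PreReleaseKind::Rc,
    PreReleaseKind::Rc,
    PreReleaseKind::Rc,
    PreReleaseKind::Alpha,
    PreReleaseKind::Beta,
    PreReleaseKind::Rc,
];

/// Hypothetical helper (not part of the diff): return the kind of the first
/// spelling that is a case-insensitive prefix of `input`, plus the remainder.
pub fn strip_pre_kind(input: &str) -> Option<(PreReleaseKind, &str)> {
    SPELLINGS.iter().zip(MAP.iter()).find_map(|(&spelling, &kind)| {
        // `get` yields `None` when `input` is shorter than the spelling.
        let head = input.get(..spelling.len())?;
        head.eq_ignore_ascii_case(spelling)
            .then(|| (kind, &input[spelling.len()..]))
    })
}
```

If `"pre"` were listed before `"preview"`, the input `preview1` would match `pre` and leave `view1` behind, which is why the table order matters.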
@ -1991,7 +2073,7 @@ impl PyVersion {
/// Note that whether this is Some influences the version /// Note that whether this is Some influences the version
/// range matching since normally we exclude all prerelease versions /// range matching since normally we exclude all prerelease versions
#[getter] #[getter]
pub fn pre(&self) -> Option<(PreRelease, u64)> { pub fn pre(&self) -> Option<PreRelease> {
self.0.pre() self.0.pre()
} }
/// The [Post release version](https://peps.python.org/pep-0440/#post-releases), /// The [Post release version](https://peps.python.org/pep-0440/#post-releases),
@ -2134,17 +2216,32 @@ fn sortable_tuple(version: &Version) -> (u64, u64, Option<u64>, u64, &[LocalSegm
// dev release // dev release
(None, None, Some(n)) => (0, 0, None, n, version.local()), (None, None, Some(n)) => (0, 0, None, n, version.local()),
// alpha release // alpha release
(Some((PreRelease::Alpha, n)), post, dev) => { (
(1, n, post, dev.unwrap_or(u64::MAX), version.local()) Some(PreRelease {
} kind: PreReleaseKind::Alpha,
number: n,
}),
post,
dev,
) => (1, n, post, dev.unwrap_or(u64::MAX), version.local()),
// beta release // beta release
(Some((PreRelease::Beta, n)), post, dev) => { (
(2, n, post, dev.unwrap_or(u64::MAX), version.local()) Some(PreRelease {
} kind: PreReleaseKind::Beta,
number: n,
}),
post,
dev,
) => (2, n, post, dev.unwrap_or(u64::MAX), version.local()),
// release candidate // release candidate
(Some((PreRelease::Rc, n)), post, dev) => { (
(3, n, post, dev.unwrap_or(u64::MAX), version.local()) Some(PreRelease {
} kind: PreReleaseKind::Rc,
number: n,
}),
post,
dev,
) => (3, n, post, dev.unwrap_or(u64::MAX), version.local()),
// final release // final release
(None, None, None) => (4, 0, None, 0, version.local()), (None, None, None) => (4, 0, None, 0, version.local()),
// post release // post release
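
The arms visible in this hunk map each version to a tuple whose natural ordering gives the PEP 440 ordering: a leading rank (0 for dev-only, 1 to 3 for alpha, beta and rc, 4 for final), then the pre-release number, the post number, and the dev number with `u64::MAX` standing in for "no dev segment". The sketch below reproduces only those visible arms. The `sort_key` name is an assumption, the local-segment component is omitted, and the arms that fall outside this hunk are deliberately returned as `None` rather than guessed.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PreReleaseKind {
    Alpha,
    Beta,
    Rc,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PreRelease {
    pub kind: PreReleaseKind,
    pub number: u64,
}

/// Hypothetical, simplified counterpart of `sortable_tuple`.
pub fn sort_key(
    pre: Option<PreRelease>,
    post: Option<u64>,
    dev: Option<u64>,
) -> Option<(u64, u64, Option<u64>, u64)> {
    match (pre, post, dev) {
        // dev release
        (None, None, Some(n)) => Some((0, 0, None, n)),
        // alpha, beta and release candidate share one shape and differ in rank
        (Some(PreRelease { kind, number }), post, dev) => {
            let rank = match kind {
                PreReleaseKind::Alpha => 1,
                PreReleaseKind::Beta => 2,
                PreReleaseKind::Rc => 3,
            };
            Some((rank, number, post, dev.unwrap_or(u64::MAX)))
        }
        // final release
        (None, None, None) => Some((4, 0, None, 0)),
        // post-release arms are not visible in this hunk
        _ => None,
    }
}
```

Using `u64::MAX` for a missing dev segment makes `1.0a2.dev456` sort before `1.0a2`, since a dev release of a pre-release precedes the pre-release itself.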
@ -2236,70 +2333,109 @@ mod tests {
("1.0.dev456", Version::new([1, 0]).with_dev(Some(456))), ("1.0.dev456", Version::new([1, 0]).with_dev(Some(456))),
( (
"1.0a1", "1.0a1",
Version::new([1, 0]).with_pre(Some((PreRelease::Alpha, 1))), Version::new([1, 0]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 1,
})),
), ),
( (
"1.0a2.dev456", "1.0a2.dev456",
Version::new([1, 0]) Version::new([1, 0])
.with_pre(Some((PreRelease::Alpha, 2))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 2,
}))
.with_dev(Some(456)), .with_dev(Some(456)),
), ),
( (
"1.0a12.dev456", "1.0a12.dev456",
Version::new([1, 0]) Version::new([1, 0])
.with_pre(Some((PreRelease::Alpha, 12))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 12,
}))
.with_dev(Some(456)), .with_dev(Some(456)),
), ),
( (
"1.0a12", "1.0a12",
Version::new([1, 0]).with_pre(Some((PreRelease::Alpha, 12))), Version::new([1, 0]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 12,
})),
), ),
( (
"1.0b1.dev456", "1.0b1.dev456",
Version::new([1, 0]) Version::new([1, 0])
.with_pre(Some((PreRelease::Beta, 1))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 1,
}))
.with_dev(Some(456)), .with_dev(Some(456)),
), ),
( (
"1.0b2", "1.0b2",
Version::new([1, 0]).with_pre(Some((PreRelease::Beta, 2))), Version::new([1, 0]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 2,
})),
), ),
( (
"1.0b2.post345.dev456", "1.0b2.post345.dev456",
Version::new([1, 0]) Version::new([1, 0])
.with_pre(Some((PreRelease::Beta, 2))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 2,
}))
.with_dev(Some(456)) .with_dev(Some(456))
.with_post(Some(345)), .with_post(Some(345)),
), ),
( (
"1.0b2.post345", "1.0b2.post345",
Version::new([1, 0]) Version::new([1, 0])
.with_pre(Some((PreRelease::Beta, 2))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 2,
}))
.with_post(Some(345)), .with_post(Some(345)),
), ),
( (
"1.0b2-346", "1.0b2-346",
Version::new([1, 0]) Version::new([1, 0])
.with_pre(Some((PreRelease::Beta, 2))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 2,
}))
.with_post(Some(346)), .with_post(Some(346)),
), ),
( (
"1.0c1.dev456", "1.0c1.dev456",
Version::new([1, 0]) Version::new([1, 0])
.with_pre(Some((PreRelease::Rc, 1))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 1,
}))
.with_dev(Some(456)), .with_dev(Some(456)),
), ),
( (
"1.0c1", "1.0c1",
Version::new([1, 0]).with_pre(Some((PreRelease::Rc, 1))), Version::new([1, 0]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 1,
})),
), ),
( (
"1.0rc2", "1.0rc2",
Version::new([1, 0]).with_pre(Some((PreRelease::Rc, 2))), Version::new([1, 0]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 2,
})),
), ),
( (
"1.0c3", "1.0c3",
Version::new([1, 0]).with_pre(Some((PreRelease::Rc, 3))), Version::new([1, 0]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 3,
})),
), ),
("1.0", Version::new([1, 0])), ("1.0", Version::new([1, 0])),
( (
@ -2362,46 +2498,67 @@ mod tests {
"1!1.0a1", "1!1.0a1",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Alpha, 1))), .with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 1,
})),
), ),
( (
"1!1.0a2.dev456", "1!1.0a2.dev456",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Alpha, 2))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 2,
}))
.with_dev(Some(456)), .with_dev(Some(456)),
), ),
( (
"1!1.0a12.dev456", "1!1.0a12.dev456",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Alpha, 12))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 12,
}))
.with_dev(Some(456)), .with_dev(Some(456)),
), ),
( (
"1!1.0a12", "1!1.0a12",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Alpha, 12))), .with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 12,
})),
), ),
( (
"1!1.0b1.dev456", "1!1.0b1.dev456",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Beta, 1))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 1,
}))
.with_dev(Some(456)), .with_dev(Some(456)),
), ),
( (
"1!1.0b2", "1!1.0b2",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Beta, 2))), .with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 2,
})),
), ),
( (
"1!1.0b2.post345.dev456", "1!1.0b2.post345.dev456",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Beta, 2))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 2,
}))
.with_post(Some(345)) .with_post(Some(345))
.with_dev(Some(456)), .with_dev(Some(456)),
), ),
@ -2409,40 +2566,58 @@ mod tests {
"1!1.0b2.post345", "1!1.0b2.post345",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Beta, 2))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 2,
}))
.with_post(Some(345)), .with_post(Some(345)),
), ),
( (
"1!1.0b2-346", "1!1.0b2-346",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Beta, 2))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 2,
}))
.with_post(Some(346)), .with_post(Some(346)),
), ),
( (
"1!1.0c1.dev456", "1!1.0c1.dev456",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Rc, 1))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 1,
}))
.with_dev(Some(456)), .with_dev(Some(456)),
), ),
( (
"1!1.0c1", "1!1.0c1",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Rc, 1))), .with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 1,
})),
), ),
( (
"1!1.0rc2", "1!1.0rc2",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Rc, 2))), .with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 2,
})),
), ),
( (
"1!1.0c3", "1!1.0c3",
Version::new([1, 0]) Version::new([1, 0])
.with_epoch(1) .with_epoch(1)
.with_pre(Some((PreRelease::Rc, 3))), .with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 3,
})),
), ),
("1!1.0", Version::new([1, 0]).with_epoch(1)), ("1!1.0", Version::new([1, 0]).with_epoch(1)),
( (
@ -2812,7 +2987,10 @@ mod tests {
assert_eq!( assert_eq!(
p("1.0a1.*").unwrap_err(), p("1.0a1.*").unwrap_err(),
ErrorKind::UnexpectedEnd { ErrorKind::UnexpectedEnd {
version: Version::new([1, 0]).with_pre(Some((PreRelease::Alpha, 1))), version: Version::new([1, 0]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 1
})),
remaining: ".*".to_string() remaining: ".*".to_string()
} }
.into(), .into(),
@ -2858,79 +3036,136 @@ mod tests {
// pre-release tests // pre-release tests
assert_eq!( assert_eq!(
p("5a1"), p("5a1"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 1))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 1
}))
); );
assert_eq!( assert_eq!(
p("5alpha1"), p("5alpha1"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 1))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 1
}))
); );
assert_eq!( assert_eq!(
p("5b1"), p("5b1"),
Version::new([5]).with_pre(Some((PreRelease::Beta, 1))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 1
}))
); );
assert_eq!( assert_eq!(
p("5beta1"), p("5beta1"),
Version::new([5]).with_pre(Some((PreRelease::Beta, 1))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Beta,
number: 1
}))
); );
assert_eq!( assert_eq!(
p("5rc1"), p("5rc1"),
Version::new([5]).with_pre(Some((PreRelease::Rc, 1))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 1
}))
); );
assert_eq!( assert_eq!(
p("5c1"), p("5c1"),
Version::new([5]).with_pre(Some((PreRelease::Rc, 1))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 1
}))
); );
assert_eq!( assert_eq!(
p("5preview1"), p("5preview1"),
Version::new([5]).with_pre(Some((PreRelease::Rc, 1))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 1
}))
); );
assert_eq!( assert_eq!(
p("5pre1"), p("5pre1"),
Version::new([5]).with_pre(Some((PreRelease::Rc, 1))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 1
}))
); );
assert_eq!( assert_eq!(
p("5.6.7pre1"), p("5.6.7pre1"),
Version::new([5, 6, 7]).with_pre(Some((PreRelease::Rc, 1))) Version::new([5, 6, 7]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Rc,
number: 1
}))
); );
assert_eq!( assert_eq!(
p("5alpha789"), p("5alpha789"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 789))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 789
}))
); );
assert_eq!( assert_eq!(
p("5.alpha789"), p("5.alpha789"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 789))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 789
}))
); );
assert_eq!( assert_eq!(
p("5-alpha789"), p("5-alpha789"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 789))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 789
}))
); );
assert_eq!( assert_eq!(
p("5_alpha789"), p("5_alpha789"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 789))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 789
}))
); );
assert_eq!( assert_eq!(
p("5alpha.789"), p("5alpha.789"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 789))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 789
}))
); );
assert_eq!( assert_eq!(
p("5alpha-789"), p("5alpha-789"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 789))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 789
}))
); );
assert_eq!( assert_eq!(
p("5alpha_789"), p("5alpha_789"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 789))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 789
}))
); );
assert_eq!( assert_eq!(
p("5ALPHA789"), p("5ALPHA789"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 789))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 789
}))
); );
assert_eq!( assert_eq!(
p("5aLpHa789"), p("5aLpHa789"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 789))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 789
}))
); );
assert_eq!( assert_eq!(
p("5alpha"), p("5alpha"),
Version::new([5]).with_pre(Some((PreRelease::Alpha, 0))) Version::new([5]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 0
}))
); );
// post-release tests // post-release tests
@ -3048,19 +3283,28 @@ mod tests {
assert_eq!( assert_eq!(
p("5a2post3"), p("5a2post3"),
Version::new([5]) Version::new([5])
.with_pre(Some((PreRelease::Alpha, 2))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 2
}))
.with_post(Some(3)) .with_post(Some(3))
); );
assert_eq!( assert_eq!(
p("5.a-2_post-3"), p("5.a-2_post-3"),
Version::new([5]) Version::new([5])
.with_pre(Some((PreRelease::Alpha, 2))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 2
}))
.with_post(Some(3)) .with_post(Some(3))
); );
assert_eq!( assert_eq!(
p("5a2-3"), p("5a2-3"),
Version::new([5]) Version::new([5])
.with_pre(Some((PreRelease::Alpha, 2))) .with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 2
}))
.with_post(Some(3)) .with_post(Some(3))
); );
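
Throughout this file the pre-release changes from a `(PreRelease, u64)` tuple to a named `PreRelease { kind, number }` struct that derives `Ord`. A derived `Ord` on a struct compares fields in declaration order, so with `kind` declared before `number` the struct orders the same way the tuple did. The sketch below shows this with std only. The rkyv and pyo3 attributes are left out, and `from_tuple` is a hypothetical migration helper that does not appear in the diff.

```rust
/// Variant order (Alpha < Beta < Rc) drives the derived ordering.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub enum PreReleaseKind {
    Alpha,
    Beta,
    Rc,
}

/// Field order matters: `kind` is compared first, then `number`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct PreRelease {
    pub kind: PreReleaseKind,
    pub number: u64,
}

/// Hypothetical helper for code that still holds the old tuple shape.
pub fn from_tuple((kind, number): (PreReleaseKind, u64)) -> PreRelease {
    PreRelease { kind, number }
}
```

With these definitions, `a9` sorts before `b0` because the kind is compared before the number, which matches the ordering of the old `(kind, number)` tuple.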


@ -36,6 +36,12 @@ use crate::{
/// assert_eq!(version_specifiers.iter().position(|specifier| *specifier.operator() == Operator::LessThan), Some(1)); /// assert_eq!(version_specifiers.iter().position(|specifier| *specifier.operator() == Operator::LessThan), Some(1));
/// ``` /// ```
#[derive(Eq, PartialEq, Debug, Clone, Hash)] #[derive(Eq, PartialEq, Debug, Clone, Hash)]
#[cfg_attr(
feature = "rkyv",
derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
)]
#[cfg_attr(feature = "rkyv", archive(check_bytes))]
#[cfg_attr(feature = "rkyv", archive_attr(derive(Debug)))]
#[cfg_attr(feature = "pyo3", pyclass(sequence))] #[cfg_attr(feature = "pyo3", pyclass(sequence))]
pub struct VersionSpecifiers(Vec<VersionSpecifier>); pub struct VersionSpecifiers(Vec<VersionSpecifier>);
@ -240,6 +246,12 @@ impl std::error::Error for VersionSpecifiersParseError {}
/// assert!(version_specifier.contains(&version)); /// assert!(version_specifier.contains(&version));
/// ``` /// ```
#[derive(Eq, PartialEq, Debug, Clone, Hash)] #[derive(Eq, PartialEq, Debug, Clone, Hash)]
#[cfg_attr(
feature = "rkyv",
derive(rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)
)]
#[cfg_attr(feature = "rkyv", archive(check_bytes))]
#[cfg_attr(feature = "rkyv", archive_attr(derive(Debug)))]
#[cfg_attr(feature = "pyo3", pyclass(get_all))] #[cfg_attr(feature = "pyo3", pyclass(get_all))]
pub struct VersionSpecifier { pub struct VersionSpecifier {
/// ~=|==|!=|<=|>=|<|>|===, plus whether the version ended with a star /// ~=|==|!=|<=|>=|<|>|===, plus whether the version ended with a star
@ -727,7 +739,7 @@ mod tests {
use indoc::indoc; use indoc::indoc;
use crate::{LocalSegment, PreRelease}; use crate::{LocalSegment, PreRelease, PreReleaseKind};
use super::*; use super::*;
@ -1436,7 +1448,10 @@ mod tests {
"==2.0a1.*", "==2.0a1.*",
ParseErrorKind::InvalidVersion( ParseErrorKind::InvalidVersion(
version::ErrorKind::UnexpectedEnd { version::ErrorKind::UnexpectedEnd {
version: Version::new([2, 0]).with_pre(Some((PreRelease::Alpha, 1))), version: Version::new([2, 0]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 1,
})),
remaining: ".*".to_string(), remaining: ".*".to_string(),
} }
.into(), .into(),
@ -1447,7 +1462,10 @@ mod tests {
"!=2.0a1.*", "!=2.0a1.*",
ParseErrorKind::InvalidVersion( ParseErrorKind::InvalidVersion(
version::ErrorKind::UnexpectedEnd { version::ErrorKind::UnexpectedEnd {
version: Version::new([2, 0]).with_pre(Some((PreRelease::Alpha, 1))), version: Version::new([2, 0]).with_pre(Some(PreRelease {
kind: PreReleaseKind::Alpha,
number: 1,
})),
remaining: ".*".to_string(), remaining: ".*".to_string(),
} }
.into(), .into(),


@ -25,6 +25,7 @@ once_cell = { workspace = true }
pyo3 = { workspace = true, optional = true, features = ["abi3", "extension-module"] } pyo3 = { workspace = true, optional = true, features = ["abi3", "extension-module"] }
pyo3-log = { workspace = true, optional = true } pyo3-log = { workspace = true, optional = true }
regex = { workspace = true } regex = { workspace = true }
rkyv = { workspace = true, features = ["strict"], optional = true }
serde = { workspace = true, features = ["derive"], optional = true } serde = { workspace = true, features = ["derive"], optional = true }
serde_json = { workspace = true, optional = true } serde_json = { workspace = true, optional = true }
thiserror = { workspace = true } thiserror = { workspace = true }
@ -40,5 +41,6 @@ testing_logger = { version = "0.1.1" }
[features] [features]
pyo3 = ["dep:pyo3", "pep440_rs/pyo3", "pyo3-log"] pyo3 = ["dep:pyo3", "pep440_rs/pyo3", "pyo3-log"]
rkyv = ["dep:rkyv", "pep440_rs/rkyv"]
serde = ["dep:serde", "pep440_rs/serde"] serde = ["dep:serde", "pep440_rs/serde"]
default = [] default = []


@ -11,7 +11,7 @@
//! let marker = r#"requests [security,tests] >= 2.8.1, == 2.8.* ; python_version > "3.8""#; //! let marker = r#"requests [security,tests] >= 2.8.1, == 2.8.* ; python_version > "3.8""#;
//! let dependency_specification = Requirement::from_str(marker).unwrap(); //! let dependency_specification = Requirement::from_str(marker).unwrap();
//! assert_eq!(dependency_specification.name.as_ref(), "requests"); //! assert_eq!(dependency_specification.name.as_ref(), "requests");
//! assert_eq!(dependency_specification.extras, Some(vec![ExtraName::from_str("security").unwrap(), ExtraName::from_str("tests").unwrap()])); //! assert_eq!(dependency_specification.extras, vec![ExtraName::from_str("security").unwrap(), ExtraName::from_str("tests").unwrap()]);
//! ``` //! ```
#![deny(missing_docs)] #![deny(missing_docs)]


@ -5,7 +5,7 @@ edition = "2021"
[dependencies] [dependencies]
cache-key = { path = "../cache-key" } cache-key = { path = "../cache-key" }
distribution-filename = { path = "../distribution-filename", features = ["serde"] } distribution-filename = { path = "../distribution-filename", features = ["rkyv", "serde"] }
distribution-types = { path = "../distribution-types" } distribution-types = { path = "../distribution-types" }
install-wheel-rs = { path = "../install-wheel-rs" } install-wheel-rs = { path = "../install-wheel-rs" }
pep440_rs = { path = "../pep440-rs" } pep440_rs = { path = "../pep440-rs" }
@ -27,6 +27,7 @@ http-cache-semantics = { workspace = true }
reqwest = { workspace = true } reqwest = { workspace = true }
reqwest-middleware = { workspace = true } reqwest-middleware = { workspace = true }
reqwest-retry = { workspace = true } reqwest-retry = { workspace = true }
rkyv = { workspace = true, features = ["strict", "validation"] }
rmp-serde = { workspace = true } rmp-serde = { workspace = true }
rustc-hash = { workspace = true } rustc-hash = { workspace = true }
serde = { workspace = true } serde = { workspace = true }


@ -41,6 +41,10 @@ pub enum ErrorKind {
#[error(transparent)] #[error(transparent)]
UrlParseError(#[from] url::ParseError), UrlParseError(#[from] url::ParseError),
/// A base URL could not be joined with a possibly relative URL.
#[error(transparent)]
JoinRelativeError(#[from] pypi_types::JoinRelativeError),
/// Dist-info error /// Dist-info error
#[error(transparent)] #[error(transparent)]
InstallWheel(#[from] install_wheel_rs::Error), InstallWheel(#[from] install_wheel_rs::Error),


@ -116,7 +116,7 @@ impl<'a> FlatIndexClient<'a> {
let files: Vec<File> = files let files: Vec<File> = files
.into_iter() .into_iter()
.filter_map(|file| { .filter_map(|file| {
match File::try_from(file, &base) { match File::try_from(file, base.as_url().as_str()) {
Ok(file) => Some(file), Ok(file) => Some(file),
Err(err) => { Err(err) => {
// Ignore files with unparseable version specifiers. // Ignore files with unparseable version specifiers.
@ -178,7 +178,7 @@ impl<'a> FlatIndexClient<'a> {
hashes: Hashes { sha256: None }, hashes: Hashes { sha256: None },
requires_python: None, requires_python: None,
size: None, size: None,
upload_time: None, upload_time_utc_ms: None,
url: FileLocation::Path(entry.path().to_path_buf()), url: FileLocation::Path(entry.path().to_path_buf()),
yanked: None, yanked: None,
}; };


@ -2,7 +2,8 @@ pub use cached_client::{CacheControl, CachedClient, CachedClientError, DataWithC
pub use error::{Error, ErrorKind}; pub use error::{Error, ErrorKind};
pub use flat_index::{FlatDistributions, FlatIndex, FlatIndexClient, FlatIndexError}; pub use flat_index::{FlatDistributions, FlatIndex, FlatIndexClient, FlatIndexError};
pub use registry_client::{ pub use registry_client::{
read_metadata_async, RegistryClient, RegistryClientBuilder, SimpleMetadata, VersionFiles, read_metadata_async, RegistryClient, RegistryClientBuilder, SimpleMetadata, SimpleMetadatum,
VersionFiles,
}; };
mod cache_headers; mod cache_headers;


@ -22,7 +22,7 @@ use install_wheel_rs::find_dist_info;
use pep440_rs::Version; use pep440_rs::Version;
use puffin_cache::{Cache, CacheBucket, WheelCache}; use puffin_cache::{Cache, CacheBucket, WheelCache};
use puffin_normalize::PackageName; use puffin_normalize::PackageName;
use pypi_types::{BaseUrl, Metadata21, SimpleJson}; use pypi_types::{Metadata21, SimpleJson};
use crate::cached_client::CacheControl; use crate::cached_client::CacheControl;
use crate::html::SimpleHtml; use crate::html::SimpleHtml;
@ -206,15 +206,16 @@ impl RegistryClient {
let bytes = response.bytes().await.map_err(ErrorKind::RequestError)?; let bytes = response.bytes().await.map_err(ErrorKind::RequestError)?;
let data: SimpleJson = serde_json::from_slice(bytes.as_ref()) let data: SimpleJson = serde_json::from_slice(bytes.as_ref())
.map_err(|err| Error::from_json_err(err, url.clone()))?; .map_err(|err| Error::from_json_err(err, url.clone()))?;
let base = BaseUrl::from(url.clone()); let metadata =
let metadata = SimpleMetadata::from_files(data.files, package_name, &base); SimpleMetadata::from_files(data.files, package_name, url.as_str());
Ok(metadata) Ok(metadata)
} }
MediaType::Html => { MediaType::Html => {
let text = response.text().await.map_err(ErrorKind::RequestError)?; let text = response.text().await.map_err(ErrorKind::RequestError)?;
let SimpleHtml { base, files } = SimpleHtml::parse(&text, &url) let SimpleHtml { base, files } = SimpleHtml::parse(&text, &url)
.map_err(|err| Error::from_html_err(err, url.clone()))?; .map_err(|err| Error::from_html_err(err, url.clone()))?;
let metadata = SimpleMetadata::from_files(files, package_name, &base); let metadata =
SimpleMetadata::from_files(files, package_name, base.as_url().as_str());
Ok(metadata) Ok(metadata)
} }
} }
@ -245,7 +246,8 @@ impl RegistryClient {
let metadata = match &built_dist { let metadata = match &built_dist {
BuiltDist::Registry(wheel) => match &wheel.file.url { BuiltDist::Registry(wheel) => match &wheel.file.url {
FileLocation::RelativeUrl(base, url) => { FileLocation::RelativeUrl(base, url) => {
let url = base.join_relative(url).map_err(ErrorKind::UrlParseError)?; let url = pypi_types::base_url_join_relative(base, url)
.map_err(ErrorKind::JoinRelativeError)?;
self.wheel_metadata_registry(&wheel.index, &wheel.file, &url) self.wheel_metadata_registry(&wheel.index, &wheel.file, &url)
.await? .await?
} }
@ -494,46 +496,78 @@ pub async fn read_metadata_async(
Ok(metadata) Ok(metadata)
} }
#[derive(Default, Debug, Serialize, Deserialize)] #[derive(
Default, Debug, Serialize, Deserialize, rkyv::Archive, rkyv::Deserialize, rkyv::Serialize,
)]
#[archive(check_bytes)]
#[archive_attr(derive(Debug))]
pub struct VersionFiles { pub struct VersionFiles {
pub wheels: Vec<(WheelFilename, File)>, pub wheels: Vec<VersionWheel>,
pub source_dists: Vec<(SourceDistFilename, File)>, pub source_dists: Vec<VersionSourceDist>,
} }
impl VersionFiles { impl VersionFiles {
fn push(&mut self, filename: DistFilename, file: File) { fn push(&mut self, filename: DistFilename, file: File) {
match filename { match filename {
DistFilename::WheelFilename(inner) => self.wheels.push((inner, file)), DistFilename::WheelFilename(name) => self.wheels.push(VersionWheel { name, file }),
DistFilename::SourceDistFilename(inner) => self.source_dists.push((inner, file)), DistFilename::SourceDistFilename(name) => {
self.source_dists.push(VersionSourceDist { name, file })
}
} }
} }
pub fn all(self) -> impl Iterator<Item = (DistFilename, File)> { pub fn all(self) -> impl Iterator<Item = (DistFilename, File)> {
self.wheels self.wheels
.into_iter() .into_iter()
.map(|(filename, file)| (DistFilename::WheelFilename(filename), file)) .map(|VersionWheel { name, file }| (DistFilename::WheelFilename(name), file))
.chain( .chain(
self.source_dists self.source_dists
.into_iter() .into_iter()
.map(|(filename, file)| (DistFilename::SourceDistFilename(filename), file)), .map(|VersionSourceDist { name, file }| {
(DistFilename::SourceDistFilename(name), file)
}),
) )
} }
} }
#[derive(Default, Debug, Serialize, Deserialize)] #[derive(Debug, Serialize, Deserialize, rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)]
pub struct SimpleMetadata(BTreeMap<Version, VersionFiles>); #[archive(check_bytes)]
#[archive_attr(derive(Debug))]
pub struct VersionWheel {
pub name: WheelFilename,
pub file: File,
}
#[derive(Debug, Serialize, Deserialize, rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)]
#[archive(check_bytes)]
#[archive_attr(derive(Debug))]
pub struct VersionSourceDist {
pub name: SourceDistFilename,
pub file: File,
}
#[derive(
Default, Debug, Serialize, Deserialize, rkyv::Archive, rkyv::Deserialize, rkyv::Serialize,
)]
#[archive(check_bytes)]
#[archive_attr(derive(Debug))]
pub struct SimpleMetadata(Vec<SimpleMetadatum>);
#[derive(Debug, Serialize, Deserialize, rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)]
#[archive(check_bytes)]
#[archive_attr(derive(Debug))]
pub struct SimpleMetadatum {
pub version: Version,
pub files: VersionFiles,
}
impl SimpleMetadata { impl SimpleMetadata {
pub fn iter(&self) -> impl DoubleEndedIterator<Item = (&Version, &VersionFiles)> { pub fn iter(&self) -> impl DoubleEndedIterator<Item = &SimpleMetadatum> {
self.0.iter() self.0.iter()
} }
fn from_files( fn from_files(files: Vec<pypi_types::File>, package_name: &PackageName, base: &str) -> Self {
files: Vec<pypi_types::File>, let mut map: BTreeMap<Version, VersionFiles> = BTreeMap::default();
package_name: &PackageName,
base: &BaseUrl,
) -> Self {
let mut metadata = Self::default();
// Group the distributions by version and kind // Group the distributions by version and kind
for file in files { for file in files {
@ -553,7 +587,7 @@ impl SimpleMetadata {
continue; continue;
} }
}; };
match metadata.0.entry(version.clone()) { match map.entry(version.clone()) {
std::collections::btree_map::Entry::Occupied(mut entry) => { std::collections::btree_map::Entry::Occupied(mut entry) => {
entry.get_mut().push(filename, file); entry.get_mut().push(filename, file);
} }
@ -565,14 +599,17 @@ impl SimpleMetadata {
} }
} }
} }
SimpleMetadata(
metadata map.into_iter()
.map(|(version, files)| SimpleMetadatum { version, files })
.collect(),
)
} }
} }
impl IntoIterator for SimpleMetadata { impl IntoIterator for SimpleMetadata {
type Item = (Version, VersionFiles); type Item = SimpleMetadatum;
type IntoIter = std::collections::btree_map::IntoIter<Version, VersionFiles>; type IntoIter = std::vec::IntoIter<SimpleMetadatum>;
fn into_iter(self) -> Self::IntoIter { fn into_iter(self) -> Self::IntoIter {
self.0.into_iter() self.0.into_iter()
@ -607,12 +644,10 @@ impl MediaType {
mod tests { mod tests {
use std::str::FromStr; use std::str::FromStr;
use url::Url;
use puffin_normalize::PackageName; use puffin_normalize::PackageName;
use pypi_types::{BaseUrl, SimpleJson}; use pypi_types::SimpleJson;
use crate::SimpleMetadata; use crate::{SimpleMetadata, SimpleMetadatum};
#[test] #[test]
fn ignore_failing_files() { fn ignore_failing_files() {
@ -650,15 +685,15 @@ mod tests {
} }
"#; "#;
let data: SimpleJson = serde_json::from_str(response).unwrap(); let data: SimpleJson = serde_json::from_str(response).unwrap();
let base = BaseUrl::from(Url::from_str("https://pypi.org/simple/pyflyby/").unwrap()); let base = "https://pypi.org/simple/pyflyby/";
let simple_metadata = SimpleMetadata::from_files( let simple_metadata = SimpleMetadata::from_files(
data.files, data.files,
&PackageName::from_str("pyflyby").unwrap(), &PackageName::from_str("pyflyby").unwrap(),
&base, base,
); );
let versions: Vec<String> = simple_metadata let versions: Vec<String> = simple_metadata
.iter() .iter()
.map(|(version, _)| version.to_string()) .map(|SimpleMetadatum { version, .. }| version.to_string())
.collect(); .collect();
assert_eq!(versions, ["1.7.8".to_string()]); assert_eq!(versions, ["1.7.8".to_string()]);
} }
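
The `from_files` change above still groups files in a `BTreeMap`, but then flattens the map into a `Vec<SimpleMetadatum>`. A minimal, std-only sketch of that shape follows. `Metadatum`, `group_by_version`, the `(u32, u32)` version and the `String` file names are hypothetical stand-ins, not puffin's actual types.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for `SimpleMetadatum`: one version plus its files.
#[derive(Debug, PartialEq)]
pub struct Metadatum {
    pub version: (u32, u32),
    pub files: Vec<String>,
}

/// Group `(version, filename)` pairs by version, then flatten the map into
/// a `Vec` ordered by version.
pub fn group_by_version(files: Vec<((u32, u32), String)>) -> Vec<Metadatum> {
    let mut map: BTreeMap<(u32, u32), Vec<String>> = BTreeMap::new();
    for (version, filename) in files {
        map.entry(version).or_default().push(filename);
    }
    // `BTreeMap::into_iter` yields entries in ascending key order, so the
    // resulting `Vec` is sorted by version without an explicit sort.
    map.into_iter()
        .map(|(version, files)| Metadatum { version, files })
        .collect()
}
```

Callers can then destructure each element, as the updated call sites in this diff do with `SimpleMetadatum { version, files }`, and reverse iteration still walks versions from newest to oldest.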

@ -49,7 +49,7 @@ async fn find_latest_version(
package_name: &PackageName, package_name: &PackageName,
) -> Option<Version> { ) -> Option<Version> {
let (_, simple_metadata) = client.simple(package_name).await.ok()?; let (_, simple_metadata) = client.simple(package_name).await.ok()?;
let (version, _) = simple_metadata.into_iter().next()?; let version = simple_metadata.into_iter().next()?.version;
Some(version.clone()) Some(version.clone())
} }

@ -92,9 +92,9 @@ impl<'a, Context: BuildContext + Send + Sync> DistributionDatabase<'a, Context>
} }
let url = match &wheel.file.url { let url = match &wheel.file.url {
FileLocation::RelativeUrl(base, url) => base FileLocation::RelativeUrl(base, url) => {
.join_relative(url) pypi_types::base_url_join_relative(base, url)?
.map_err(|err| Error::Url(url.clone(), err))?, }
FileLocation::AbsoluteUrl(url) => { FileLocation::AbsoluteUrl(url) => {
Url::parse(url).map_err(|err| Error::Url(url.clone(), err))? Url::parse(url).map_err(|err| Error::Url(url.clone(), err))?
} }

@ -14,6 +14,8 @@ pub enum Error {
// Network error // Network error
#[error("Failed to parse URL: `{0}`")] #[error("Failed to parse URL: `{0}`")]
Url(String, #[source] url::ParseError), Url(String, #[source] url::ParseError),
#[error(transparent)]
JoinRelativeUrl(#[from] pypi_types::JoinRelativeError),
#[error("Git operation failed")] #[error("Git operation failed")]
Git(#[source] anyhow::Error), Git(#[source] anyhow::Error),
#[error(transparent)] #[error(transparent)]

@ -105,9 +105,9 @@ impl<'a, T: BuildContext> SourceDistCachedBuilder<'a, T> {
} }
SourceDist::Registry(registry_source_dist) => { SourceDist::Registry(registry_source_dist) => {
let url = match &registry_source_dist.file.url { let url = match &registry_source_dist.file.url {
FileLocation::RelativeUrl(base, url) => base FileLocation::RelativeUrl(base, url) => {
.join_relative(url) pypi_types::base_url_join_relative(base, url)?
.map_err(|err| Error::Url(url.clone(), err))?, }
FileLocation::AbsoluteUrl(url) => { FileLocation::AbsoluteUrl(url) => {
Url::parse(url).map_err(|err| Error::Url(url.clone(), err))? Url::parse(url).map_err(|err| Error::Url(url.clone(), err))?
} }
@ -182,9 +182,9 @@ impl<'a, T: BuildContext> SourceDistCachedBuilder<'a, T> {
} }
SourceDist::Registry(registry_source_dist) => { SourceDist::Registry(registry_source_dist) => {
let url = match &registry_source_dist.file.url { let url = match &registry_source_dist.file.url {
FileLocation::RelativeUrl(base, url) => base FileLocation::RelativeUrl(base, url) => {
.join_relative(url) pypi_types::base_url_join_relative(base, url)?
.map_err(|err| Error::Url(url.clone(), err))?, }
FileLocation::AbsoluteUrl(url) => { FileLocation::AbsoluteUrl(url) => {
Url::parse(url).map_err(|err| Error::Url(url.clone(), err))? Url::parse(url).map_err(|err| Error::Url(url.clone(), err))?
} }

@ -6,3 +6,4 @@ description = "Normalization for distribution, package and extra names"
[dependencies] [dependencies]
serde = { workspace = true, features = ["derive"] } serde = { workspace = true, features = ["derive"] }
rkyv = { workspace = true, features = ["strict", "validation"] }

@ -11,7 +11,21 @@ use crate::{validate_and_normalize_owned, validate_and_normalize_ref, InvalidNam
/// down to a single `-`, e.g., `---`, `.`, and `__` all get converted to just `-`. /// down to a single `-`, e.g., `---`, `.`, and `__` all get converted to just `-`.
/// ///
/// See: <https://packaging.python.org/en/latest/specifications/name-normalization/> /// See: <https://packaging.python.org/en/latest/specifications/name-normalization/>
#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord, Serialize)] #[derive(
Debug,
Clone,
PartialEq,
Eq,
Hash,
PartialOrd,
Ord,
Serialize,
rkyv::Archive,
rkyv::Deserialize,
rkyv::Serialize,
)]
#[archive(check_bytes)]
#[archive_attr(derive(Debug))]
pub struct PackageName(String); pub struct PackageName(String);
impl PackageName { impl PackageName {

@ -11,7 +11,9 @@ use distribution_filename::DistFilename;
use distribution_types::{Dist, IndexUrl, Resolution}; use distribution_types::{Dist, IndexUrl, Resolution};
use pep508_rs::{Requirement, VersionOrUrl}; use pep508_rs::{Requirement, VersionOrUrl};
use platform_tags::Tags; use platform_tags::Tags;
use puffin_client::{FlatDistributions, FlatIndex, RegistryClient, SimpleMetadata}; use puffin_client::{
FlatDistributions, FlatIndex, RegistryClient, SimpleMetadata, SimpleMetadatum,
};
use puffin_interpreter::Interpreter; use puffin_interpreter::Interpreter;
use puffin_normalize::PackageName; use puffin_normalize::PackageName;
@ -158,7 +160,7 @@ impl<'a> DistFinder<'a> {
(None, None, None) (None, None, None)
}; };
for (version, files) in metadata.into_iter().rev() { for SimpleMetadatum { version, files } in metadata.into_iter().rev() {
// If we iterated past the first-compatible version, break. // If we iterated past the first-compatible version, break.
if best_version if best_version
.as_ref() .as_ref()
@ -174,31 +176,30 @@ impl<'a> DistFinder<'a> {
if !no_binary { if !no_binary {
// Find the most-compatible wheel // Find the most-compatible wheel
for (wheel, file) in files.wheels { for version_wheel in files.wheels {
// Only add dists compatible with the python version. // Only add dists compatible with the python version.
// This is relevant for source dists which give no other indication of their // This is relevant for source dists which give no other indication of their
// compatibility and wheels which may be tagged `py3-none-any` but // compatibility and wheels which may be tagged `py3-none-any` but
// have `requires-python: ">=3.9"` // have `requires-python: ">=3.9"`
if !file if !version_wheel.file.requires_python.as_ref().map_or(
.requires_python true,
.as_ref() |requires_python| {
.map_or(true, |requires_python| {
requires_python.contains(self.interpreter.python_version()) requires_python.contains(self.interpreter.python_version())
}) },
{ ) {
continue; continue;
} }
best_version = Some(version.clone()); best_version = Some(version.clone());
if let Some(priority) = wheel.compatibility(self.tags) { if let Some(priority) = version_wheel.name.compatibility(self.tags) {
if best_wheel if best_wheel
.as_ref() .as_ref()
.map_or(true, |(.., existing)| priority > *existing) .map_or(true, |(.., existing)| priority > *existing)
{ {
best_wheel = Some(( best_wheel = Some((
Dist::from_registry( Dist::from_registry(
DistFilename::WheelFilename(wheel), DistFilename::WheelFilename(version_wheel.name),
file, version_wheel.file,
index.clone(), index.clone(),
), ),
priority, priority,
@ -210,25 +211,24 @@ impl<'a> DistFinder<'a> {
// Find the most-compatible sdist, if no wheel was found. // Find the most-compatible sdist, if no wheel was found.
if best_wheel.is_none() { if best_wheel.is_none() {
for (source_dist, file) in files.source_dists { for version_sdist in files.source_dists {
// Only add dists compatible with the python version. // Only add dists compatible with the python version.
// This is relevant for source dists which give no other indication of their // This is relevant for source dists which give no other indication of their
// compatibility and wheels which may be tagged `py3-none-any` but // compatibility and wheels which may be tagged `py3-none-any` but
// have `requires-python: ">=3.9"` // have `requires-python: ">=3.9"`
if !file if !version_sdist.file.requires_python.as_ref().map_or(
.requires_python true,
.as_ref() |requires_python| {
.map_or(true, |requires_python| {
requires_python.contains(self.interpreter.python_version()) requires_python.contains(self.interpreter.python_version())
}) },
{ ) {
continue; continue;
} }
best_version = Some(source_dist.version.clone()); best_version = Some(version_sdist.name.version.clone());
best_sdist = Some(Dist::from_registry( best_sdist = Some(Dist::from_registry(
DistFilename::SourceDistFilename(source_dist), DistFilename::SourceDistFilename(version_sdist.name),
file, version_sdist.file,
index.clone(), index.clone(),
)); ));
} }

@ -1,7 +1,7 @@
use anyhow::Result; use anyhow::Result;
use pubgrub::range::Range; use pubgrub::range::Range;
use pep440_rs::{Operator, Version, VersionSpecifier}; use pep440_rs::{Operator, PreRelease, Version, VersionSpecifier};
use crate::ResolveError; use crate::ResolveError;
@ -68,10 +68,9 @@ impl TryFrom<&VersionSpecifier> for PubGrubSpecifier {
if let Some(post) = high.post() { if let Some(post) = high.post() {
high = high.with_post(Some(post + 1)); high = high.with_post(Some(post + 1));
} else if let Some(pre) = high.pre() { } else if let Some(pre) = high.pre() {
high = high.with_pre(Some(match pre { high = high.with_pre(Some(PreRelease {
(pep440_rs::PreRelease::Rc, n) => (pep440_rs::PreRelease::Rc, n + 1), kind: pre.kind,
(pep440_rs::PreRelease::Alpha, n) => (pep440_rs::PreRelease::Alpha, n + 1), number: pre.number + 1,
(pep440_rs::PreRelease::Beta, n) => (pep440_rs::PreRelease::Beta, n + 1),
})); }));
} else { } else {
let mut release = high.release().to_vec(); let mut release = high.release().to_vec();
@ -86,10 +85,9 @@ impl TryFrom<&VersionSpecifier> for PubGrubSpecifier {
if let Some(post) = high.post() { if let Some(post) = high.post() {
high = high.with_post(Some(post + 1)); high = high.with_post(Some(post + 1));
} else if let Some(pre) = high.pre() { } else if let Some(pre) = high.pre() {
high = high.with_pre(Some(match pre { high = high.with_pre(Some(PreRelease {
(pep440_rs::PreRelease::Rc, n) => (pep440_rs::PreRelease::Rc, n + 1), kind: pre.kind,
(pep440_rs::PreRelease::Alpha, n) => (pep440_rs::PreRelease::Alpha, n + 1), number: pre.number + 1,
(pep440_rs::PreRelease::Beta, n) => (pep440_rs::PreRelease::Beta, n + 1),
})); }));
} else { } else {
let mut release = high.release().to_vec(); let mut release = high.release().to_vec();
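
The hunks above replace a three-arm tuple match with a single struct update, which works because the pre-release kind and its number are now named fields. A small sketch of that shape is below. The definitions are hypothetical mirrors written for illustration; the real `pep440_rs::PreRelease` may differ, and the `u64` number type in particular is an assumption.

```rust
/// Hypothetical mirror of the pre-release kinds named in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PreReleaseKind {
    Alpha,
    Beta,
    Rc,
}

/// Hypothetical struct form of a pre-release: a kind plus a number.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PreRelease {
    pub kind: PreReleaseKind,
    pub number: u64, // assumed integer type
}

/// Bump the pre-release number while keeping the kind unchanged.
/// With named fields, no per-kind `match` arm is needed.
pub fn bump(pre: PreRelease) -> PreRelease {
    PreRelease {
        kind: pre.kind,
        number: pre.number + 1,
    }
}
```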

@ -5,7 +5,7 @@ use std::sync::Arc;
use anyhow::Result; use anyhow::Result;
use dashmap::{DashMap, DashSet}; use dashmap::{DashMap, DashSet};
use futures::channel::mpsc::UnboundedReceiver; use futures::channel::mpsc::UnboundedReceiver;
use futures::{pin_mut, FutureExt, StreamExt}; use futures::{FutureExt, StreamExt};
use itertools::Itertools; use itertools::Itertools;
use pubgrub::error::PubGrubError; use pubgrub::error::PubGrubError;
use pubgrub::range::Range; use pubgrub::range::Range;
@ -202,14 +202,10 @@ impl<'a, Provider: ResolverProvider> Resolver<'a, Provider> {
let (request_sink, request_stream) = futures::channel::mpsc::unbounded(); let (request_sink, request_stream) = futures::channel::mpsc::unbounded();
// Run the fetcher. // Run the fetcher.
let requests_fut = self.fetch(request_stream); let requests_fut = self.fetch(request_stream).fuse();
// Run the solver. // Run the solver.
let resolve_fut = self.solve(&request_sink); let resolve_fut = self.solve(&request_sink).fuse();
let requests_fut = requests_fut.fuse();
let resolve_fut = resolve_fut.fuse();
pin_mut!(requests_fut, resolve_fut);
let resolution = select! { let resolution = select! {
result = requests_fut => { result = requests_fut => {

@ -8,7 +8,7 @@ use distribution_filename::DistFilename;
use distribution_types::{Dist, IndexUrl, PrioritizedDistribution, ResolvableDist}; use distribution_types::{Dist, IndexUrl, PrioritizedDistribution, ResolvableDist};
use pep440_rs::Version; use pep440_rs::Version;
use platform_tags::Tags; use platform_tags::Tags;
use puffin_client::{FlatDistributions, SimpleMetadata}; use puffin_client::{FlatDistributions, SimpleMetadata, SimpleMetadatum};
use puffin_normalize::PackageName; use puffin_normalize::PackageName;
use puffin_traits::NoBinary; use puffin_traits::NoBinary;
use puffin_warnings::warn_user_once; use puffin_warnings::warn_user_once;
@ -48,13 +48,13 @@ impl VersionMap {
}; };
// Collect compatible distributions. // Collect compatible distributions.
for (version, files) in metadata { for SimpleMetadatum { version, files } in metadata {
for (filename, file) in files.all() { for (filename, file) in files.all() {
// Support resolving as if it were an earlier timestamp, at least as long as files have // Support resolving as if it were an earlier timestamp, at least as long as files have
// upload time information. // upload time information.
if let Some(exclude_newer) = exclude_newer { if let Some(exclude_newer) = exclude_newer {
match file.upload_time.as_ref() { match file.upload_time_utc_ms.as_ref() {
Some(upload_time) if upload_time >= exclude_newer => { Some(&upload_time) if upload_time >= exclude_newer.timestamp_millis() => {
continue; continue;
} }
None => { None => {
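
The `exclude_newer` check above now compares plain millisecond timestamps instead of date-time values. A minimal sketch of that comparison, using only `i64` values, is below. `is_too_new` is a hypothetical helper, and because the diff elides the branch for files without an upload time, the sketch returns `None` in that case and leaves the decision to the caller.

```rust
/// Decide whether a file was uploaded at or after the cutoff.
///
/// Both arguments are milliseconds since the Unix epoch (UTC).
/// Returns `None` when the upload time is unknown.
pub fn is_too_new(upload_time_utc_ms: Option<i64>, exclude_newer_ms: i64) -> Option<bool> {
    // A file uploaded exactly at the cutoff is excluded, matching the `>=`
    // comparison in the diff.
    upload_time_utc_ms.map(|upload_time| upload_time >= exclude_newer_ms)
}
```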

@ -13,14 +13,15 @@ license = { workspace = true }
workspace = true workspace = true
[dependencies] [dependencies]
pep440_rs = { path = "../pep440-rs", features = ["serde"] } pep440_rs = { path = "../pep440-rs", features = ["rkyv", "serde"] }
pep508_rs = { path = "../pep508-rs", features = ["serde"] } pep508_rs = { path = "../pep508-rs", features = ["rkyv", "serde"] }
puffin-normalize = { path = "../puffin-normalize" } puffin-normalize = { path = "../puffin-normalize" }
chrono = { workspace = true, features = ["serde"] } chrono = { workspace = true, features = ["serde"] }
mailparse = { workspace = true } mailparse = { workspace = true }
once_cell = { workspace = true } once_cell = { workspace = true }
regex = { workspace = true } regex = { workspace = true }
rkyv = { workspace = true, features = ["strict", "validation"] }
serde = { workspace = true } serde = { workspace = true }
thiserror = { workspace = true } thiserror = { workspace = true }
tracing = { workspace = true } tracing = { workspace = true }

@ -1,6 +1,47 @@
use serde::{Deserialize, Serialize}; use serde::{Deserialize, Serialize};
use url::Url; use url::Url;
/// Join a possibly relative URL to a base URL.
///
/// When `maybe_relative` is not relative, then it is parsed and returned with
/// `base` being ignored.
///
/// This is useful for parsing URLs that may be absolute or relative, with a
/// known base URL, and that doesn't require having already parsed a `BaseUrl`.
pub fn base_url_join_relative(base: &str, maybe_relative: &str) -> Result<Url, JoinRelativeError> {
match Url::parse(maybe_relative) {
Ok(absolute) => Ok(absolute),
Err(err) => {
if err == url::ParseError::RelativeUrlWithoutBase {
let base = Url::parse(base).map_err(|err| JoinRelativeError {
original: base.to_string(),
source: err,
})?;
base.join(maybe_relative).map_err(|err| JoinRelativeError {
original: format!("{base}/{maybe_relative}"),
source: err,
})
} else {
Err(JoinRelativeError {
original: maybe_relative.to_string(),
source: err,
})
}
}
}
}
/// An error that occurs when `base_url_join_relative` fails.
///
/// The error message includes the URL (`base` or `maybe_relative`) passed to
/// `base_url_join_relative` that provoked the error.
#[derive(Clone, Debug, thiserror::Error)]
#[error("Failed to parse URL: `{original}`")]
pub struct JoinRelativeError {
original: String,
source: url::ParseError,
}
#[derive(Debug, Clone, Hash, Eq, PartialEq, Serialize, Deserialize)] #[derive(Debug, Clone, Hash, Eq, PartialEq, Serialize, Deserialize)]
pub struct BaseUrl( pub struct BaseUrl(
#[serde( #[serde(

@ -68,7 +68,11 @@ where
)) ))
} }
#[derive(Debug, Clone, Serialize, Deserialize)] #[derive(
Debug, Clone, Serialize, Deserialize, rkyv::Archive, rkyv::Deserialize, rkyv::Serialize,
)]
#[archive(check_bytes)]
#[archive_attr(derive(Debug))]
#[serde(untagged)] #[serde(untagged)]
pub enum DistInfoMetadata { pub enum DistInfoMetadata {
Bool(bool), Bool(bool),
@ -84,7 +88,11 @@ impl DistInfoMetadata {
} }
} }
#[derive(Debug, Clone, Serialize, Deserialize)] #[derive(
Debug, Clone, Serialize, Deserialize, rkyv::Archive, rkyv::Deserialize, rkyv::Serialize,
)]
#[archive(check_bytes)]
#[archive_attr(derive(Debug))]
#[serde(untagged)] #[serde(untagged)]
pub enum Yanked { pub enum Yanked {
Bool(bool), Bool(bool),
@ -104,7 +112,23 @@ impl Yanked {
/// ///
/// PEP 691 says multiple hashes can be included and the interpretation is left to the client, we /// PEP 691 says multiple hashes can be included and the interpretation is left to the client, we
/// only support SHA 256 atm. /// only support SHA 256 atm.
#[derive(Debug, Clone, Ord, PartialOrd, Eq, PartialEq, Hash, Default, Serialize, Deserialize)] #[derive(
Debug,
Clone,
Ord,
PartialOrd,
Eq,
PartialEq,
Hash,
Default,
Serialize,
Deserialize,
rkyv::Archive,
rkyv::Deserialize,
rkyv::Serialize,
)]
#[archive(check_bytes)]
#[archive_attr(derive(Debug))]
pub struct Hashes { pub struct Hashes {
pub sha256: Option<String>, pub sha256: Option<String>,
} }

@ -13,8 +13,8 @@ license = { workspace = true }
workspace = true workspace = true
[dependencies] [dependencies]
pep440_rs = { path = "../pep440-rs", features = ["serde"] } pep440_rs = { path = "../pep440-rs", features = ["rkyv", "serde"] }
pep508_rs = { path = "../pep508-rs", features = ["serde"] } pep508_rs = { path = "../pep508-rs", features = ["rkyv", "serde"] }
puffin-fs = { path = "../puffin-fs" } puffin-fs = { path = "../puffin-fs" }
puffin-normalize = { path = "../puffin-normalize" } puffin-normalize = { path = "../puffin-normalize" }