uv/crates/bench/benches/distribution_filename.rs
Andrew Gallant 33c0901a28
distribution-filename: speed up is_compatible (#367)
This PR tweaks the representation of `Tags` in order to offer a
faster implementation of `WheelFilename::is_compatible`. We now use a
nested map of tags that lets us avoid looping over every supported
platform tag. As the code comments suggest, that is the essential gain.
We still do not mind looping over the tags in each wheel name since they
tend to be quite small. And pushing our thumb on that side of things can
make things worse overall since it would likely slow down WheelFilename
construction itself.

For micro-benchmarks, we improve considerably for compatibility
checking:

    $ critcmp base test3
group base test3
----- ---- -----
build_platform_tags/burntsushi-archlinux 1.00 46.2±0.28µs ? ?/sec 2.48
114.8±0.45µs ? ?/sec
wheelname_parsing/flyte-long-compatible 1.00 624.8±3.31ns 174.0 MB/sec
1.01 629.4±4.30ns 172.7 MB/sec
wheelname_parsing/flyte-long-incompatible 1.00 743.6±4.23ns 165.4 MB/sec
1.00 746.9±4.62ns 164.7 MB/sec
wheelname_parsing/flyte-short-compatible 1.00 526.7±4.76ns 54.3 MB/sec
1.01 530.2±5.81ns 54.0 MB/sec
wheelname_parsing/flyte-short-incompatible 1.00 540.4±4.93ns 60.0 MB/sec
1.01 545.7±5.31ns 59.4 MB/sec
wheelname_parsing_failure/flyte-long-extension 1.00 13.6±0.13ns 3.2
GB/sec 1.01 13.7±0.14ns 3.2 GB/sec
wheelname_parsing_failure/flyte-short-extension 1.00 14.0±0.20ns 1160.4
MB/sec 1.01 14.1±0.14ns 1146.5 MB/sec
wheelname_tag_compatibility/flyte-long-compatible 11.33 159.8±2.79ns
680.5 MB/sec 1.00 14.1±0.23ns 7.5 GB/sec
wheelname_tag_compatibility/flyte-long-incompatible 237.60
1671.8±37.99ns 73.6 MB/sec 1.00 7.0±0.08ns 17.1 GB/sec
wheelname_tag_compatibility/flyte-short-compatible 16.07 223.5±8.60ns
128.0 MB/sec 1.00 13.9±0.30ns 2.0 GB/sec
wheelname_tag_compatibility/flyte-short-incompatible 149.83 628.3±2.13ns
51.6 MB/sec 1.00 4.2±0.10ns 7.6 GB/sec

We do regress slightly on the time it takes for `Tags::new` to run, but
this is somewhat expected. And in absolute terms, 114us is perfectly
acceptable given that it's only executed ~once for each `puffin`
invocation.

Ad hoc benchmarks indicate an overall 25% perf improvement in `puffin
pip-compile` times. This roughly corresponds with how much time
`is_compatible` was taking. Indeed, profiling confirms that it has
virtually disappeared from the profile.

Fixes #157
2023-11-09 09:01:03 -05:00

162 lines
6.8 KiB
Rust

use {distribution_filename::WheelFilename, platform_tags::Tags};
use bench::criterion::{
criterion_group, criterion_main, measurement::WallTime, BenchmarkId, Criterion, Throughput,
};
/// A set of platform tags extracted from burntsushi's Archlinux workstation.
/// We could just re-create these via `Tags::from_env`, but those might differ
/// depending on the platform. This way, we always use the same data. It also
/// lets us assert tag compatibility regardless of where the benchmarks run.
const PLATFORM_TAGS: &[(&str, &str, &str)] = include!("../inputs/platform_tags.rs");
/// A set of wheel names used in the benchmarks below. We pick short and long
/// names, as well as compatible and not-compatibles (with `PLATFORM_TAGS`)
/// names.
///
/// The tuple is (name, filename, compatible) where `name` is a descriptive
/// name for humans used in the benchmark definition. And `filename` is the
/// actual wheel filename we want to benchmark operation on. And `compatible`
/// indicates whether the tags in the wheel filename are expected to be
/// compatible with the tags in `PLATFORM_TAGS`.
const WHEEL_NAMES: &[(&str, &str, bool)] = &[
// This tests a case with a very short name that is *not* compatible
// with PLATFORM_TAGS. It only uses one tag for each component (one
// Python version, one ABI and one platform).
(
"flyte-short-incompatible",
"hypothesis-4.24.5-py2-none-any.whl",
false,
),
// This tests a case with a very short name that *is* compatible with
// PLATFORM_TAGS. It only uses one tag for each component (one Python
// version, one ABI and one platform).
(
"flyte-short-compatible",
"ipython-2.1.0-py3-none-any.whl",
true,
),
// This tests a case with a long name that is *not* compatible. That
// is, all platform tags need to be checked against the tags in the
// wheel filename. This is essentially the worst possible practical
// case.
(
"flyte-long-incompatible",
"protobuf-3.5.2.post1-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl",
false,
),
// This tests a case with a long name that *is* compatible. We
// expect this to be (on average) quicker because the compatibility
// check stops as soon as a positive match is found. (Where as the
// incompatible case needs to check all tags.)
(
"flyte-long-compatible",
"coverage-6.6.0b1-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
true,
),
];
/// A list of names that are candidates for wheel filenames but will ultimately
/// fail to parse.
const INVALID_WHEEL_NAMES: &[(&str, &str)] = &[
("flyte-short-extension", "mock-5.1.0.tar.gz"),
(
"flyte-long-extension",
"Pillow-5.4.0.dev0-py3.7-macosx-10.13-x86_64.egg",
),
];
/// Benchmarks the construction of platform tags.
///
/// This only happens ~once per program startup. Originally, construction was
/// trivial. But to speed up `WheelFilename::is_compatible`, we added some
/// extra processing. We thus expect construction to become slower, but we
/// write a benchmark to ensure it is still "reasonable."
fn benchmark_build_platform_tags(c: &mut Criterion<WallTime>) {
let tags: Vec<(String, String, String)> = PLATFORM_TAGS
.iter()
.map(|&(py, abi, plat)| (py.to_string(), abi.to_string(), plat.to_string()))
.collect();
let mut group = c.benchmark_group("build_platform_tags");
group.bench_function(BenchmarkId::from_parameter("burntsushi-archlinux"), |b| {
b.iter(|| std::hint::black_box(Tags::new(tags.clone())));
});
group.finish();
}
/// Benchmarks `WheelFilename::from_str`. This has been observed to take some
/// non-trivial time in profiling (although, at time of writing, not as much
/// as tag compatibility). In the process of optimizing tag compatibility,
/// we tweaked wheel filename parsing. This benchmark was therefore added to
/// ensure we didn't regress here.
fn benchmark_wheelname_parsing(c: &mut Criterion<WallTime>) {
let mut group = c.benchmark_group("wheelname_parsing");
for (name, filename, _) in WHEEL_NAMES.iter().copied() {
let len = u64::try_from(filename.len()).expect("length fits in u64");
group.throughput(Throughput::Bytes(len));
group.bench_function(BenchmarkId::from_parameter(name), |b| {
b.iter(|| {
filename
.parse::<WheelFilename>()
.expect("valid wheel filename");
});
});
}
group.finish();
}
/// Benchmarks `WheelFilename::from_str` when it fails. This routine is called
/// on every filename in a package's metadata. A non-trivial portion of which
/// are not wheel filenames. Ensuring that the error path is fast is thus
/// probably a good idea.
fn benchmark_wheelname_parsing_failure(c: &mut Criterion<WallTime>) {
let mut group = c.benchmark_group("wheelname_parsing_failure");
for (name, filename) in INVALID_WHEEL_NAMES.iter().copied() {
let len = u64::try_from(filename.len()).expect("length fits in u64");
group.throughput(Throughput::Bytes(len));
group.bench_function(BenchmarkId::from_parameter(name), |b| {
b.iter(|| {
filename
.parse::<WheelFilename>()
.expect_err("invalid wheel filename");
});
});
}
group.finish();
}
/// Benchmarks the `WheelFilename::is_compatible` routine. This was revealed
/// to be the #1 bottleneck in the resolver. The main issue was that the
/// set of platform tags (generated once) is quite large, and the original
/// implementation did an exhaustive search over each of them for each tag in
/// the wheel filename.
fn benchmark_wheelname_tag_compatibility(c: &mut Criterion<WallTime>) {
let tags: Vec<(String, String, String)> = PLATFORM_TAGS
.iter()
.map(|&(py, abi, plat)| (py.to_string(), abi.to_string(), plat.to_string()))
.collect();
let tags = Tags::new(tags);
let mut group = c.benchmark_group("wheelname_tag_compatibility");
for (name, filename, expected) in WHEEL_NAMES.iter().copied() {
let wheelname: WheelFilename = filename.parse().expect("valid wheel filename");
let len = u64::try_from(filename.len()).expect("length fits in u64");
group.throughput(Throughput::Bytes(len));
group.bench_function(BenchmarkId::from_parameter(name), |b| {
b.iter(|| {
assert_eq!(expected, wheelname.is_compatible(&tags));
});
});
}
group.finish();
}
criterion_group!(
distribution_filename,
benchmark_build_platform_tags,
benchmark_wheelname_parsing,
benchmark_wheelname_parsing_failure,
benchmark_wheelname_tag_compatibility,
);
criterion_main!(distribution_filename);