change global allocator to jemalloc (and mimalloc on Windows) (#399)

This copies the allocator configuration used in the Ruff project. In
particular, this gives us an instant 10% win when resolving the top 1K
PyPI packages:

    $ hyperfine \
        "./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null" \
        "./target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null"
    Benchmark 1: ./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
      Time (mean ± σ):     974.2 ms ±  26.4 ms    [User: 17503.3 ms, System: 2205.3 ms]
      Range (min … max):   943.5 ms … 1015.9 ms    10 runs

    Benchmark 2: ./target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
      Time (mean ± σ):     883.1 ms ±  23.3 ms    [User: 14626.1 ms, System: 2542.2 ms]
      Range (min … max):   849.5 ms … 916.9 ms    10 runs

    Summary
      './target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null' ran
        1.10 ± 0.04 times faster than './target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null'

I was moved to do this because I noticed `malloc`/`free` taking up a
fairly sizeable percentage of time during light profiling.
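
For readers unfamiliar with how a Rust binary swaps its allocator, here is a minimal sketch of the `#[global_allocator]` hook. It does not reproduce this commit's configuration: instead of jemalloc or mimalloc it installs a hypothetical `CountingAlloc` wrapper around the standard `System` allocator, which also gives a rough way to see how often code allocates. The names `CountingAlloc`, `ALLOCATIONS`, and `count_allocations` are assumptions made for illustration.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper (not part of this commit): forwards every request to
/// the system allocator and counts how many allocations were made.
pub struct CountingAlloc;

/// Number of calls to `alloc` observed so far, across all threads.
pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Opting into a different allocator uses this same attribute; only the type
// of the static changes.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns its result together with the number of allocations
/// that happened in the meantime (allocations on other threads are included).
pub fn count_allocations<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    let out = f();
    let after = ALLOCATIONS.load(Ordering::Relaxed);
    (out, after - before)
}
```

Wrapping a hot path in `count_allocations` gives only a coarse count, so it complements a profiler rather than replacing one.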

As is becoming a pattern, it will be easier to review this
commit-by-commit.

Ref #396 (wouldn't call this issue fixed)

-----

I did also try adding a `smallvec` optimization to the
`Version::release` field, but it didn't bear any fruit. I still think
there is more to explore since the results I observed don't quite line
up with what I expect. (So probably either my mental model is off or my
measurement process is flawed.) You can see that attempt with a little
more explanation here:
f9528b4ecd

In the course of adding the `smallvec` optimization, I also shrank the
`Version` fields from `usize` to `u32`. They should at least be a
fixed-size integer since version numbers aren't used to index memory,
and I chose `u32` since it seems reasonable to assume that all
version numbers will be smaller than `2^32`.
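
To make the `u32` assumption concrete, here is a small sketch of parsing release segments into fixed-size integers. The `Version` struct and `parse_release` function below are simplified, hypothetical stand-ins and not the real types touched by this commit; the point is only that a segment of `2^32` or larger is rejected rather than silently accepted.

```rust
use std::num::ParseIntError;

/// Hypothetical, simplified stand-in for the real `Version` type.
#[derive(Debug, PartialEq, Eq)]
pub struct Version {
    /// Release segments, e.g. `[1, 2, 3]` for "1.2.3".
    pub release: Vec<u32>,
}

/// Parses a dotted release string such as "1.2.3".
/// Any segment that does not fit in a `u32` produces an error.
pub fn parse_release(s: &str) -> Result<Version, ParseIntError> {
    let release = s
        .split('.')
        .map(str::parse::<u32>)
        .collect::<Result<Vec<_>, _>>()?;
    Ok(Version { release })
}
```

With this sketch, `parse_release("1.2.3")` succeeds, while a segment such as `4294967296` (exactly `2^32`) returns an error.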
Andrew Gallant 2023-11-10 14:48:59 -05:00 committed by GitHub
parent d8408b1783
commit 63f7f65190
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
9 changed files with 129 additions and 44 deletions

@@ -907,9 +907,9 @@ mod tests {
 fn error_empty() {
 assert_err(
 "",
-indoc! {"
+indoc! {"\
 Empty field is not allowed for PEP508
 ^"
 },
 );
@@ -1123,7 +1123,7 @@ mod tests {
 "numpy ( >=1.19 ",
 indoc! {"
 Missing closing parenthesis (expected ')', found end of dependency specification)
-numpy ( >=1.19
+numpy ( >=1.19\x20
 ^"
 },
 );
@@ -1212,9 +1212,9 @@ mod tests {
 fn error_marker_incomplete2() {
 assert_err(
 r#"numpy; sys_platform == "#,
-indoc! {"
+indoc! {"\
 Expected marker value, found end of dependency specification
-numpy; sys_platform ==
+numpy; sys_platform ==\x20
 ^"
 },
 );
@@ -1224,10 +1224,10 @@ mod tests {
 fn error_marker_incomplete3() {
 assert_err(
 r#"numpy; sys_platform == "win32" or "#,
-indoc! {r#"
+indoc! {"
 Expected marker value, found end of dependency specification
-numpy; sys_platform == "win32" or
-^"#},
+numpy; sys_platform == \"win32\" or\x20
+^"},
 );
 }
@@ -1246,10 +1246,10 @@ mod tests {
 fn error_marker_incomplete5() {
 assert_err(
 r#"numpy; sys_platform == "win32" or (os_name == "linux" and "#,
-indoc! {r#"
+indoc! {"
 Expected marker value, found end of dependency specification
-numpy; sys_platform == "win32" or (os_name == "linux" and
-^"#},
+numpy; sys_platform == \"win32\" or (os_name == \"linux\" and\x20
+^"},
 );
 }
@@ -1331,7 +1331,7 @@ mod tests {
 r#"name @ "#,
 indoc! {"
 Expected URL
-name @
+name @\x20
 ^"
 },
 );
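
The test changes above replace a literal trailing space with `\x20`, presumably so the expected output survives tools that trim trailing whitespace. The sketch below shows the string-literal behavior involved, using plain `std` strings rather than `indoc!`. The function names and the caret position are illustrative assumptions. It also shows why the raw-string tests had to become ordinary strings with escaped quotes: a raw string keeps `\x20` as four literal characters.

```rust
/// Expected message written with explicit escapes.
pub fn expected_with_escapes() -> &'static str {
    // `"\` at the end of the first line drops the newline and any indentation
    // that follows it, so the text starts at "Expected URL".
    // `\x20` is an ordinary space, spelled so that it stays visible in source.
    "\
Expected URL
name @\x20
       ^"
}

/// Inside a raw string the same escape is not interpreted.
pub fn raw_keeps_backslash() -> &'static str {
    r"name @\x20"
}
```

The second line of `expected_with_escapes()` ends in a real space, while `raw_keeps_backslash()` ends in the characters `\`, `x`, `2`, `0`.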