An extremely fast Python package and project manager, written in Rust.
Andrew Gallant f9528b4ecd
pep440-rs: switch Version::release to smallvec
This commit attempts an optimization that switches a version's `release`
field over to a `smallvec` optimization. The idea is that most versions
are very small and can be stored inline.
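
As a rough illustration of the inline-storage idea (not the actual pep440-rs code, and not the `smallvec` crate's implementation), the sketch below keeps short release sequences in a fixed-size array and only falls back to a heap-allocated `Vec` for longer ones. The `Release` type, the `INLINE_CAP` constant, and the `u64` segment type are all assumptions made for this example.

```rust
// Hypothetical sketch: `Release`, `INLINE_CAP`, and the `u64` segment type are
// illustrative assumptions, not the types used by pep440-rs or smallvec.
const INLINE_CAP: usize = 4;

#[derive(Debug, Clone)]
enum Release {
    // Up to INLINE_CAP segments live inside the value itself: no heap allocation.
    Inline { len: u8, buf: [u64; INLINE_CAP] },
    // Longer releases fall back to an ordinary heap-allocated Vec.
    Heap(Vec<u64>),
}

impl Release {
    fn from_slice(segments: &[u64]) -> Release {
        if segments.len() <= INLINE_CAP {
            let mut buf = [0u64; INLINE_CAP];
            buf[..segments.len()].copy_from_slice(segments);
            Release::Inline { len: segments.len() as u8, buf }
        } else {
            Release::Heap(segments.to_vec())
        }
    }

    // Both variants expose the same slice view, so callers need not care
    // where the segments are stored.
    fn as_slice(&self) -> &[u64] {
        match self {
            Release::Inline { len, buf } => &buf[..*len as usize],
            Release::Heap(v) => v.as_slice(),
        }
    }

    fn is_inline(&self) -> bool {
        matches!(self, Release::Inline { .. })
    }
}

fn main() {
    let small = Release::from_slice(&[1, 2, 3]);
    let large = Release::from_slice(&[2023, 11, 10, 14, 30]);
    println!("{:?} inline={}", small.as_slice(), small.is_inline());
    println!("{:?} inline={}", large.as_slice(), large.is_inline());
}
```

The trade-off in a layout like this is that every value carries the full inline buffer, so the type gets larger even as short versions stop allocating.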

Interestingly, I was unable to observe any obvious benefit:

    $ hyperfine \
        "./target/profiling/puffin-dev-u32 resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null" \
        "./target/profiling/puffin-dev-smallvec-release resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null"
    Benchmark 1: ./target/profiling/puffin-dev-u32 resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
      Time (mean ± σ):     872.2 ms ±  26.5 ms    [User: 14646.0 ms, System: 2516.0 ms]
      Range (min … max):   833.0 ms … 912.0 ms    10 runs

    Benchmark 2: ./target/profiling/puffin-dev-smallvec-release resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
      Time (mean ± σ):     882.3 ms ±  17.4 ms    [User: 14764.4 ms, System: 2520.9 ms]
      Range (min … max):   859.7 ms … 912.7 ms    10 runs

    Summary
      './target/profiling/puffin-dev-u32 resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null' ran
        1.01 ± 0.04 times faster than './target/profiling/puffin-dev-smallvec-release resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null'

My hypothesis is that because of an earlier commit that switched the
global allocator to jemalloc, the cost of allocation had precipitously
decreased, to the point that the reduction in allocs from the smallvec
becomes a wash. To test my hypothesis, I dropped the jemalloc commit and
measured the perf of the smallvec optimization against main:

    $ hyperfine \
        "./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null" \
        "./target/profiling/puffin-dev-smallvec-release-no-jemalloc resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null"
    Benchmark 1: ./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
      Time (mean ± σ):     968.0 ms ±  20.0 ms    [User: 17637.4 ms, System: 2151.9 ms]
      Range (min … max):   940.2 ms … 1005.3 ms    10 runs

    Benchmark 2: ./target/profiling/puffin-dev-smallvec-release-no-jemalloc resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
      Time (mean ± σ):     958.4 ms ±  15.7 ms    [User: 17119.7 ms, System: 2246.1 ms]
      Range (min … max):   944.7 ms … 993.3 ms    10 runs

    Summary
      './target/profiling/puffin-dev-smallvec-release-no-jemalloc resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null' ran
        1.01 ± 0.03 times faster than './target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null'

Fiddlesticks. Even when allocation is (presumably) more expensive, the
smallvec optimization didn't help. This suggests something is off about
my mental model of the code. So there are more avenues to explore here!
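
To make the "no obvious benefit" reading concrete, the sketch below recomputes the relative-speed figures from the means and standard deviations reported in the two hyperfine runs above. It assumes the usual first-order error propagation for a ratio of two independent measurements. The `relative_speed` function is a name made up for this example and is not part of hyperfine.

```rust
/// Ratio of two mean runtimes with a first-order propagated standard deviation.
/// `fast` and `slow` are (mean, stddev) pairs in the same unit (milliseconds here).
/// Assumption: the two measurements are independent.
fn relative_speed(fast: (f64, f64), slow: (f64, f64)) -> (f64, f64) {
    let ratio = slow.0 / fast.0;
    // Relative variances add for a quotient of independent quantities.
    let rel_var = (slow.1 / slow.0).powi(2) + (fast.1 / fast.0).powi(2);
    (ratio, ratio * rel_var.sqrt())
}

fn main() {
    // With jemalloc: puffin-dev-u32 vs. puffin-dev-smallvec-release.
    let (r, s) = relative_speed((872.2, 26.5), (882.3, 17.4));
    println!("with jemalloc:    {:.2} ± {:.2}", r, s);

    // Without jemalloc: puffin-dev-smallvec-release-no-jemalloc vs. puffin-dev-main.
    let (r, s) = relative_speed((958.4, 15.7), (968.0, 20.0));
    println!("without jemalloc: {:.2} ± {:.2}", r, s);
}
```

In both comparisons the ratio differs from 1.0 by less than its propagated uncertainty, so neither run distinguishes the two binaries.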
2023-11-10 14:30:26 -05:00
.cargo Add basic CI via GitHub Actions (#10) 2023-10-05 13:42:58 -04:00
.github/workflows Add tests for puffin sync (#161) 2023-10-22 03:25:00 +00:00
crates pep440-rs: switch Version::release to smallvec 2023-11-10 14:30:26 -05:00
scripts scripts: make pypi 8k fetch script executable 2023-11-10 13:20:02 -05:00
vendor/pubgrub Upgrade PubGrub (#349) 2023-11-07 02:00:57 +00:00
workers/pypi-metadata Handle dist info casing mismatch in worker (#273) 2023-11-02 11:04:28 +00:00
.dockerignore Add docker builder (#238) 2023-11-02 12:03:56 +01:00
.gitignore Add docker builder (#238) 2023-11-02 12:03:56 +01:00
builder.dockerfile Add pkg-config to builder.dockerfile (#355) 2023-11-07 13:35:56 +00:00
Cargo.lock pep440-rs: switch Version::release to smallvec 2023-11-10 14:30:26 -05:00
Cargo.toml pep440-rs: switch Version::release to smallvec 2023-11-10 14:30:26 -05:00
CONTRIBUTING.md Add docker builder (#238) 2023-11-02 12:03:56 +01:00
LICENSE-APACHE Add README and LICENSE files 2023-10-05 12:45:38 -04:00
LICENSE-MIT Add README and LICENSE files 2023-10-05 12:45:38 -04:00
README.md Update README limitations (#363) 2023-11-08 03:01:33 +00:00
ruff.toml Unify python interpreter abstractions (#178) 2023-10-25 20:11:36 +00:00
rust-toolchain.toml Rust 1.73 2023-10-23 13:52:57 +02:00

puffin

An experimental Python packaging tool.

Motivation

Puffin is an extremely fast (experimental) Python package resolver and installer, intended to replace pip and pip-tools (pip-compile and pip-sync).

Puffin itself is not a complete "package manager", but rather a tool for locking dependencies (similar to pip-compile) and installing them (similar to pip-sync). Puffin can be used to generate a set of locked dependencies from a requirements.txt file, and then install those locked dependencies into a virtual environment.

Puffin represents an intermediary goal in our pursuit of building a "Cargo for Python": a Python package manager that is extremely fast, reliable, and easy to use -- capable of replacing not only pip, but also pipx, pip-tools, virtualenv, tox, setuptools, and even pyenv, by way of managing the Python installation itself.

Puffin's limited scope allows us to solve many of the low-level problems that are required to build such a package manager (like package installation) while shipping an immediately useful tool with a minimal barrier to adoption. Try it today in lieu of pip and pip-tools.

Features

  • Extremely fast dependency resolution and installation: install dependencies in sub-second time.
  • Disk-space efficient: Puffin uses a global cache to deduplicate dependencies, and uses Copy-on-Write on supported filesystems to reduce disk usage.

Limitations

Puffin does not yet support:

  • Windows
  • ...

Like pip-compile, Puffin generates a platform-specific requirements.txt file (unlike, e.g., poetry, which generates a platform-agnostic poetry.lock file). As such, Puffin's requirements.txt files are not portable across platforms and Python versions.

Usage

To resolve a requirements.in file:

cargo run -p puffin-cli -- pip-compile requirements.in

To install from a resolved requirements.txt file:

cargo run -p puffin-cli -- pip-sync requirements.txt

For more, see cargo run -p puffin-cli -- --help:

Usage: puffin [OPTIONS] <COMMAND>

Commands:
  pip-compile    Compile a `requirements.in` file to a `requirements.txt` file
  pip-sync       Sync dependencies from a `requirements.txt` file
  pip-uninstall  Uninstall packages from the current environment
  clean          Clear the cache
  freeze         Enumerate the installed packages in the current environment
  venv           Create a virtual environment
  add            Add a dependency to the workspace
  remove         Remove a dependency from the workspace
  help           Print this message or the help of the given subcommand(s)

Options:
  -q, --quiet                  Do not print any output
  -v, --verbose                Use verbose output
  -n, --no-cache               Avoid reading from or writing to the cache
      --cache-dir <CACHE_DIR>  Path to the cache directory [env: PUFFIN_CACHE_DIR=]
  -h, --help                   Print help
  -V, --version                Print version

License

Puffin is licensed under either of

  • Apache License, Version 2.0 (LICENSE-APACHE)
  • MIT license (LICENSE-MIT)

at your option.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in Puffin by you, as defined in the Apache-2.0 license, shall be dually licensed as above, without any additional terms or conditions.