Add an option to bytecode compile during installation (#2086)

Add a `--compile` option to `pip install` and `pip sync`.

I chose to implement this as a separate pass over the entire venv. If we
wanted to compile during installation, we'd have to make sure that
writing is exclusive, to avoid concurrent processes writing broken
`.pyc` files. Additionally, this ensures that the entire site-packages
directory is bytecode compiled, even if it contains packages that weren't
installed by this `uv` invocation. The disadvantage is that we do not update RECORD and
rely on this comment from [PEP 491](https://peps.python.org/pep-0491/):

> Uninstallers should be smart enough to remove .pyc even if it is not
mentioned in RECORD.

If this is a problem we can change it to run during installation and
write RECORD entries.

Internally, this is implemented as an async work-stealing subprocess
worker pool. The producer is a directory traversal over site-packages,
sending each `.py` file to a bounded async FIFO queue/channel. Each
worker has a long-running Python process. It pops the queue to get a
single path (or exits if the channel is closed), sends it to the
process's stdin, waits until it's informed through a line on stdout that
the compilation is done, and repeats. This is fast: e.g. when installing
`jupyter plotly` on Python 3.12, it processes 15876 files in 319ms with
32 threads (vs. 3.8s with a single core). Each Python process internally
calls `compileall.compile_file`, the same as pip.
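
The queue-and-workers shape described above can be sketched with only the standard library. This is an illustrative sketch, not the code in this PR: it swaps the async channel for a bounded `std::sync::mpsc::sync_channel` and OS threads, and `compile_one` is a hypothetical stub standing in for the stdin/stdout round trip to a long-running Python process.

```rust
use std::path::PathBuf;
use std::sync::mpsc::{sync_channel, Receiver};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for "send the path to the Python process and wait
/// for its reply line". Here it only checks the file extension.
fn compile_one(path: &PathBuf) -> bool {
    path.extension().is_some_and(|ext| ext == "py")
}

/// One worker: pop a path, process it, repeat; exit once the queue is
/// closed and drained.
fn worker(queue: Arc<Mutex<Receiver<PathBuf>>>) -> usize {
    let mut compiled = 0;
    loop {
        // The lock guard is dropped at the end of this statement, so the
        // lock is held only while popping, not while compiling.
        let next = queue.lock().unwrap().recv();
        match next {
            Ok(path) => {
                if compile_one(&path) {
                    compiled += 1;
                }
            }
            // The sender was dropped and no items remain.
            Err(_) => return compiled,
        }
    }
}

/// Producer: feed every path into a bounded FIFO queue shared by `workers`
/// threads, then close the queue and sum the per-worker counts.
pub fn compile_all(paths: Vec<PathBuf>, workers: usize) -> usize {
    let workers = workers.max(1);
    // Bounded queue: `send` blocks when the workers fall behind.
    let (sender, receiver) = sync_channel::<PathBuf>(workers * 10);
    let receiver = Arc::new(Mutex::new(receiver));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let queue = Arc::clone(&receiver);
            thread::spawn(move || worker(queue))
        })
        .collect();

    for path in paths {
        sender.send(path).expect("all workers exited early");
    }
    // Closing the channel is what tells the workers to stop.
    drop(sender);

    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker panicked"))
        .sum()
}
```

Calling `compile_all(paths, 4)` returns how many of the given paths the stub accepted. The bounded queue keeps the directory traversal from running far ahead of the workers, and dropping the sender is the only shutdown signal the workers need.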

Like pip, we ignore and silence all compilation errors
(https://github.com/astral-sh/uv/issues/1559). There is a 10s timeout to
handle the case where the workers get stuck. Reviewers, please check
whether I missed any spots where we could deadlock; this is the hardest
part of this PR.
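
To illustrate the timeout idea, here is a minimal sketch of waiting for a worker's "done" signal with a deadline. It is not the PR's implementation: it uses a plain thread and `std::sync::mpsc::Receiver::recv_timeout` instead of an async timeout around a subprocess, and the names `Outcome` and `wait_for_worker` are hypothetical.

```rust
use std::sync::mpsc::{channel, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical result type for one wait on a worker.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Done,
    TimedOut,
    WorkerDied,
}

/// Run `job` on a separate thread and wait at most `timeout` for it to
/// report completion.
pub fn wait_for_worker<F>(job: F, timeout: Duration) -> Outcome
where
    F: FnOnce() + Send + 'static,
{
    let (done_tx, done_rx) = channel::<()>();
    thread::spawn(move || {
        job();
        // Ignore the error: the waiter may already have given up.
        let _ = done_tx.send(());
    });

    match done_rx.recv_timeout(timeout) {
        Ok(()) => Outcome::Done,
        // The worker is stuck (or just slow); the caller decides what to do.
        Err(RecvTimeoutError::Timeout) => Outcome::TimedOut,
        // The sender was dropped without a signal, e.g. the job panicked.
        Err(RecvTimeoutError::Disconnected) => Outcome::WorkerDied,
    }
}
```

With a 10s budget as in the description, this would be called as `wait_for_worker(job, Duration::from_secs(10))`. Distinguishing a timeout from a dead worker lets the caller report a stuck pool instead of hanging.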

I've added `uv-dev compile <dir>` and `uv-dev clear-compile <dir>`
commands, mainly for my own benchmarking. I don't want to expose them in
`uv`: they are almost certainly not the correct workflow and we don't
want to support them.

Fixes #1788
Closes #1559
Closes #1928
konsti 2024-03-05 04:35:24 +01:00 committed by GitHub
parent 93b1395daa
commit 2a53e789b0
17 changed files with 583 additions and 17 deletions

@@ -61,6 +61,7 @@ tracing-durations-export = { workspace = true, features = ["plot"] }
tracing-indicatif = { workspace = true }
tracing-subscriber = { workspace = true }
url = { workspace = true }
walkdir = { workspace = true }
which = { workspace = true }
[target.'cfg(target_os = "windows")'.dependencies]

@@ -0,0 +1,33 @@
use std::path::PathBuf;

use clap::Parser;
use tracing::info;
use walkdir::WalkDir;

#[derive(Parser)]
pub(crate) struct ClearCompileArgs {
    /// Remove all `.pyc` files and `__pycache__` directories in this or any subdirectory
    root: PathBuf,
}

pub(crate) fn clear_compile(args: &ClearCompileArgs) -> anyhow::Result<()> {
    let mut removed_files = 0;
    let mut removed_directories = 0;
    // Visit directory contents before the directory itself, so that each
    // `__pycache__` is already empty of `.pyc` files when it is removed.
    for entry in WalkDir::new(&args.root).contents_first(true) {
        let entry = entry?;
        let metadata = entry.metadata()?;
        if metadata.is_file() {
            if entry.path().extension().is_some_and(|ext| ext == "pyc") {
                fs_err::remove_file(entry.path())?;
                removed_files += 1;
            }
        } else if metadata.is_dir() {
            if entry.file_name() == "__pycache__" {
                fs_err::remove_dir(entry.path())?;
                removed_directories += 1;
            }
        }
    }
    info!("Removed {removed_files} files and {removed_directories} directories");
    Ok(())
}

@@ -0,0 +1,37 @@
use std::path::PathBuf;

use clap::Parser;
use platform_host::Platform;
use tracing::info;
use uv_cache::{Cache, CacheArgs};
use uv_interpreter::PythonEnvironment;

#[derive(Parser)]
pub(crate) struct CompileArgs {
    /// Compile all `.py` in this or any subdirectory to bytecode
    root: PathBuf,
    python: Option<PathBuf>,
    #[command(flatten)]
    cache_args: CacheArgs,
}

pub(crate) async fn compile(args: CompileArgs) -> anyhow::Result<()> {
    let cache = Cache::try_from(args.cache_args)?;
    let interpreter = if let Some(python) = args.python {
        python
    } else {
        let platform = Platform::current()?;
        let venv = PythonEnvironment::from_virtualenv(platform, &cache)?;
        venv.python_executable().to_path_buf()
    };

    let files = uv_installer::compile_tree(
        &fs_err::canonicalize(args.root)?,
        &interpreter,
        cache.root(),
    )
    .await?;

    info!("Compiled {files} files");
    Ok(())
}

@@ -19,6 +19,8 @@ use tracing_subscriber::EnvFilter;
use resolve_many::ResolveManyArgs;
use crate::build::{build, BuildArgs};
use crate::clear_compile::ClearCompileArgs;
use crate::compile::CompileArgs;
use crate::install_many::InstallManyArgs;
use crate::render_benchmarks::RenderBenchmarksArgs;
use crate::resolve_cli::ResolveCliArgs;
@@ -41,6 +43,8 @@ static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
mod build;
mod clear_compile;
mod compile;
mod install_many;
mod render_benchmarks;
mod resolve_cli;
@@ -67,6 +71,10 @@ enum Cli {
Resolve(ResolveCliArgs),
WheelMetadata(WheelMetadataArgs),
RenderBenchmarks(RenderBenchmarksArgs),
/// Compile all `.py` to `.pyc` files in the tree.
Compile(CompileArgs),
/// Remove all `.pyc` in the tree.
ClearCompile(ClearCompileArgs),
}
#[instrument] // Anchor span to check for overhead
@@ -88,6 +96,8 @@ async fn run() -> Result<()> {
}
Cli::WheelMetadata(args) => wheel_metadata::wheel_metadata(args).await?,
Cli::RenderBenchmarks(args) => render_benchmarks::render_benchmarks(&args)?,
Cli::Compile(args) => compile::compile(args).await?,
Cli::ClearCompile(args) => clear_compile::clear_compile(&args)?,
}
Ok(())
}