Consolidate concurrency limits (#3493)

## Summary

This PR consolidates the concurrency limits used throughout `uv` and
exposes two limits, `UV_CONCURRENT_DOWNLOADS` and
`UV_CONCURRENT_BUILDS`, as environment variables.
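
As a rough illustration of how the two variables could be read, here is a minimal sketch using only the standard library. The `Concurrency` struct, the `parse_limit` helper, and the fallback values (50 and 4) are assumptions made for this example, not necessarily what `uv` does internally.

```rust
use std::env;
use std::num::NonZeroUsize;

/// Hypothetical container for the two user-facing limits.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Concurrency {
    pub downloads: usize,
    pub builds: usize,
}

/// Parse a limit from an optional raw value, falling back to `default`
/// when the value is missing, not a number, or zero.
pub fn parse_limit(raw: Option<&str>, default: usize) -> usize {
    raw.and_then(|value| value.trim().parse::<NonZeroUsize>().ok())
        .map(NonZeroUsize::get)
        .unwrap_or(default)
}

/// Read both limits from the environment.
/// The defaults here are placeholders chosen for the sketch.
pub fn concurrency_from_env() -> Concurrency {
    let downloads = env::var("UV_CONCURRENT_DOWNLOADS").ok();
    let builds = env::var("UV_CONCURRENT_BUILDS").ok();
    Concurrency {
        downloads: parse_limit(downloads.as_deref(), 50),
        builds: parse_limit(builds.as_deref(), 4),
    }
}
```

Rejecting zero keeps a typo from producing a limit that would never admit any task.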

Currently, `uv` has a number of concurrent streams that it buffers using
relatively arbitrary limits for backpressure. However, many of these
limits are conflated. We run a relatively small number of tasks overall
and should start most things as soon as possible. What we really want to
limit are three separate operations:
- File I/O. This is managed by tokio's blocking pool and we should not
really have to worry about it.
- Network I/O.
- Python build processes.

Because the current limits span a broad range of tasks, it's possible
that a limit meant for network I/O is occupied by tasks performing
builds, reading from the file system, or even waiting on a `OnceMap`. We
also don't limit build processes that are started as part of a download
task. While this may not pose a performance problem because our
limits are relatively high, it does mean that the limits do not do what
we want, making it tricky to expose them to users
(https://github.com/astral-sh/uv/issues/1205,
https://github.com/astral-sh/uv/issues/3311).

After this change, the limits on network I/O and build processes are
centralized and managed by semaphores. All other tasks are unbuffered
(note that these tasks are still bounded, so backpressure should not be
a problem).
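
The idea of starting every task immediately while gating only the limited operation behind a semaphore can be sketched as follows. This is an illustration, not the code in this PR: it uses OS threads and a hand-rolled counting semaphore so that it is self-contained, whereas `uv` is async. `Semaphore`, `Permit`, and `peak_concurrency` are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// A minimal counting semaphore (stands in for an async semaphore).
pub struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    pub fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Block until a permit is free, then take it.
    pub fn acquire(&self) -> Permit<'_> {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
        Permit { semaphore: self }
    }
}

/// Returns its permit to the semaphore when dropped.
pub struct Permit<'a> {
    semaphore: &'a Semaphore,
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        *self.semaphore.permits.lock().unwrap() += 1;
        self.semaphore.available.notify_one();
    }
}

/// Start `tasks` workers right away, but let at most `limit` of them hold a
/// "download" permit at once. Returns the peak number of simultaneous holders.
pub fn peak_concurrency(tasks: usize, limit: usize) -> usize {
    let downloads = Arc::new(Semaphore::new(limit));
    let active = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let (downloads, active, peak) = (downloads.clone(), active.clone(), peak.clone());
            thread::spawn(move || {
                // Only the limited operation holds the permit.
                let _permit = downloads.acquire();
                let now = active.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(5)); // simulated network I/O
                active.fetch_sub(1, Ordering::SeqCst);
                // `_permit` is dropped here, releasing the slot.
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

Calling `peak_concurrency(8, 2)` spawns all eight workers up front, yet the returned peak never exceeds 2, because the permit is the only thing that bounds the simulated network step.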
Ibraheem Ahmed 2024-05-10 12:43:08 -04:00 committed by GitHub
parent eab2b832a6
commit 783df8f657
GPG key ID: B5690EEEBB952194
35 changed files with 575 additions and 218 deletions

@@ -2,7 +2,8 @@ use std::borrow::Cow;
 use std::path::{Path, PathBuf};
 use anyhow::{Context, Result};
-use futures::{StreamExt, TryStreamExt};
+use futures::stream::FuturesOrdered;
+use futures::TryStreamExt;
 use url::Url;
 use distribution_types::{
@@ -10,7 +11,6 @@ use distribution_types::{
 };
 use pep508_rs::RequirementOrigin;
-use uv_client::RegistryClient;
 use uv_distribution::{DistributionDatabase, Reporter};
 use uv_fs::Simplified;
 use uv_resolver::{InMemoryIndex, MetadataResponse};
@@ -41,16 +41,15 @@ impl<'a, Context: BuildContext> SourceTreeResolver<'a, Context> {
         source_trees: Vec<PathBuf>,
         extras: &'a ExtrasSpecification,
         hasher: &'a HashStrategy,
-        context: &'a Context,
-        client: &'a RegistryClient,
         index: &'a InMemoryIndex,
+        database: DistributionDatabase<'a, Context>,
     ) -> Self {
         Self {
             source_trees,
             extras,
             hasher,
             index,
-            database: DistributionDatabase::new(client, context),
+            database,
         }
     }
@@ -65,9 +64,11 @@ impl<'a, Context: BuildContext> SourceTreeResolver<'a, Context> {
     /// Resolve the requirements from the provided source trees.
     pub async fn resolve(self) -> Result<Vec<Requirement>> {
-        let requirements: Vec<_> = futures::stream::iter(self.source_trees.iter())
+        let requirements: Vec<_> = self
+            .source_trees
+            .iter()
             .map(|source_tree| async { self.resolve_source_tree(source_tree).await })
-            .buffered(50)
+            .collect::<FuturesOrdered<_>>()
             .try_collect()
             .await?;
         Ok(requirements