Add hash-checking support to install and sync (#2945)

## Summary

This PR adds support for hash-checking mode in `pip install` and `pip
sync`. It's a large change, both in terms of the size of the diff and
the behavior it modifies, but it's also one that's hard to merge in
pieces (at least, with any test coverage), since it needs to work
end-to-end to be useful and testable.

Here are some of the most important highlights:

- We store hashes in the cache. Where we previously stored bare pointers to
unzipped wheels in the `archives` directory, every pointer now also
carries the wheel's set of known hashes (see the first sketch after
this list).
- By default, we don't compute any hashes. If the user runs with
`--require-hashes` and the cache doesn't contain those hashes, we
invalidate the cache, redownload the wheel, and compute the hashes as we
go (the hashing-reader sketch below shows the mechanism). For users that
don't run with `--require-hashes`, there will be no change in
performance. For users that _do_, the only change arises if they don't
run with `--generate-hashes`: they may see some repeated work between
resolution and installation when using `pip compile` then `pip sync`.
- Many of the distribution types now include a `hashes` field, like
`CachedDist` and `LocalWheel`.
- Our behavior is similar to pip's, in that we enforce hashes when pulling
any remote distributions and when pulling from our own cache. Like pip,
though, we _don't_ enforce hashes if a distribution is _already_
installed.
- Hash validity is enforced in a few different places:
1. During resolution, we enforce hash validity based on the hashes
reported by the registry. If we need to access a source distribution,
though, we then enforce hash validity at that point too, prior to
running any untrusted code. (This is enforced in the distribution
database.)
2. In the install plan, we _only_ add cached distributions that have
matching hashes. If a cached distribution is missing any hashes, or its
hashes don't match, we don't return it from the install plan.
3. In the downloader, we _only_ return distributions with matching
hashes.
4. The final set of "things we install" is the union of (1) the wheels
from the cache and (2) the downloaded wheels, so we never install any
distribution with mismatched hashes.
- Like pip, if `--require-hashes` is provided, we require that _all_
distributions are pinned with either `==` or a direct URL. We also
require that _all_ distributions have hashes (see the workflow example
under the Test Plan).
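
To make the cache-pointer and enforcement changes concrete, here's a minimal sketch of the revision metadata and the hash checks. The `Revision`, `HashDigest`, `satisfies`, and `has_digests` names mirror the diff below, but the field layout and the any-match semantics (following pip's `--hash` behavior, where matching any one listed hash suffices) are illustrative assumptions rather than uv's exact implementation:

```rust
/// Illustrative sketch only: the real types live in `pypi_types` and the
/// distribution database; this field layout is assumed, not copied from the PR.
#[derive(Clone, Debug, PartialEq, Eq)]
struct HashDigest {
    algorithm: String, // e.g., "sha256"
    digest: String,    // lowercase hex
}

/// A cache pointer to an unzipped archive, now carrying its known hashes.
struct Revision {
    id: String,
    hashes: Vec<HashDigest>,
}

impl Revision {
    /// A cached archive satisfies the request if no hashes were demanded, or
    /// if it matches at least one expected digest (pip-style any-match).
    fn satisfies(&self, expected: &[HashDigest]) -> bool {
        expected.is_empty() || expected.iter().any(|hash| self.hashes.contains(hash))
    }

    /// The cache entry is only usable as-is if it already carries a digest for
    /// every requested algorithm; otherwise (e.g., it was cached before
    /// `--require-hashes` entered the picture), it must be refreshed.
    fn has_digests(&self, expected: &[HashDigest]) -> bool {
        expected
            .iter()
            .all(|want| self.hashes.iter().any(|have| have.algorithm == want.algorithm))
    }
}

fn main() {
    let sha256 = |digest: &str| HashDigest {
        algorithm: "sha256".to_string(),
        digest: digest.to_string(),
    };
    let revision = Revision {
        id: "0".to_string(),
        hashes: vec![sha256("cafe")],
    };
    assert!(revision.satisfies(&[sha256("cafe"), sha256("beef")]));
    assert!(!revision.satisfies(&[sha256("beef")]));
}
```

A cached distribution whose revision fails `satisfies` is excluded from the install plan entirely, which is what forces the fresh, hash-verified download.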

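The "compute the hashes as we go" behavior is implemented by wrapping the download stream in a hashing reader (`uv_extract::hash::HashReader` in the diff), so digests are produced during extraction rather than in a second pass over the archive; if extraction stops before the stream is exhausted, the reader is drained so the digest still covers every byte. Here's a simplified synchronous sketch of that pattern using the `sha2` crate (the real implementation is async and can run several hash algorithms at once):

```rust
use sha2::{Digest, Sha256};
use std::io::{self, Read};

/// Every byte read through this wrapper is also fed to the hasher, so the
/// digest is computed while the archive streams through the extractor.
struct HashReader<R: Read> {
    inner: R,
    hasher: Sha256,
}

impl<R: Read> Read for HashReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        self.hasher.update(&buf[..n]);
        Ok(n)
    }
}

impl<R: Read> HashReader<R> {
    /// Drain whatever the extractor didn't consume, so the digest covers the
    /// full archive, then finalize.
    fn finish(mut self) -> io::Result<[u8; 32]> {
        io::copy(&mut self, &mut io::sink())?;
        Ok(self.hasher.finalize().into())
    }
}

fn main() -> io::Result<()> {
    let payload = b"example archive bytes";
    let mut reader = HashReader {
        inner: &payload[..],
        hasher: Sha256::new(),
    };

    // Stand-in for extraction: consume only part of the stream...
    let mut head = [0u8; 7];
    reader.read_exact(&mut head)?;

    // ...then exhaust the remainder so the hash covers everything.
    let digest = reader.finish()?;
    println!(
        "sha256 = {}",
        digest.iter().map(|b| format!("{b:02x}")).collect::<String>()
    );
    Ok(())
}
```
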
There are a few notable TODOs:

- We don't support hash-checking mode for unnamed requirements. These
should be _somewhat_ rare, though, since `pip compile` never outputs
unnamed requirements. I can fix this; it just requires some additional
work.
- We don't automatically enable hash-checking when a hash exists in
the requirements file; the user must pass `--require-hashes` explicitly.

Closes #474.

## Test Plan

I'd like to add some tests for registries that report incorrect hashes,
but otherwise: `cargo test`.
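
For reference, the end-to-end workflow this enables looks roughly like the following (a sketch assuming uv's `pip`-compatible CLI; `example-package` and the digest are placeholders):

```console
$ uv pip compile requirements.in --generate-hashes -o requirements.txt
$ cat requirements.txt
example-package==1.0.0 \
    --hash=sha256:<digest>
$ uv pip sync requirements.txt --require-hashes
```

Since `--require-hashes` insists that every requirement is pinned and hashed, generating hashes at compile time avoids the repeated-work path described above.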

@ -16,12 +16,12 @@ use zip::ZipArchive;
use distribution_filename::WheelFilename;
use distribution_types::{
BuildableSource, DirectArchiveUrl, Dist, FileLocation, GitSourceUrl, LocalEditable,
BuildableSource, DirectArchiveUrl, Dist, FileLocation, GitSourceUrl, Hashed, LocalEditable,
PathSourceDist, PathSourceUrl, RemoteSource, SourceDist, SourceUrl,
};
use install_wheel_rs::metadata::read_archive_metadata;
use platform_tags::Tags;
use pypi_types::Metadata23;
use pypi_types::{HashDigest, Metadata23};
use uv_cache::{
ArchiveTimestamp, CacheBucket, CacheEntry, CacheShard, CachedByTimestamp, Freshness, WheelCache,
};
@ -29,6 +29,7 @@ use uv_client::{
CacheControl, CachedClientError, Connectivity, DataWithCachePolicy, RegistryClient,
};
use uv_configuration::{BuildKind, NoBuild};
use uv_extract::hash::Hasher;
use uv_fs::write_atomic;
use uv_types::{BuildContext, SourceBuildTrait};
@ -49,9 +50,7 @@ pub struct SourceDistributionBuilder<'a, T: BuildContext> {
}
/// The name of the file that contains the revision ID, encoded via `MsgPack`.
///
/// TODO(charlie): Update the filename whenever we bump the cache version.
pub(crate) const REVISION: &str = "manifest.msgpack";
pub(crate) const REVISION: &str = "revision.msgpack";
/// The name of the file that contains the cached distribution metadata, encoded via `MsgPack`.
pub(crate) const METADATA: &str = "metadata.msgpack";
@ -76,10 +75,11 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
}
/// Download and build a [`SourceDist`].
pub async fn download_and_build(
pub(super) async fn download_and_build(
&self,
source: &BuildableSource<'_>,
tags: &Tags,
hashes: &[HashDigest],
) -> Result<BuiltWheelMetadata, Error> {
let built_wheel_metadata = match &source {
BuildableSource::Dist(SourceDist::Registry(dist)) => {
@ -100,6 +100,7 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
path: Cow::Borrowed(path),
},
tags,
hashes,
)
.boxed()
.await;
@ -115,9 +116,17 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
.join(dist.filename.version.to_string()),
);
self.url(source, &dist.file.filename, &url, &cache_shard, None, tags)
.boxed()
.await?
self.url(
source,
&dist.file.filename,
&url,
&cache_shard,
None,
tags,
hashes,
)
.boxed()
.await?
}
BuildableSource::Dist(SourceDist::DirectUrl(dist)) => {
let filename = dist.filename().expect("Distribution must have a filename");
@ -136,22 +145,23 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
&cache_shard,
subdirectory.as_deref(),
tags,
hashes,
)
.boxed()
.await?
}
BuildableSource::Dist(SourceDist::Git(dist)) => {
self.git(source, &GitSourceUrl::from(dist), tags)
self.git(source, &GitSourceUrl::from(dist), tags, hashes)
.boxed()
.await?
}
BuildableSource::Dist(SourceDist::Path(dist)) => {
if dist.path.is_dir() {
self.source_tree(source, &PathSourceUrl::from(dist), tags)
self.source_tree(source, &PathSourceUrl::from(dist), tags, hashes)
.boxed()
.await?
} else {
self.archive(source, &PathSourceUrl::from(dist), tags)
self.archive(source, &PathSourceUrl::from(dist), tags, hashes)
.boxed()
.await?
}
@ -176,18 +186,21 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
&cache_shard,
subdirectory.as_deref(),
tags,
hashes,
)
.boxed()
.await?
}
BuildableSource::Url(SourceUrl::Git(resource)) => {
self.git(source, resource, tags).boxed().await?
self.git(source, resource, tags, hashes).boxed().await?
}
BuildableSource::Url(SourceUrl::Path(resource)) => {
if resource.path.is_dir() {
self.source_tree(source, resource, tags).boxed().await?
self.source_tree(source, resource, tags, hashes)
.boxed()
.await?
} else {
self.archive(source, resource, tags).boxed().await?
self.archive(source, resource, tags, hashes).boxed().await?
}
}
};
@ -198,9 +211,10 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
/// Download a [`SourceDist`] and determine its metadata. This typically involves building the
/// source distribution into a wheel; however, some build backends support determining the
/// metadata without building the source distribution.
pub async fn download_and_build_metadata(
pub(super) async fn download_and_build_metadata(
&self,
source: &BuildableSource<'_>,
hashes: &[HashDigest],
) -> Result<Metadata23, Error> {
let metadata = match &source {
BuildableSource::Dist(SourceDist::Registry(dist)) => {
@ -220,6 +234,7 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
url: &url,
path: Cow::Borrowed(path),
},
hashes,
)
.boxed()
.await;
@ -234,9 +249,16 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
.join(dist.filename.version.to_string()),
);
self.url_metadata(source, &dist.file.filename, &url, &cache_shard, None)
.boxed()
.await?
self.url_metadata(
source,
&dist.file.filename,
&url,
&cache_shard,
None,
hashes,
)
.boxed()
.await?
}
BuildableSource::Dist(SourceDist::DirectUrl(dist)) => {
let filename = dist.filename().expect("Distribution must have a filename");
@ -254,22 +276,23 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
&url,
&cache_shard,
subdirectory.as_deref(),
hashes,
)
.boxed()
.await?
}
BuildableSource::Dist(SourceDist::Git(dist)) => {
self.git_metadata(source, &GitSourceUrl::from(dist))
self.git_metadata(source, &GitSourceUrl::from(dist), hashes)
.boxed()
.await?
}
BuildableSource::Dist(SourceDist::Path(dist)) => {
if dist.path.is_dir() {
self.source_tree_metadata(source, &PathSourceUrl::from(dist))
self.source_tree_metadata(source, &PathSourceUrl::from(dist), hashes)
.boxed()
.await?
} else {
self.archive_metadata(source, &PathSourceUrl::from(dist))
self.archive_metadata(source, &PathSourceUrl::from(dist), hashes)
.boxed()
.await?
}
@ -293,18 +316,23 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
&url,
&cache_shard,
subdirectory.as_deref(),
hashes,
)
.boxed()
.await?
}
BuildableSource::Url(SourceUrl::Git(resource)) => {
self.git_metadata(source, resource).boxed().await?
self.git_metadata(source, resource, hashes).boxed().await?
}
BuildableSource::Url(SourceUrl::Path(resource)) => {
if resource.path.is_dir() {
self.source_tree_metadata(source, resource).boxed().await?
self.source_tree_metadata(source, resource, hashes)
.boxed()
.await?
} else {
self.archive_metadata(source, resource).boxed().await?
self.archive_metadata(source, resource, hashes)
.boxed()
.await?
}
}
};
@ -322,19 +350,29 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
cache_shard: &CacheShard,
subdirectory: Option<&'data Path>,
tags: &Tags,
hashes: &[HashDigest],
) -> Result<BuiltWheelMetadata, Error> {
// Fetch the revision for the source distribution.
let revision = self
.url_revision(source, filename, url, cache_shard)
.url_revision(source, filename, url, cache_shard, hashes)
.await?;
// Before running the build, check that the hashes match.
if !revision.satisfies(hashes) {
return Err(Error::hash_mismatch(
source.to_string(),
hashes,
revision.hashes(),
));
}
// Scope all operations to the revision. Within the revision, there's no need to check for
// freshness, since entries have to be fresher than the revision itself.
let cache_shard = cache_shard.shard(revision.id());
// If the cache contains a compatible wheel, return it.
if let Some(built_wheel) = BuiltWheelMetadata::find_in_cache(tags, &cache_shard) {
return Ok(built_wheel);
return Ok(built_wheel.with_hashes(revision.into_hashes()));
}
let task = self
@ -364,6 +402,7 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
path: cache_shard.join(&disk_filename),
target: cache_shard.join(wheel_filename.stem()),
filename: wheel_filename,
hashes: revision.into_hashes(),
})
}
@ -379,12 +418,22 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
url: &'data Url,
cache_shard: &CacheShard,
subdirectory: Option<&'data Path>,
hashes: &[HashDigest],
) -> Result<Metadata23, Error> {
// Fetch the revision for the source distribution.
let revision = self
.url_revision(source, filename, url, cache_shard)
.url_revision(source, filename, url, cache_shard, hashes)
.await?;
// Before running the build, check that the hashes match.
if !revision.satisfies(hashes) {
return Err(Error::hash_mismatch(
source.to_string(),
hashes,
revision.hashes(),
));
}
// Scope all operations to the revision. Within the revision, there's no need to check for
// freshness, since entries have to be fresher than the revision itself.
let cache_shard = cache_shard.shard(revision.id());
@ -449,6 +498,7 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
filename: &str,
url: &Url,
cache_shard: &CacheShard,
hashes: &[HashDigest],
) -> Result<Revision, Error> {
let cache_entry = cache_shard.entry(REVISION);
let cache_control = match self.client.connectivity() {
@ -469,24 +519,40 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
// Download the source distribution.
debug!("Downloading source distribution: {source}");
let source_dist_entry = cache_shard.shard(revision.id()).entry(filename);
self.persist_url(response, source, filename, &source_dist_entry)
let entry = cache_shard.shard(revision.id()).entry(filename);
let hashes = self
.download_archive(response, source, filename, entry.path(), hashes)
.await?;
Ok(revision)
Ok(revision.with_hashes(hashes))
}
.boxed()
.instrument(info_span!("download", source_dist = %source))
};
let req = self.request(url.clone())?;
self.client
let revision = self
.client
.cached_client()
.get_serde(req, &cache_entry, cache_control, download)
.await
.map_err(|err| match err {
CachedClientError::Callback(err) => err,
CachedClientError::Client(err) => Error::Client(err),
})
})?;
// If the archive is missing the required hashes, force a refresh.
if revision.has_digests(hashes) {
Ok(revision)
} else {
self.client
.cached_client()
.skip_cache(self.request(url.clone())?, &cache_entry, download)
.await
.map_err(|err| match err {
CachedClientError::Callback(err) => err,
CachedClientError::Client(err) => Error::Client(err),
})
}
}
/// Build a source distribution from a local archive (e.g., `.tar.gz` or `.zip`).
@ -495,6 +561,7 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
source: &BuildableSource<'_>,
resource: &PathSourceUrl<'_>,
tags: &Tags,
hashes: &[HashDigest],
) -> Result<BuiltWheelMetadata, Error> {
let cache_shard = self.build_context.cache().shard(
CacheBucket::BuiltWheels,
@ -503,9 +570,18 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
// Fetch the revision for the source distribution.
let revision = self
.archive_revision(source, resource, &cache_shard)
.archive_revision(source, resource, &cache_shard, hashes)
.await?;
// Before running the build, check that the hashes match.
if !revision.satisfies(hashes) {
return Err(Error::hash_mismatch(
source.to_string(),
hashes,
revision.hashes(),
));
}
// Scope all operations to the revision. Within the revision, there's no need to check for
// freshness, since entries have to be fresher than the revision itself.
let cache_shard = cache_shard.shard(revision.id());
@ -543,6 +619,7 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
path: cache_shard.join(&disk_filename),
target: cache_shard.join(filename.stem()),
filename,
hashes: revision.into_hashes(),
})
}
@ -554,6 +631,7 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
&self,
source: &BuildableSource<'_>,
resource: &PathSourceUrl<'_>,
hashes: &[HashDigest],
) -> Result<Metadata23, Error> {
let cache_shard = self.build_context.cache().shard(
CacheBucket::BuiltWheels,
@ -562,9 +640,18 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
// Fetch the revision for the source distribution.
let revision = self
.archive_revision(source, resource, &cache_shard)
.archive_revision(source, resource, &cache_shard, hashes)
.await?;
// Before running the build, check that the hashes match.
if !revision.satisfies(hashes) {
return Err(Error::hash_mismatch(
source.to_string(),
hashes,
revision.hashes(),
));
}
// Scope all operations to the revision. Within the revision, there's no need to check for
// freshness, since entries have to be fresher than the revision itself.
let cache_shard = cache_shard.shard(revision.id());
@ -627,6 +714,7 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
source: &BuildableSource<'_>,
resource: &PathSourceUrl<'_>,
cache_shard: &CacheShard,
hashes: &[HashDigest],
) -> Result<Revision, Error> {
// Determine the last-modified time of the source distribution.
let modified = ArchiveTimestamp::from_file(&resource.path).map_err(Error::CacheRead)?;
@ -637,7 +725,9 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
// If the revision already exists, return it. There's no need to check for freshness, since
// we use an exact timestamp.
if let Some(revision) = read_timestamped_revision(&revision_entry, modified)? {
return Ok(revision);
if revision.has_digests(hashes) {
return Ok(revision);
}
}
// Otherwise, we need to create a new revision.
@ -646,7 +736,10 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
// Unzip the archive to a temporary directory.
debug!("Unpacking source distribution: {source}");
let entry = cache_shard.shard(revision.id()).entry("source");
self.persist_archive(&resource.path, source, &entry).await?;
let hashes = self
.persist_archive(&resource.path, entry.path(), hashes)
.await?;
let revision = revision.with_hashes(hashes);
// Persist the revision.
write_atomic(
@ -668,7 +761,13 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
source: &BuildableSource<'_>,
resource: &PathSourceUrl<'_>,
tags: &Tags,
hashes: &[HashDigest],
) -> Result<BuiltWheelMetadata, Error> {
// Hashes aren't supported for source trees, so fail eagerly if any were requested.
if !hashes.is_empty() {
return Err(Error::HashesNotSupportedSourceTree(source.to_string()));
}
let cache_shard = self.build_context.cache().shard(
CacheBucket::BuiltWheels,
WheelCache::Path(resource.url).root(),
@ -714,6 +813,7 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
path: cache_shard.join(&disk_filename),
target: cache_shard.join(filename.stem()),
filename,
hashes: vec![],
})
}
@ -725,7 +825,13 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
&self,
source: &BuildableSource<'_>,
resource: &PathSourceUrl<'_>,
hashes: &[HashDigest],
) -> Result<Metadata23, Error> {
// Hashes aren't supported for source trees, so fail eagerly if any were requested.
if !hashes.is_empty() {
return Err(Error::HashesNotSupportedSourceTree(source.to_string()));
}
let cache_shard = self.build_context.cache().shard(
CacheBucket::BuiltWheels,
WheelCache::Path(resource.url).root(),
@ -742,16 +848,9 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
// If the cache contains compatible metadata, return it.
let metadata_entry = cache_shard.entry(METADATA);
if self
.build_context
.cache()
.freshness(&metadata_entry, source.name())
.is_ok_and(Freshness::is_fresh)
{
if let Some(metadata) = read_cached_metadata(&metadata_entry).await? {
debug!("Using cached metadata for: {source}");
return Ok(metadata);
}
if let Some(metadata) = read_cached_metadata(&metadata_entry).await? {
debug!("Using cached metadata for: {source}");
return Ok(metadata);
}
// If the backend supports `prepare_metadata_for_build_wheel`, use it.
@ -828,7 +927,13 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
source: &BuildableSource<'_>,
resource: &GitSourceUrl<'_>,
tags: &Tags,
hashes: &[HashDigest],
) -> Result<BuiltWheelMetadata, Error> {
// Hashes aren't supported for Git dependencies, so fail eagerly if any were requested.
if !hashes.is_empty() {
return Err(Error::HashesNotSupportedGit(source.to_string()));
}
// Resolve to a precise Git SHA.
let url = if let Some(url) = resolve_precise(
resource.url,
@ -882,6 +987,7 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
path: cache_shard.join(&disk_filename),
target: cache_shard.join(filename.stem()),
filename,
hashes: vec![],
})
}
@ -893,7 +999,13 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
&self,
source: &BuildableSource<'_>,
resource: &GitSourceUrl<'_>,
hashes: &[HashDigest],
) -> Result<Metadata23, Error> {
// Hashes aren't supported for Git dependencies, so fail eagerly if any were requested.
if !hashes.is_empty() {
return Err(Error::HashesNotSupportedGit(source.to_string()));
}
// Resolve to a precise Git SHA.
let url = if let Some(url) = resolve_precise(
resource.url,
@ -975,21 +1087,14 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
}
/// Download and unzip a source distribution into the cache from an HTTP response.
async fn persist_url(
async fn download_archive(
&self,
response: Response,
source: &BuildableSource<'_>,
filename: &str,
cache_entry: &CacheEntry,
) -> Result<(), Error> {
let cache_path = cache_entry.path();
if cache_path.is_dir() {
debug!("Distribution is already cached: {source}");
return Ok(());
}
// Download and unzip the source distribution into a temporary directory.
let span = info_span!("persist_url", filename = filename, source_dist = %source);
target: &Path,
hashes: &[HashDigest],
) -> Result<Vec<HashDigest>, Error> {
let temp_dir =
tempfile::tempdir_in(self.build_context.cache().bucket(CacheBucket::BuiltWheels))
.map_err(Error::CacheWrite)?;
@ -997,9 +1102,29 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
.bytes_stream()
.map_err(|err| std::io::Error::new(std::io::ErrorKind::Other, err))
.into_async_read();
uv_extract::stream::archive(reader.compat(), filename, temp_dir.path()).await?;
// Create a hasher for each hash algorithm.
let algorithms = {
let mut hash = hashes.iter().map(HashDigest::algorithm).collect::<Vec<_>>();
hash.sort();
hash.dedup();
hash
};
let mut hashers = algorithms.into_iter().map(Hasher::from).collect::<Vec<_>>();
let mut hasher = uv_extract::hash::HashReader::new(reader.compat(), &mut hashers);
// Download and unzip the source distribution into a temporary directory.
let span = info_span!("download_source_dist", filename = filename, source_dist = %source);
uv_extract::stream::archive(&mut hasher, filename, temp_dir.path()).await?;
drop(span);
// If necessary, exhaust the reader to compute the hash.
if !hashes.is_empty() {
hasher.finish().await.map_err(Error::HashExhaustion)?;
}
let hashes = hashers.into_iter().map(HashDigest::from).collect();
// Extract the top-level directory.
let extracted = match uv_extract::strip_component(temp_dir.path()) {
Ok(top_level) => top_level,
@ -1008,39 +1133,51 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
};
// Persist it to the cache.
fs_err::tokio::create_dir_all(cache_path.parent().expect("Cache entry to have parent"))
fs_err::tokio::create_dir_all(target.parent().expect("Cache entry to have parent"))
.await
.map_err(Error::CacheWrite)?;
fs_err::tokio::rename(extracted, &cache_path)
fs_err::tokio::rename(extracted, target)
.await
.map_err(Error::CacheWrite)?;
Ok(())
Ok(hashes)
}
/// Extract a local archive, and store it at the given [`CacheEntry`].
async fn persist_archive(
&self,
path: &Path,
source: &BuildableSource<'_>,
cache_entry: &CacheEntry,
) -> Result<(), Error> {
let cache_path = cache_entry.path();
if cache_path.is_dir() {
debug!("Distribution is already cached: {source}");
return Ok(());
}
target: &Path,
hashes: &[HashDigest],
) -> Result<Vec<HashDigest>, Error> {
debug!("Unpacking for build: {}", path.display());
// Unzip the archive into a temporary directory.
let temp_dir =
tempfile::tempdir_in(self.build_context.cache().bucket(CacheBucket::BuiltWheels))
.map_err(Error::CacheWrite)?;
let reader = fs_err::tokio::File::open(&path)
.await
.map_err(Error::CacheRead)?;
uv_extract::seek::archive(reader, path, &temp_dir.path()).await?;
// Create a hasher for each hash algorithm.
let algorithms = {
let mut hash = hashes.iter().map(HashDigest::algorithm).collect::<Vec<_>>();
hash.sort();
hash.dedup();
hash
};
let mut hashers = algorithms.into_iter().map(Hasher::from).collect::<Vec<_>>();
let mut hasher = uv_extract::hash::HashReader::new(reader, &mut hashers);
// Unzip the archive into a temporary directory.
uv_extract::stream::archive(&mut hasher, path, &temp_dir.path()).await?;
// If necessary, exhaust the reader to compute the hash.
if !hashes.is_empty() {
hasher.finish().await.map_err(Error::HashExhaustion)?;
}
let hashes = hashers.into_iter().map(HashDigest::from).collect();
// Extract the top-level directory from the archive.
let extracted = match uv_extract::strip_component(temp_dir.path()) {
@ -1050,14 +1187,14 @@ impl<'a, T: BuildContext> SourceDistributionBuilder<'a, T> {
};
// Persist it to the cache.
fs_err::tokio::create_dir_all(cache_path.parent().expect("Cache entry to have parent"))
fs_err::tokio::create_dir_all(target.parent().expect("Cache entry to have parent"))
.await
.map_err(Error::CacheWrite)?;
fs_err::tokio::rename(extracted, &cache_path)
fs_err::tokio::rename(extracted, &target)
.await
.map_err(Error::CacheWrite)?;
Ok(())
Ok(hashes)
}
/// Build a source distribution, storing the built wheel in the cache.