Support --find-links-style "flat" indexes in [[tool.uv.index]] (#12407)

## Summary

This PR extends `[[tool.uv.index]]` to support `--find-links`-style
"flat" indexes, so that users can point to such indexes without using
`--find-links` _and_ get access to the full functionality of
`[[tool.uv.index]]` (e.g., they can now pin packages to
`--find-links`-style indexes).

Note that, at present, `--find-links` indexes actually have some quirky
behavior, in that we combine them into a single entity and then merge
the discovered distributions into each Simple API-style index. The
motivation here, IIRC, was to match pip's behavior quite closely. I'm
interested in _removing_ that behavior, but it'd be breaking (and may
also be inconvenient for some use cases). So, the behavior for indexes
passed in via `--find-links` remains completely unchanged. However,
`[[tool.uv.index]]` entries with `format = "flat"` are now treated
identically to those defined with `format = "simple"` (the default), in
that we stop searching after the first matching index, etc.
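
To make the first-match behavior concrete, here is a minimal sketch under simplifying assumptions. `Format`, `Index`, and `first_match` are hypothetical stand-ins rather than uv's actual types, and the package names and versions are invented for illustration. The point is only that a flat index takes part in the priority-ordered search on the same footing as a Simple API index:

```rust
/// Hypothetical, simplified model of an index; not uv's actual types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Format {
    Simple,
    Flat,
}

#[derive(Debug)]
struct Index {
    name: &'static str,
    format: Format,
    /// (package, version) pairs served by the index (made-up data).
    files: Vec<(&'static str, &'static str)>,
}

/// Walk the indexes in priority order and stop at the first one that serves
/// `package`, regardless of whether it is a Simple API index or a flat index.
fn first_match<'a>(
    indexes: &'a [Index],
    package: &str,
) -> Option<(&'a Index, Vec<&'static str>)> {
    indexes.iter().find_map(|index| {
        let versions: Vec<&'static str> = index
            .files
            .iter()
            .filter(|(name, _)| *name == package)
            .map(|(_, version)| *version)
            .collect();
        // An index with no files for the package does not count as a match.
        if versions.is_empty() {
            None
        } else {
            Some((index, versions))
        }
    })
}
```

With a flat index listed ahead of a Simple index, a package present in both resolves only against the flat index; the later index is never consulted for it.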

Closes https://github.com/astral-sh/uv/issues/11634.
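
The diff also caches each flat index after the first fetch, grouped by package name, since a single flat index usually lists files for many packages. The following is a rough synchronous sketch of that idea using only the standard library. `Entry`, `FlatCache`, and `entries_for` are hypothetical names, and the actual change uses an async `tokio` mutex and `FxHashMap` instead:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a distribution file listed by a flat index.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    package: String,
    filename: String,
}

/// Cache keyed by index URL, then by package name.
#[derive(Default, Clone)]
struct FlatCache {
    inner: Arc<Mutex<HashMap<String, HashMap<String, Vec<Entry>>>>>,
}

impl FlatCache {
    /// Return the entries for `package` at `index_url`. `fetch` runs only if
    /// the index has not been cached yet.
    fn entries_for(
        &self,
        index_url: &str,
        package: &str,
        fetch: impl FnOnce() -> Vec<Entry>,
    ) -> Vec<Entry> {
        // The lock is held across the fetch, so concurrent callers cannot
        // fetch the same index twice.
        let mut cache = self.inner.lock().unwrap();
        let by_package = cache.entry(index_url.to_string()).or_insert_with(|| {
            // Group the whole index by package name before storing it.
            let mut grouped: HashMap<String, Vec<Entry>> = HashMap::new();
            for entry in fetch() {
                grouped.entry(entry.package.clone()).or_default().push(entry);
            }
            grouped
        });
        by_package.get(package).cloned().unwrap_or_default()
    }
}
```

Looking up a second package on the same index is then served from the cached map without another fetch, and a package the index does not list yields an empty list.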
Charlie Marsh 2025-03-25 21:14:44 -04:00 committed by GitHub
parent f2a2d982b8
commit bd9c365b92
19 changed files with 826 additions and 122 deletions


@@ -46,6 +46,7 @@ reqwest-middleware = { workspace = true }
 reqwest-retry = { workspace = true }
 rkyv = { workspace = true }
 rmp-serde = { workspace = true }
+rustc-hash = { workspace = true }
 serde = { workspace = true }
 serde_json = { workspace = true }
 sys-info = { workspace = true }


@@ -8,8 +8,8 @@ use url::Url;
 use uv_distribution_filename::{WheelFilename, WheelFilenameError};
 use uv_normalize::PackageName;

-use crate::html;
 use crate::middleware::OfflineError;
+use crate::{html, FlatIndexError};

 #[derive(Debug, thiserror::Error)]
 #[error(transparent)]
@@ -155,6 +155,9 @@ pub enum ErrorKind {
     #[error(transparent)]
     JoinRelativeUrl(#[from] uv_pypi_types::JoinRelativeError),

+    #[error(transparent)]
+    Flat(#[from] FlatIndexError),
+
     #[error("Expected a file URL, but received: {0}")]
     NonFileUrl(Url),


@@ -4,11 +4,11 @@ pub use base_client::{
 };
 pub use cached_client::{CacheControl, CachedClient, CachedClientError, DataWithCachePolicy};
 pub use error::{Error, ErrorKind, WrappedReqwestError};
-pub use flat_index::{FlatIndexClient, FlatIndexEntries, FlatIndexError};
+pub use flat_index::{FlatIndexClient, FlatIndexEntries, FlatIndexEntry, FlatIndexError};
 pub use linehaul::LineHaul;
 pub use registry_client::{
-    Connectivity, RegistryClient, RegistryClientBuilder, SimpleMetadata, SimpleMetadatum,
-    VersionFiles,
+    Connectivity, MetadataFormat, RegistryClient, RegistryClientBuilder, SimpleMetadata,
+    SimpleMetadatum, VersionFiles,
 };
 pub use rkyvutil::{Deserializer, OwnedArchive, Serializer, Validator};


@@ -2,6 +2,7 @@ use std::collections::BTreeMap;
 use std::fmt::Debug;
 use std::path::PathBuf;
 use std::str::FromStr;
+use std::sync::Arc;
 use std::time::Duration;

 use async_http_range_reader::AsyncHttpRangeReader;
@@ -10,7 +11,8 @@ use http::HeaderMap;
 use itertools::Either;
 use reqwest::{Proxy, Response, StatusCode};
 use reqwest_middleware::ClientWithMiddleware;
-use tokio::sync::Semaphore;
+use rustc_hash::FxHashMap;
+use tokio::sync::{Mutex, Semaphore};
 use tracing::{info_span, instrument, trace, warn, Instrument};
 use url::Url;

@@ -20,7 +22,8 @@ use uv_configuration::KeyringProviderType;
 use uv_configuration::{IndexStrategy, TrustedHost};
 use uv_distribution_filename::{DistFilename, SourceDistFilename, WheelFilename};
 use uv_distribution_types::{
-    BuiltDist, File, FileLocation, IndexCapabilities, IndexMetadataRef, IndexUrl, IndexUrls, Name,
+    BuiltDist, File, FileLocation, IndexCapabilities, IndexFormat, IndexMetadataRef, IndexUrl,
+    IndexUrls, Name,
 };
 use uv_metadata::{read_metadata_async_seek, read_metadata_async_stream};
 use uv_normalize::PackageName;
@@ -33,10 +36,14 @@ use uv_torch::TorchStrategy;

 use crate::base_client::{BaseClientBuilder, ExtraMiddleware};
 use crate::cached_client::CacheControl;
+use crate::flat_index::FlatIndexEntry;
 use crate::html::SimpleHtml;
 use crate::remote_metadata::wheel_metadata_from_remote_zip;
 use crate::rkyvutil::OwnedArchive;
-use crate::{BaseClient, CachedClient, CachedClientError, Error, ErrorKind};
+use crate::{
+    BaseClient, CachedClient, CachedClientError, Error, ErrorKind, FlatIndexClient,
+    FlatIndexEntries,
+};

 /// A builder for an [`RegistryClient`].
 #[derive(Debug, Clone)]
@@ -169,6 +176,7 @@ impl<'a> RegistryClientBuilder<'a> {
             connectivity,
             client,
             timeout,
+            flat_indexes: Arc::default(),
         }
     }

@@ -191,6 +199,7 @@ impl<'a> RegistryClientBuilder<'a> {
             connectivity,
             client,
             timeout,
+            flat_indexes: Arc::default(),
         }
     }
 }
@@ -226,6 +235,17 @@ pub struct RegistryClient {
     connectivity: Connectivity,
     /// Configured client timeout, in seconds.
     timeout: Duration,
+    /// The flat index entries for each `--find-links`-style index URL.
+    flat_indexes: Arc<Mutex<FlatIndexCache>>,
 }

+/// The format of the package metadata returned by querying an index.
+#[derive(Debug)]
+pub enum MetadataFormat {
+    /// The metadata adheres to the Simple Repository API format.
+    Simple(OwnedArchive<SimpleMetadata>),
+    /// The metadata consists of a list of distributions from a "flat" index.
+    Flat(Vec<FlatIndexEntry>),
+}
+
 impl RegistryClient {
@@ -280,19 +300,21 @@ impl RegistryClient {
             .unwrap_or(self.index_strategy)
     }

-    /// Fetch a package from the `PyPI` simple API.
+    /// Fetch package metadata from an index.
     ///
-    /// "simple" here refers to [PEP 503 Simple Repository API](https://peps.python.org/pep-0503/)
+    /// Supports both the "Simple" API and `--find-links`-style flat indexes.
+    ///
+    /// "Simple" here refers to [PEP 503 Simple Repository API](https://peps.python.org/pep-0503/)
     /// and [PEP 691 JSON-based Simple API for Python Package Indexes](https://peps.python.org/pep-0691/),
-    /// which the pypi json api approximately implements.
-    #[instrument("simple_api", skip_all, fields(package = % package_name))]
-    pub async fn simple<'index>(
+    /// which the PyPI JSON API implements.
+    #[instrument(skip_all, fields(package = % package_name))]
+    pub async fn package_metadata<'index>(
         &'index self,
         package_name: &PackageName,
         index: Option<IndexMetadataRef<'index>>,
         capabilities: &IndexCapabilities,
         download_concurrency: &Semaphore,
-    ) -> Result<Vec<(IndexMetadataRef<'index>, OwnedArchive<SimpleMetadata>)>, Error> {
+    ) -> Result<Vec<(&'index IndexUrl, MetadataFormat)>, Error> {
         // If `--no-index` is specified, avoid fetching regardless of whether the index is implicit,
         // explicit, etc.
         if self.index_urls.no_index() {
@@ -312,12 +334,23 @@ impl RegistryClient {
             IndexStrategy::FirstIndex => {
                 for index in indexes {
                     let _permit = download_concurrency.acquire().await;
-                    if let Some(metadata) = self
-                        .simple_single_index(package_name, index.url(), capabilities)
-                        .await?
-                    {
-                        results.push((index, metadata));
-                        break;
+                    match index.format {
+                        IndexFormat::Simple => {
+                            if let Some(metadata) = self
+                                .simple_single_index(package_name, index.url, capabilities)
+                                .await?
+                            {
+                                results.push((index.url, MetadataFormat::Simple(metadata)));
+                                break;
+                            }
+                        }
+                        IndexFormat::Flat => {
+                            let entries = self.flat_single_index(package_name, index.url).await?;
+                            if !entries.is_empty() {
+                                results.push((index.url, MetadataFormat::Flat(entries)));
+                                break;
+                            }
+                        }
                     }
                 }
             }
@@ -327,10 +360,19 @@ impl RegistryClient {
                 results = futures::stream::iter(indexes)
                     .map(|index| async move {
                         let _permit = download_concurrency.acquire().await;
-                        let metadata = self
-                            .simple_single_index(package_name, index.url(), capabilities)
-                            .await?;
-                        Ok((index, metadata))
+                        match index.format {
+                            IndexFormat::Simple => {
+                                let metadata = self
+                                    .simple_single_index(package_name, index.url, capabilities)
+                                    .await?;
+                                Ok((index.url, metadata.map(MetadataFormat::Simple)))
+                            }
+                            IndexFormat::Flat => {
+                                let entries =
+                                    self.flat_single_index(package_name, index.url).await?;
+                                Ok((index.url, Some(MetadataFormat::Flat(entries))))
+                            }
+                        }
                     })
                     .buffered(8)
                     .filter_map(|result: Result<_, Error>| async move {
@@ -357,6 +399,46 @@ impl RegistryClient {
         Ok(results)
     }

+    /// Fetch the [`FlatIndexEntry`] entries for a given package from a single `--find-links` index.
+    async fn flat_single_index(
+        &self,
+        package_name: &PackageName,
+        index: &IndexUrl,
+    ) -> Result<Vec<FlatIndexEntry>, Error> {
+        // Store the flat index entries in a cache, to avoid redundant fetches. A flat index will
+        // typically contain entries for multiple packages; as such, it's more efficient to cache
+        // the entire index rather than re-fetching it for each package.
+        let mut cache = self.flat_indexes.lock().await;
+        if let Some(entries) = cache.get(index) {
+            return Ok(entries.get(package_name).cloned().unwrap_or_default());
+        }
+
+        let client = FlatIndexClient::new(self.cached_client(), self.connectivity, &self.cache);
+
+        // Fetch the entries for the index.
+        let FlatIndexEntries { entries, .. } =
+            client.fetch_index(index).await.map_err(ErrorKind::Flat)?;
+
+        // Index by package name.
+        let mut entries_by_package: FxHashMap<PackageName, Vec<FlatIndexEntry>> =
+            FxHashMap::default();
+        for entry in entries {
+            entries_by_package
+                .entry(entry.filename.name().clone())
+                .or_default()
+                .push(entry);
+        }
+        let package_entries = entries_by_package
+            .get(package_name)
+            .cloned()
+            .unwrap_or_default();
+
+        // Write to the cache.
+        cache.insert(index.clone(), entries_by_package);
+
+        Ok(package_entries)
+    }
+
     /// Fetch the [`SimpleMetadata`] from a single index for a given package.
     ///
     /// The index can either be a PEP 503-compatible remote repository, or a local directory laid
@@ -883,6 +965,27 @@ impl RegistryClient {
     }
 }

+/// A map from [`IndexUrl`] to [`FlatIndexEntry`] entries found at the given URL, indexed by
+/// [`PackageName`].
+#[derive(Default, Debug, Clone)]
+struct FlatIndexCache(FxHashMap<IndexUrl, FxHashMap<PackageName, Vec<FlatIndexEntry>>>);
+
+impl FlatIndexCache {
+    /// Get the entries for a given index URL.
+    fn get(&self, index: &IndexUrl) -> Option<&FxHashMap<PackageName, Vec<FlatIndexEntry>>> {
+        self.0.get(index)
+    }
+
+    /// Insert the entries for a given index URL.
+    fn insert(
+        &mut self,
+        index: IndexUrl,
+        entries: FxHashMap<PackageName, Vec<FlatIndexEntry>>,
+    ) -> Option<FxHashMap<PackageName, Vec<FlatIndexEntry>>> {
+        self.0.insert(index, entries)
+    }
+}
+
 #[derive(Default, Debug, rkyv::Archive, rkyv::Deserialize, rkyv::Serialize)]
 #[rkyv(derive(Debug))]
 pub struct VersionFiles {