Include the canonical path in the interpreter query cache key (#14331)
Some checks are pending
CI / test windows trampoline | i686 (push) Blocked by required conditions
CI / test windows trampoline | x86_64 (push) Blocked by required conditions
CI / typos (push) Waiting to run
CI / mkdocs (push) Waiting to run
CI / build binary | linux libc (push) Blocked by required conditions
CI / build binary | linux musl (push) Blocked by required conditions
CI / build binary | macos aarch64 (push) Blocked by required conditions
CI / build binary | macos x86_64 (push) Blocked by required conditions
CI / build binary | windows x86_64 (push) Blocked by required conditions
CI / build binary | windows aarch64 (push) Blocked by required conditions
CI / cargo build (msrv) (push) Blocked by required conditions
CI / build binary | freebsd (push) Blocked by required conditions
CI / Determine changes (push) Waiting to run
CI / lint (push) Waiting to run
CI / cargo clippy | ubuntu (push) Blocked by required conditions
CI / cargo clippy | windows (push) Blocked by required conditions
CI / cargo dev generate-all (push) Blocked by required conditions
CI / cargo shear (push) Waiting to run
CI / cargo test | ubuntu (push) Blocked by required conditions
CI / cargo test | macos (push) Blocked by required conditions
CI / cargo test | windows (push) Blocked by required conditions
CI / check windows trampoline | aarch64 (push) Blocked by required conditions
CI / check windows trampoline | i686 (push) Blocked by required conditions
CI / check windows trampoline | x86_64 (push) Blocked by required conditions
CI / ecosystem test | pydantic/pydantic-core (push) Blocked by required conditions
CI / ecosystem test | prefecthq/prefect (push) Blocked by required conditions
CI / ecosystem test | pallets/flask (push) Blocked by required conditions
CI / smoke test | linux (push) Blocked by required conditions
CI / check system | alpine (push) Blocked by required conditions
CI / smoke test | macos (push) Blocked by required conditions
CI / smoke test | windows x86_64 (push) Blocked by required conditions
CI / smoke test | windows aarch64 (push) Blocked by required conditions
CI / integration test | conda on ubuntu (push) Blocked by required conditions
CI / integration test | deadsnakes python3.9 on ubuntu (push) Blocked by required conditions
CI / integration test | free-threaded on windows (push) Blocked by required conditions
CI / integration test | pypy on ubuntu (push) Blocked by required conditions
CI / integration test | pypy on windows (push) Blocked by required conditions
CI / integration test | graalpy on ubuntu (push) Blocked by required conditions
CI / integration test | graalpy on windows (push) Blocked by required conditions
CI / integration test | pyodide on ubuntu (push) Blocked by required conditions
CI / integration test | github actions (push) Blocked by required conditions
CI / integration test | free-threaded python on github actions (push) Blocked by required conditions
CI / integration test | determine publish changes (push) Blocked by required conditions
CI / integration test | registries (push) Blocked by required conditions
CI / integration test | uv publish (push) Blocked by required conditions
CI / integration test | uv_build (push) Blocked by required conditions
CI / check cache | ubuntu (push) Blocked by required conditions
CI / check cache | macos aarch64 (push) Blocked by required conditions
CI / check system | python on debian (push) Blocked by required conditions
CI / check system | python on fedora (push) Blocked by required conditions
CI / check system | python on ubuntu (push) Blocked by required conditions
CI / check system | python on rocky linux 8 (push) Blocked by required conditions
CI / check system | python on rocky linux 9 (push) Blocked by required conditions
CI / check system | graalpy on ubuntu (push) Blocked by required conditions
CI / check system | pypy on ubuntu (push) Blocked by required conditions
CI / check system | pyston (push) Blocked by required conditions
CI / check system | python on macos aarch64 (push) Blocked by required conditions
CI / check system | homebrew python on macos aarch64 (push) Blocked by required conditions
CI / check system | python on macos x86-64 (push) Blocked by required conditions
CI / check system | python3.10 on windows x86-64 (push) Blocked by required conditions
CI / check system | python3.10 on windows x86 (push) Blocked by required conditions
CI / check system | python3.13 on windows x86-64 (push) Blocked by required conditions
CI / check system | x86-64 python3.13 on windows aarch64 (push) Blocked by required conditions
CI / check system | windows registry (push) Blocked by required conditions
CI / check system | python3.12 via chocolatey (push) Blocked by required conditions
CI / check system | python3.9 via pyenv (push) Blocked by required conditions
CI / check system | python3.13 (push) Blocked by required conditions
CI / check system | conda3.11 on macos aarch64 (push) Blocked by required conditions
CI / check system | conda3.8 on macos aarch64 (push) Blocked by required conditions
CI / check system | conda3.11 on linux x86-64 (push) Blocked by required conditions
CI / check system | conda3.8 on linux x86-64 (push) Blocked by required conditions
CI / check system | conda3.11 on windows x86-64 (push) Blocked by required conditions
CI / check system | conda3.8 on windows x86-64 (push) Blocked by required conditions
CI / check system | amazonlinux (push) Blocked by required conditions
CI / check system | embedded python3.10 on windows x86-64 (push) Blocked by required conditions
CI / benchmarks | walltime aarch64 linux (push) Blocked by required conditions
CI / benchmarks | instrumented (push) Blocked by required conditions

This fixes an obscure cache collision in Python interpreter queries,
which we believe to be the root cause of CI flakes we've been seeing
where a project environment is invalidated and recreated.

This work follows from the logs in [this CI
run](4495059999)
which captured one of the flakes with tracing enabled. There, we can see
that the project environment is invalidated because the Python
interpreter in the environment has a different version than expected:

```
DEBUG Checking for Python environment at `.venv`
TRACE Cached interpreter info for Python 3.12.9, skipping probing: .venv/bin/python3
DEBUG The interpreter in the project environment has different version (3.12.9) than it was created with (3.9.21)
```

(this message is updated to reflect #14329)

The flow is roughly:

- We create an environment with 3.12.9
- We query the environment, and cache the interpreter version for
`.venv/bin/python`
- We create an environment for 3.9.12, replacing the existing one
- We query the environment, and read the cached information

The Python cache entries are keyed by the absolute path to the
interpreter, and rely on the modification time (ctime, nsec resolution)
of the canonicalized path to determine if the cache entry should be
invalidated. The key is a hex representation of a u64 sea hasher output
— which is very unlikely to collide.

After an audit of the Python query caching logic, we determined that the
most likely cause of a collision in cache entries is that the
modification times of underlying interpreters are identical. This seems
pretty feasible, especially if the file system does not support
nanosecond precision — though it appears that the GitHub runners do
support it.

The fix here is to include the canonicalized path in the cache key,
which ensures we're looking at the modification time of the _same_
underlying interpreter.

This will "invalidate" all existing interpreter cache entries but that's
not a big deal.

This should also have the effect of reducing cache churn for
interpreters in virtual environments. Now, when you change Python
versions, we won't invalidate the previous cache entry so if you change
_back_ to the old version we can re-use our cached information.

It's a bit speculative, since we don't have a deterministic reproduction
in CI, but this is the strongest candidate given the logs and should
increase correctness regardless.

Closes https://github.com/astral-sh/uv/issues/14160
Closes https://github.com/astral-sh/uv/issues/13744
Closes https://github.com/astral-sh/uv/issues/13745

Once it's confirmed the flakes are resolved, we should revert

- https://github.com/astral-sh/uv/pull/14275
- #13817
This commit is contained in:
Zanie Blue 2025-06-30 10:39:47 -05:00 committed by GitHub
parent 0372a5b05d
commit 1c7c174bc8
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

View file

@ -967,26 +967,10 @@ impl InterpreterInfo {
pub(crate) fn query_cached(executable: &Path, cache: &Cache) -> Result<Self, Error> {
let absolute = std::path::absolute(executable)?;
let cache_entry = cache.entry(
CacheBucket::Interpreter,
// Shard interpreter metadata by host architecture, operating system, and version, to
// invalidate the cache (e.g.) on OS upgrades.
cache_digest(&(
ARCH,
sys_info::os_type().unwrap_or_default(),
sys_info::os_release().unwrap_or_default(),
)),
// We use the absolute path for the cache entry to avoid cache collisions for relative
// paths. But we don't want to query the executable with symbolic links resolved because
// that can change reported values, e.g., `sys.executable`.
format!("{}.msgpack", cache_digest(&absolute)),
);
// We check the timestamp of the canonicalized executable to check if an underlying
// interpreter has been modified.
let modified = canonicalize_executable(&absolute)
.and_then(Timestamp::from_path)
.map_err(|err| {
// Provide a better error message if the link is broken or the file does not exist. Since
// `canonicalize_executable` does not resolve the file on Windows, we must re-use this logic
// for the subsequent metadata read as we may not have actually resolved the path.
let handle_io_error = |err: io::Error| -> Error {
if err.kind() == io::ErrorKind::NotFound {
// Check if it looks like a venv interpreter where the underlying Python
// installation was removed.
@ -1004,7 +988,32 @@ impl InterpreterInfo {
} else {
err.into()
}
})?;
};
let canonical = canonicalize_executable(&absolute).map_err(handle_io_error)?;
let cache_entry = cache.entry(
CacheBucket::Interpreter,
// Shard interpreter metadata by host architecture, operating system, and version, to
// invalidate the cache (e.g.) on OS upgrades.
cache_digest(&(
ARCH,
sys_info::os_type().unwrap_or_default(),
sys_info::os_release().unwrap_or_default(),
)),
// We use the absolute path for the cache entry to avoid cache collisions for relative
// paths. But we don't want to query the executable with symbolic links resolved because
// that can change reported values, e.g., `sys.executable`. We include the canonical
// path in the cache entry as well, otherwise we can have cache collisions if an
// absolute path refers to different interpreters with matching ctimes, e.g., if you
// have a `.venv/bin/python` pointing to both Python 3.12 and Python 3.13 that were
// modified at the same time.
format!("{}.msgpack", cache_digest(&(&absolute, &canonical))),
);
// We check the timestamp of the canonicalized executable to check if an underlying
// interpreter has been modified.
let modified = Timestamp::from_path(canonical).map_err(handle_io_error)?;
// Read from the cache.
if cache