Commit graph

14 commits

Author SHA1 Message Date
Ivan Smirnov
4d82e88863
Follow links when cache-key is a glob (#13438)
Some checks are pending
CI / check system | python on debian (push) Blocked by required conditions
CI / check system | python on fedora (push) Blocked by required conditions
CI / check system | python on ubuntu (push) Blocked by required conditions
CI / check system | python on rocky linux 8 (push) Blocked by required conditions
CI / check system | python on rocky linux 9 (push) Blocked by required conditions
CI / check system | pyston (push) Blocked by required conditions
CI / check system | python on macos aarch64 (push) Blocked by required conditions
CI / check system | homebrew python on macos aarch64 (push) Blocked by required conditions
CI / check system | conda3.11 on windows x86-64 (push) Blocked by required conditions
CI / check system | conda3.8 on windows x86-64 (push) Blocked by required conditions
CI / check system | amazonlinux (push) Blocked by required conditions
CI / check system | embedded python3.10 on windows x86-64 (push) Blocked by required conditions
CI / benchmarks | walltime aarch64 linux (push) Blocked by required conditions
CI / benchmarks | instrumented (push) Blocked by required conditions
CI / check system | graalpy on ubuntu (push) Blocked by required conditions
CI / check system | pypy on ubuntu (push) Blocked by required conditions
CI / check system | python on macos x86-64 (push) Blocked by required conditions
CI / check system | python3.10 on windows x86-64 (push) Blocked by required conditions
CI / check system | python3.10 on windows x86 (push) Blocked by required conditions
CI / check system | python3.13 on windows x86-64 (push) Blocked by required conditions
CI / check system | x86-64 python3.13 on windows aarch64 (push) Blocked by required conditions
CI / check system | aarch64 python3.13 on windows aarch64 (push) Blocked by required conditions
CI / check system | windows registry (push) Blocked by required conditions
CI / check system | python3.12 via chocolatey (push) Blocked by required conditions
CI / check system | python3.9 via pyenv (push) Blocked by required conditions
CI / check system | python3.13 (push) Blocked by required conditions
CI / check system | conda3.11 on macos aarch64 (push) Blocked by required conditions
CI / check system | conda3.8 on macos aarch64 (push) Blocked by required conditions
CI / check system | conda3.11 on linux x86-64 (push) Blocked by required conditions
CI / check system | conda3.8 on linux x86-64 (push) Blocked by required conditions
## Summary

There's some inconsistent behaviour in handling symlinks when
`cache-key` is a glob or a file path. This PR attempts to address that.

- When cache-key is a path,
[`Path::metadata()`](https://doc.rust-lang.org/std/path/struct.Path.html#method.metadata)
is used to check if it's a file or not. According to the docs:
> This function will traverse symbolic links to query information about
the destination file.

So, if the target file is a symlink, it will be resolved and the
metadata will be queried for the underlying file.

- When cache-key is a glob, `globwalk` is used, specifically allowing
for symlinks:

  ```rust
  .file_type(globwalk::FileType::FILE | globwalk::FileType::SYMLINK)
  ```

- However, without enabling link following, `DirEntry::metadata()` will
return an equivalent of `Path::symlink_metadata()` (and not
`Path::metadata()`), which will have a file type that looks like

   ```rust
   FileType {
       is_file: false,
       is_dir: false,
       is_symlink: true,
      ..
    }
    ```

- Then, there's a check for `metadata.is_file()` which fails and
complains that the target entry "is a directory when file was expected".

- TLDR: glob cache-keys don't work with symlinks.

## Solutions

Option 1 (current PR): follow symlinks.

Option 2 (also doable): don't follow symlinks, but resolve the resulting
target entry manually in case its file type is a symlink. However, this
would be a little weird and unobvious in that we resolve files but not
directories for some reason. Also, symlinking directories is pretty
useful if you want to symlink directories of local dependencies that are
not under the project's path.

## Test Plan

This has been tested manually:

```rust
fn main() {
    for follow_links in [false, true] {
        let walker = globwalk::GlobWalkerBuilder::from_patterns(".", &["a/*"])
            .file_type(globwalk::FileType::FILE | globwalk::FileType::SYMLINK)
            .follow_links(follow_links)
            .build()
            .unwrap();
        let entry = walker.into_iter().next().unwrap().unwrap();
        dbg!(&entry);
        dbg!(entry.file_type());
        dbg!(entry.path_is_symlink());
        dbg!(entry.path());
        let meta = entry.metadata().unwrap();
        dbg!(meta.is_file());
    }

    let path = std::path::PathBuf::from("./a/b");
    dbg!(path.metadata().unwrap().file_type());
    dbg!(path.symlink_metadata().unwrap().file_type());
}
```

Current behaviour (glob cache-key, don't follow links):
```
[src/main.rs:9:9] &entry = DirEntry("./a/b")
[src/main.rs:10:9] entry.file_type() = FileType {
    is_file: false,
    is_dir: false,
    is_symlink: true,
    ..
}
[src/main.rs:11:9] entry.path_is_symlink() = true
[src/main.rs:12:9] entry.path() = "./a/b"
[src/main.rs:14:9] meta.is_file() = false
```

Glob cache-key, follow links:
```
[src/main.rs:9:9] &entry = DirEntry("./a/b")
[src/main.rs:10:9] entry.file_type() = FileType {
    is_file: true,
    is_dir: false,
    is_symlink: false,
    ..
}
[src/main.rs:11:9] entry.path_is_symlink() = true
[src/main.rs:12:9] entry.path() = "./a/b"
[src/main.rs:14:9] meta.is_file() = true
```

Using `path.metadata()` for a non-glob cache key:
```
[src/main.rs:18:5] path.metadata().unwrap().file_type() = FileType {
    is_file: true,
    is_dir: false,
    is_symlink: false,
    ..
}
[src/main.rs:19:5] path.symlink_metadata().unwrap().file_type() = FileType {
    is_file: false,
    is_dir: false,
    is_symlink: true,
    ..
}
```
2025-07-14 11:35:34 -04:00
Ivan Smirnov
567468ce72
More efficient cache-key globbing + support parent paths in globs (#13469)
Some checks are pending
CI / check system | python on debian (push) Blocked by required conditions
CI / check system | python on fedora (push) Blocked by required conditions
CI / check system | python3.13 (push) Blocked by required conditions
CI / check system | python on ubuntu (push) Blocked by required conditions
CI / check system | python on rocky linux 8 (push) Blocked by required conditions
CI / check system | python on rocky linux 9 (push) Blocked by required conditions
CI / check system | graalpy on ubuntu (push) Blocked by required conditions
CI / check system | pypy on ubuntu (push) Blocked by required conditions
CI / check system | pyston (push) Blocked by required conditions
CI / check system | python on macos aarch64 (push) Blocked by required conditions
CI / check system | homebrew python on macos aarch64 (push) Blocked by required conditions
CI / check system | python on macos x86-64 (push) Blocked by required conditions
CI / check system | python3.10 on windows x86-64 (push) Blocked by required conditions
CI / check system | python3.10 on windows x86 (push) Blocked by required conditions
CI / check system | python3.13 on windows x86-64 (push) Blocked by required conditions
CI / check system | x86-64 python3.13 on windows aarch64 (push) Blocked by required conditions
CI / check system | aarch64 python3.13 on windows aarch64 (push) Blocked by required conditions
CI / check system | windows registry (push) Blocked by required conditions
CI / check system | python3.12 via chocolatey (push) Blocked by required conditions
CI / check system | python3.9 via pyenv (push) Blocked by required conditions
CI / check system | conda3.11 on macos aarch64 (push) Blocked by required conditions
CI / check system | conda3.8 on macos aarch64 (push) Blocked by required conditions
CI / check system | conda3.11 on linux x86-64 (push) Blocked by required conditions
CI / check system | conda3.8 on linux x86-64 (push) Blocked by required conditions
CI / check system | conda3.11 on windows x86-64 (push) Blocked by required conditions
CI / check system | conda3.8 on windows x86-64 (push) Blocked by required conditions
CI / check system | amazonlinux (push) Blocked by required conditions
CI / check system | embedded python3.10 on windows x86-64 (push) Blocked by required conditions
CI / benchmarks | walltime aarch64 linux (push) Blocked by required conditions
CI / benchmarks | instrumented (push) Blocked by required conditions
## Summary

(Related PR: #13438 - would be nice to have it merged as well since it
touches on the same globwalker code)

There's a few issues with `cache-key` globs, which this PR attempts to
address:

- As of the current state, parent or absolute paths are not allowed,
which is not obvious and is not documented. E.g., cache-key paths of the
form `{file = "../dep/**"}` will be essentially ignored.
- Absolute glob patterns also don't work (funnily enough, there's logic
in `globwalk` itself that attempts to address it in
[`globwalk::glob_builder()`](8973fa2bc5/src/lib.rs (L415)),
which serves as inspiration to some parts of this PR).
- The reason for parent paths being ignored is the way globwalker is
currently being triggered in `uv-cache-info`: the base directory is
being walked over completely and each entry is then being matched to one
of the provided match patterns.
- This may also end up being very inefficient if you have a huge root
folder with thousands of files: if your match patterns are `a/b/*.rs`
and `a/c/*.py` then instead of walking over the root directory, you can
just walk over `a/b` and `a/c` and match the relevant patterns there.
- Why supporting parent paths may be important to the point of being a
blocker: in large codebases with python projects depending on other
local non-python projects (e.g. rust crates), cache-keys can be very
useful to track dependency on the source code of the latter (e.g.
`cache-keys = [{ file = "../../crates/some-dep/**" }]`.
- TLDR: parent/absolute cache-key globs don't work, glob walk can be
slow.

## Solution

- In this PR, user-provided glob patterns are first clustered
(LCP-style) into pattern groups with longest common path prefix; each of
these groups can then be walked over separately.
- Pattern groups do not overlap, so we would never walk over the same
directory twice (unless there's symlinks pointing to same folders).
- Paths are not canonicalized nor virtually normalized (which is
impossible on Unix without FS access), so the method is symlink-safe
(i.e. we don't treat `a/b/..` as `a`) and should work fine with #13438.
- Because of LCP logic, the minimal amount of directory space will be
traversed to cover all patterns.
- Absolute glob patterns will now work.
- Parent-relative glob patterns will now work.
- Glob walking will be more efficient in some cases.

## Possible improvements

- Efficiency can be further greatly improved if we limit max depth for
globwalk. Currently, a simple ".toml" will deep-traverse the whole
folder. Essentially, max depth can be always set to either N or
infinity. If a pattern at a pivot node contains `**`, we collect all
children nodes from the subtree into the same group and don't limit max
depth; otherwise, we set max depth to the length of the glob pattern.
This wouldn't change correctness though and can we done separately if
needed.
- If this is considered important enough, docs can be updated to
indicate that parent and absolute globs are supported (and symlinks are
resolved, if the relevant PR is algo merged in).

## Test Plan

- Glob splitting and clustering tests are included in the PR.
- Relative and absolute glob cache-keys were tested in an actual
codebase.
2025-07-11 10:01:54 -05:00
Charlie Marsh
e0f81f0d4a
Avoid allocations for default cache keys (#12063)
Some checks are pending
CI / check system | conda3.8 on windows x86-64 (push) Blocked by required conditions
CI / check cache | ubuntu (push) Blocked by required conditions
CI / check cache | macos aarch64 (push) Blocked by required conditions
CI / check system | python on debian (push) Blocked by required conditions
CI / check system | python on fedora (push) Blocked by required conditions
CI / check system | python on ubuntu (push) Blocked by required conditions
CI / check system | python on opensuse (push) Blocked by required conditions
CI / check system | python on rocky linux 8 (push) Blocked by required conditions
CI / check system | python on rocky linux 9 (push) Blocked by required conditions
CI / check system | pypy on ubuntu (push) Blocked by required conditions
CI / check system | pyston (push) Blocked by required conditions
CI / check system | python on macos aarch64 (push) Blocked by required conditions
CI / check system | homebrew python on macos aarch64 (push) Blocked by required conditions
CI / check system | python on macos x86-64 (push) Blocked by required conditions
CI / check system | python3.10 on windows x86-64 (push) Blocked by required conditions
CI / check system | python3.10 on windows x86 (push) Blocked by required conditions
CI / check system | python3.13 on windows x86-64 (push) Blocked by required conditions
CI / check system | x86-64 python3.13 on windows aarch64 (push) Blocked by required conditions
CI / check system | windows registry (push) Blocked by required conditions
CI / check system | python3.12 via chocolatey (push) Blocked by required conditions
CI / check system | python3.9 via pyenv (push) Blocked by required conditions
CI / check system | python3.13 (push) Blocked by required conditions
CI / check system | conda3.11 on macos aarch64 (push) Blocked by required conditions
CI / check system | conda3.8 on macos aarch64 (push) Blocked by required conditions
CI / check system | conda3.11 on linux x86-64 (push) Blocked by required conditions
CI / check system | conda3.8 on linux x86-64 (push) Blocked by required conditions
CI / check system | conda3.11 on windows x86-64 (push) Blocked by required conditions
CI / check system | amazonlinux (push) Blocked by required conditions
CI / check system | embedded python3.10 on windows x86-64 (push) Blocked by required conditions
CI / benchmarks (push) Blocked by required conditions
2025-03-17 19:59:32 -04:00
Charlie Marsh
7ea2f657fa
Add src to default cache keys (#12062)
## Summary

This has come up a few times, so it seems worth addressing. If you
migrate from a flat layout to a `src` layout or vice versa, we now
invalidate the package metadata.

Closes https://github.com/astral-sh/uv/issues/12047
2025-03-17 17:56:10 -04:00
Sergei Nizovtsev
051aaa5fe5
Fix git-tag cache-key reader in case of slashes (#10467) (#10500)
## Summary

The assumption that all tags are listed under a flat `.git/ref/tags`
structure was wrong. Git creates a hierarchy of directories for tags
containing slashes. To fix the cache key calculation, we need to
recursively traverse all files under that folder instead.

## Test Plan

1. Create an `uv` project with git-tag cache-keys;
2. Add any tag with slash;
3. Run `uv sync` and see uv_cache_info error in verbose log;
4. `uv sync` doesn't trigger reinstall on next tag addition or removal;
5. With fix applied, reinstall triggers on every tag update and there
are no errors in the log.

Fixes #10467

---------

Co-authored-by: Sergei Nizovtsev <sergei.nizovtsev@eqvilent.com>
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2025-01-11 21:30:46 -05:00
Charlie Marsh
0b5c0220b5
Allow environment variables to be included in cache keys (#10170)
## Summary

Closes https://github.com/astral-sh/uv/issues/8130.
2024-12-26 15:31:49 +00:00
Charlie Marsh
29193acc67
Improve error message for cache info serialization (#8500)
Some checks are pending
CI / integration test | uv publish (push) Blocked by required conditions
CI / check cache | ubuntu (push) Blocked by required conditions
CI / check cache | macos aarch64 (push) Blocked by required conditions
CI / check system | python on debian (push) Blocked by required conditions
CI / check system | python on fedora (push) Blocked by required conditions
CI / check system | python on ubuntu (push) Blocked by required conditions
CI / check system | python on opensuse (push) Blocked by required conditions
CI / check system | python on rocky linux 8 (push) Blocked by required conditions
CI / check system | python on rocky linux 9 (push) Blocked by required conditions
CI / check system | pypy on ubuntu (push) Blocked by required conditions
CI / check system | pyston (push) Blocked by required conditions
CI / check system | alpine (push) Blocked by required conditions
CI / check system | python on macos aarch64 (push) Blocked by required conditions
CI / check system | homebrew python on macos aarch64 (push) Blocked by required conditions
CI / check system | python on macos x86_64 (push) Blocked by required conditions
CI / check system | python3.10 on windows (push) Blocked by required conditions
CI / check system | python3.10 on windows x86 (push) Blocked by required conditions
CI / check system | python3.13 on windows (push) Blocked by required conditions
CI / check system | python3.12 via chocolatey (push) Blocked by required conditions
CI / check system | python3.9 via pyenv (push) Blocked by required conditions
CI / check system | python3.13 (push) Blocked by required conditions
CI / check system | conda3.11 on linux (push) Blocked by required conditions
CI / check system | conda3.8 on linux (push) Blocked by required conditions
CI / check system | conda3.11 on macos (push) Blocked by required conditions
CI / check system | conda3.8 on macos (push) Blocked by required conditions
CI / check system | conda3.11 on windows (push) Blocked by required conditions
CI / check system | conda3.8 on windows (push) Blocked by required conditions
CI / check system | amazonlinux (push) Blocked by required conditions
CI / check system | embedded python3.10 on windows (push) Blocked by required conditions
CI / benchmarks (push) Blocked by required conditions
## Summary

We no longer need this struct; we bumped the cache bucket version
anyway, so the `Timestamp` variant is never encountered. This means we
get real Serde error messages.

Closes https://github.com/astral-sh/uv/issues/8488.
2024-10-23 13:17:31 +00:00
Charlie Marsh
7730861bc5
Allow users to incorporate Git tags into dynamic cache keys (#8259)
## Summary

You can now use `cache-keys = [{ git = { commit = true, tags = true }
}]` to include both the current commit and set of tags in the cache key.

Closes https://github.com/astral-sh/uv/issues/7866.

Closes https://github.com/astral-sh/uv/issues/7997.
2024-10-16 11:13:29 -04:00
Amos Wenger
715f28fd39
chore: Move all integration tests to a single binary (#8093)
As per
https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html

Before that, there were 91 separate integration tests binary.

(As discussed on Discord — I've done the `uv` crate, there's still a few
more commits coming before this is mergeable, and I want to see how it
performs in CI and locally).
2024-10-11 16:41:35 +02:00
Charlie Marsh
8f62fc920e
Avoid verbose warnings for non-existent cache keys (#8077)
## Summary

These show up on ~every `--verbose` run.
2024-10-10 09:35:03 +00:00
Charlie Marsh
d80139698d
Trim commits when reading from Git refs (#7922)
## Summary

Closes #7919.
2024-10-04 11:10:53 +00:00
Charlie Marsh
65d53a7474
Use globwalk for cache-keys matching (#7337)
## Summary

This should be more efficient as we can do a single traversal.

Closes https://github.com/astral-sh/uv/issues/7321.
2024-09-12 15:06:05 -04:00
Charlie Marsh
58a157a0ad
Support globs as cache keys in tool.uv.cache-keys (#7268)
## Summary

This has been asked for a few times. There are risks that these checks
could be slow, but they're buyer-beware.

Closes https://github.com/astral-sh/uv/issues/7246.
2024-09-11 15:30:59 -04:00
Charlie Marsh
4f2349119c
Add support for dynamic cache keys (#7136)
## Summary

This PR adds a more flexible cache invalidation abstraction for uv, and
uses that new abstraction to improve support for dynamic metadata.

Specifically, instead of relying solely on a timestamp, we now pass
around a `CacheInfo` struct which (as of now) contains
`Option<Timestamp>` and `Option<Commit>`. The `CacheInfo` is saved in
`dist-info` as `uv_cache.json`, so we can test already-installed
distributions for cache validity (along with testing _cached_
distributions for cache validity).

Beyond the defaults (`pyproject.toml`, `setup.py`, and `setup.cfg`
changes), users can also specify additional cache keys, and it's easy
for us to extend support in the future. Right now, cache keys can either
be instructions to include the current commit (for `setuptools_scm` and
similar) or file paths (for `hatch-requirements-txt` and similar):

```toml
[tool.uv]
cache-keys = [{ file = "requirements.txt" }, { git = true }]
```

This change should be fully backwards compatible.

Closes https://github.com/astral-sh/uv/issues/6964.

Closes https://github.com/astral-sh/uv/issues/6255.

Closes https://github.com/astral-sh/uv/issues/6860.
2024-09-09 20:19:15 +00:00