fix(git): lock cache on resolve (#5051)

Fixes a concurrency issue when multiple processes are installing the
same package in different virtual environments from Git ref (not a
specific Git commit).

## Symptoms

That's how some of symptoms looked like in our case:

```
DEBUG uv 0.2.21
DEBUG Checking for Python interpreter at path `/tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/37bf51bfba4699a940ce31349422b24a5bc55a2b179ed7aec74459a9ae8d57b7/bin/python`
DEBUG Using Python 3.12.4 environment at /tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/37bf51bfba4699a940ce31349422b24a5bc55a2b179ed7aec74459a9ae8d57b7/bin/python
DEBUG Acquired lock for `/tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/37bf51bfba4699a940ce31349422b24a5bc55a2b179ed7aec74459a9ae8d57b7`
DEBUG At least one requirement is not satisfied: torch
DEBUG Using request timeout of 300s
DEBUG Found 37 packages in `--find-links` entry: /tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/.cache/pip/wheels
DEBUG Updating git source `Url { scheme: "https", cannot_be_a_base: false, username: "***", password: None, host: Some(Domain("github.com")), port: None, path: "/iterative/datachain", query: None, fragment: None }`
DEBUG Attempting GitHub fast path for: https://api.github.com/repos/iterative/datachain/commits/fix-distributed-test
DEBUG failed to check github HTTP status client error (404 Not Found) for url (https://api.github.com/repos/iterative/datachain/commits/fix-distributed-test)
DEBUG Performing a Git fetch for: https://***@github.com/iterative/datachain
error: Failed to download and build: `datachain @ git+https://***@github.com/iterative/datachain@fix-distributed-test`
Caused by: Git operation failed
Caused by: process didn't exit successfully: `git clone --local /tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/.cache/uv/git-v0/db/9d45a3e6f56b0a69 /tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/.cache/uv/git-v0/checkouts/9d45a3e6f56b0a69/56b15b8` (exit status: 128)
--- stderr
fatal: destination path '/tmp/pytest-of-runner/pytest-0/tmp_venv_dir/python3.12/.cache/uv/git-v0/checkouts/9d45a3e6f56b0a69/56b15b8' already exists and is not an empty directory.
```

## Cause of the issue

It is the same command that is failing - `git clone`, and I think it's
happening because it was trying to first get the repo to dereference the
`fix-distributed-test` branch:

`Given a remote source distribution, return a precise variant, if
possible.`

And it's happening w/i acquiring a lock around cache.

## Fix

I thinks we can reuse the existing `fetch` method that has already lock
around cache:


https://github.com/astral-sh/uv/pull/5051/files#diff-f58bb99dee2c4922d156ace3e7de651f0d9a81fc8e9447a2ad865de5c53543fcR61-R68

```python
        // Avoid races between different processes, too.
        let lock_dir = cache.join("locks");
        ....
```

## Questions

- Are there any tests that cover concurrency? I'm quite new to Rust and
if someone can point me to some examples and I can create a similar test
or a new one.
- Is error handling done correctly in this PR (again, I'm new to Rust -
I'll review and read about it, but it's better also for someone else to
review this)
This commit is contained in:
Ivan Shcheklein 2024-07-16 13:06:06 -07:00 committed by GitHub
parent a86eebd6c3
commit b5ec859273
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

View file

@ -108,16 +108,7 @@ impl GitResolver {
}
}
// Fetch the precise SHA of the Git reference (which could be a branch, a tag, a partial
// commit, etc.).
let source = if let Some(reporter) = reporter {
GitSource::new(url.clone(), client, cache).with_reporter(reporter)
} else {
GitSource::new(url.clone(), client, cache)
};
let fetch = tokio::task::spawn_blocking(move || source.fetch())
.await?
.map_err(GitResolverError::Git)?;
let fetch = self.fetch(url, client, cache, reporter).await?;
let git = fetch.into_git();
// Insert the resolved URL into the in-memory cache.