uv/crates/puffin-git/src/lib.rs
Charlie Marsh fa1bbbbe08
Write fully-precise Git SHAs to pip-compile output (#299)
This PR adds a mechanism by which we can ensure that we _always_ try to
refresh Git dependencies when resolving; further, we now write the fully
resolved SHA to the "lockfile". However, nothing in the code _assumes_
we do this, so the installer will remain agnostic to this behavior.

The specific approach taken here is minimally invasive. Specifically,
when we try to fetch a source distribution, we check if it's a Git
dependency; if it is, we fetch, and return the exact SHA, which we then
map back to a new URL. In the resolver, we keep track of URL
"redirects", and then we use the redirect (1) for the actual source
distribution building, and (2) when writing back out to the lockfile. As
such, none of the types outside of the resolver change at all, since
we're just mapping `RemoteDistribution` to `RemoteDistribution`, but
swapping out the internal URLs.
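
As a rough sketch of the redirect idea (not the actual resolver code: `Dist` and `Redirects` are hypothetical stand-ins, and plain strings replace real URL types), the mapping could look something like this:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a distribution identified by a URL.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Dist {
    pub name: String,
    pub url: String,
}

/// Hypothetical table of URL "redirects": URL as requested -> precise URL.
#[derive(Debug, Default)]
pub struct Redirects(HashMap<String, String>);

impl Redirects {
    /// Record that `from` resolved to the fully-precise URL `to`.
    pub fn insert(&mut self, from: &str, to: &str) {
        self.0.insert(from.to_string(), to.to_string());
    }

    /// Return a copy of `dist` with its URL swapped for the precise one,
    /// or an unchanged copy if no redirect is known.
    pub fn apply(&self, dist: &Dist) -> Dist {
        match self.0.get(&dist.url) {
            Some(precise) => Dist {
                name: dist.name.clone(),
                url: precise.clone(),
            },
            None => dist.clone(),
        }
    }
}
```

Under this sketch, the same `apply` call would be used both before building the source distribution and before writing the lockfile, so the type going in and the type coming out are identical and only the URL differs.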

There are some inefficiencies here since, e.g., we do the Git fetch,
send back the "precise" URL, then a moment later, do a Git checkout of
that URL (which will be _mostly_ a no-op -- since we have a full SHA, we
don't have to fetch anything, but we _do_ check back on disk to see if
the SHA is still checked out). A more efficient approach would be to
return the path to the checked-out revision when we do this conversion
to a "precise" URL, since we'd then only interact with the Git repo
exactly once. But this runs the risk that the checked-out SHA changes
between the time we make the "precise" URL and the time we build the
source distribution.
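
To illustrate why the second step is mostly a no-op, here is a toy model of the two interactions. `Checkouts` is a hypothetical in-memory stand-in for the on-disk Git state, and the "fetch" is only a counter, so this shows the shape of the flow rather than the real implementation:

```rust
use std::collections::HashSet;

/// Hypothetical in-memory stand-in for the on-disk Git checkouts.
#[derive(Debug, Default)]
pub struct Checkouts {
    /// Full SHAs that are already available on disk.
    on_disk: HashSet<String>,
    /// Number of (simulated) network fetches performed so far.
    pub fetches: usize,
}

impl Checkouts {
    /// Step 1: resolve a branch or tag to a full SHA. This always fetches.
    /// `remote_head` stands in for whatever the remote currently points at.
    pub fn fetch_precise(&mut self, remote_head: &str) -> String {
        self.fetches += 1;
        self.on_disk.insert(remote_head.to_string());
        remote_head.to_string()
    }

    /// Step 2: check out an exact SHA. Fetches only if the SHA is not on
    /// disk. Returns `true` if a fetch was needed.
    pub fn checkout(&mut self, sha: &str) -> bool {
        if self.on_disk.contains(sha) {
            return false;
        }
        self.fetches += 1;
        self.on_disk.insert(sha.to_string());
        true
    }
}
```

In this model, checking out the SHA returned by `fetch_precise` never triggers a second fetch, but it still has to consult the on-disk state, which is the redundant work described above.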

Closes #286.
2023-11-03 16:26:57 +00:00

90 lines
2.4 KiB
Rust

use url::Url;

use crate::git::GitReference;
pub use crate::source::GitSource;

mod git;
mod source;
mod util;

/// A reference to a Git repository.
#[derive(Debug, Clone)]
pub struct Git {
    /// The URL of the Git repository, with any query parameters and fragments removed.
    url: Url,
    /// The reference to the commit to use, which could be a branch, tag or revision.
    reference: GitReference,
    /// The precise commit to use, if known.
    precise: Option<git2::Oid>,
}

impl Git {
    #[must_use]
    pub(crate) fn with_precise(mut self, precise: git2::Oid) -> Self {
        self.precise = Some(precise);
        self
    }
}

impl TryFrom<Url> for Git {
    type Error = anyhow::Error;

    /// Initialize a [`Git`] source from a URL.
    fn try_from(mut url: Url) -> Result<Self, Self::Error> {
        // Remove any query parameters and fragments.
        url.set_fragment(None);
        url.set_query(None);

        // If the URL ends with a reference, like `https://git.example.com/MyProject.git@v1.0`,
        // extract it. Only the path is searched, so an `@` in the userinfo part of the URL
        // (e.g., `https://user@git.example.com/MyProject.git`) is not mistaken for a reference.
        let mut reference = GitReference::DefaultBranch;
        if let Some((prefix, rev)) = url.path().rsplit_once('@') {
            reference = GitReference::from_rev(rev);
            // Copy the prefix so the borrow of `url` ends before we mutate it.
            let prefix = prefix.to_owned();
            url.set_path(&prefix);
        }

        Ok(Self {
            url,
            reference,
            precise: None,
        })
    }
}

impl From<Git> for Url {
    fn from(git: Git) -> Self {
        let mut url = git.url;

        // If we have a precise commit, add `@` and the commit hash to the URL.
        if let Some(precise) = git.precise {
            url.set_path(&format!("{}@{}", url.path(), precise));
        } else {
            // Otherwise, add the branch or tag name.
            match git.reference {
                GitReference::Branch(rev)
                | GitReference::Tag(rev)
                | GitReference::BranchOrTag(rev)
                | GitReference::Rev(rev) => {
                    url.set_path(&format!("{}@{}", url.path(), rev));
                }
                GitReference::DefaultBranch => {}
            }
        }

        url
    }
}

impl std::fmt::Display for Git {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}", self.url)
    }
}

#[derive(Debug, Clone, Copy)]
pub enum FetchStrategy {
    /// Fetch Git repositories using libgit2.
    Libgit2,
    /// Fetch Git repositories using the `git` CLI.
    Cli,
}