Commit graph

2900 commits

Author SHA1 Message Date
Charlie Marsh
643cf3b3aa
Unify subdirectory handling in source.rs (#314)
Avoids having to encode all the `git+` and `subdirectory=` logic in
multiple places.
2023-11-03 19:33:38 +00:00
Charlie Marsh
edce4ccb24
Add support for subdirectories in URL dependencies (#312)
Closes https://github.com/astral-sh/puffin/issues/307.
2023-11-03 15:28:38 -04:00
Zanie Blue
cbfd6af125
Error if --all-extras is used without a pyproject.toml source (#292)
Closes https://github.com/astral-sh/puffin/issues/260
2023-11-03 12:07:32 -05:00
Charlie Marsh
aa9882eee8
Use locks to prevent concurrent accesses to the same Git repo (#304)
Ensures that if we need to access the same Git repo twice in a
resolution, we only have one handler to that repo at a time. (Otherwise,
`git2` panics.)
2023-11-03 16:33:14 +00:00
Charlie Marsh
fa1bbbbe08
Write fully-precise Git SHAs to pip-compile output (#299)
This PR adds a mechanism by which we can ensure that we _always_ try to
refresh Git dependencies when resolving; further, we now write the fully
resolved SHA to the "lockfile". However, nothing in the code _assumes_
we do this, so the installer will remain agnostic to this behavior.

The specific approach taken here is minimally invasive. Specifically,
when we try to fetch a source distribution, we check if it's a Git
dependency; if it is, we fetch, and return the exact SHA, which we then
map back to a new URL. In the resolver, we keep track of URL
"redirects", and then we use the redirect (1) for the actual source
distribution building, and (2) when writing back out to the lockfile. As
such, none of the types outside of the resolver change at all, since
we're just mapping `RemoteDistribution` to `RemoteDistribution`, but
swapping out the internal URLs.

There are some inefficiencies here since, e.g., we do the Git fetch,
send back the "precise" URL, then a moment later, do a Git checkout of
that URL (which will be _mostly_ a no-op -- since we have a full SHA, we
don't have to fetch anything, but we _do_ check back on disk to see if
the SHA is still checked out). A more efficient approach would be to
return the path to the checked-out revision when we do this conversion
to a "precise" URL, since we'd then only interact with the Git repo
exactly once. But this runs the risk that the checked-out SHA changes
between the time we make the "precise" URL and the time we build the
source distribution.

Closes #286.
2023-11-03 16:26:57 +00:00
Zanie Blue
addcfe533a
Implement custom resolution failure reporter to hide root package versions (#300)
Extends #295 
Closes #214 

Copies some of the implementations from `pubgrub::report` so we can
implement Puffin `PubGrubPackage` specific display when explaining
failed resolutions.

Here, we just drop the dummy version number if it's a
`PubGrubPackage::Root` package. In the future, we can further customize
reporting.
2023-11-03 10:47:01 -05:00
Zanie Blue
e1382cc747
Report project name instead of root when using pyproject.toml files (#295)
Part of https://github.com/astral-sh/puffin/issues/214

Adds a `project: Option<PackageName>` to the `Manifest`, `Resolver`, and
`RequirementsSpecification`.
To populate an optional `name` for `PubGubPackage::Root`.

I'll work on removing the version number next.

Should we consider using the parent directory name when a
`pyproject.toml` file is not present?
2023-11-03 10:22:10 -05:00
konsti
e008c43f29
Add PackageName::as_dist_info_name (#305)
From
https://packaging.python.org/en/latest/specifications/recording-installed-packages/#recording-installed-packages

> This directory is named as {name}-{version}.dist-info, with name and
version fields corresponding to Core metadata specifications. Both
fields must be normalized (see Package name normalization and PEP 440
for the definition of normalization for each field respectively), and
replace dash (-) characters with underscore (_) characters, so the
.dist-info directory always has exactly one dash (-) character in its
stem, separating the name and version fields.

Follow up to #278
2023-11-03 08:16:44 +00:00
Charlie Marsh
e47d3f1f66
Respect pip-like Git branch, tag, and commit references (#297)
We need to parse revisions out from URLs like `MyProject @
git+https://git.example.com/MyProject.git@v1.0`, per [VCS
Support](https://pip.pypa.io/en/stable/topics/vcs-support/). Cargo has
the advantage that it uses a TOML table in its configuration, so the
user has to specify whether they're fetching a commit, a tag, a branch,
etc. We have to instead assume that anything that isn't clearly a commit
is _either_ a branch or a tag.

Closes https://github.com/astral-sh/puffin/issues/296.
2023-11-02 15:10:02 -04:00
Charlie Marsh
a4002fe132
Make cache non-optional in most crates (#293)
This PR makes the cache non-optional in most of Puffin, which simplifies
the code, allows us to reuse the cache within a single command (even
with `--no-cache`), and also allows us to use the cache for disk storage
across an invocation.

I left the cache as optional for the `Virtualenv` and `InterpreterInfo`
abstractions, since those are generic enough that it seems nice to have
a non-cached version, but it's kind of arbitrary.
2023-11-02 13:40:20 -04:00
Charlie Marsh
a02bf2e415
Split source_distribution.rs into separate wheel and sdist fetchers (#291) 2023-11-02 16:04:51 +00:00
konsti
c6f2dfd727
Use shared insta filters (#270)
Internal refactoring for consistency between tests
2023-11-02 16:42:59 +01:00
Charlie Marsh
62c474d880
Add support for Git dependencies (#283)
## Summary

This PR adds support for Git dependencies, like:

```
flask @ git+https://github.com/pallets/flask.git
```

Right now, they're only supported in the resolver (and not the
installer), since the installer doesn't yet support source distributions
at all.

The general approach here is based on Cargo's Git implementation.
Specifically, I adapted Cargo's
[`git`](23eb492cf9/src/cargo/sources/git/mod.rs)
module to perform the cloning, which is based on `libgit2`.

As compared to Cargo's implementation, I made the following changes:

- Removed any unnecessary code.
- Fixed any Clippy errors for our stricter ruleset.
- Removed the dependency on `curl`, in favor of `reqwest` which we use
elsewhere.
- Removed the ability to use `gix`. Cargo allows the use of `gix` as an
experimental flag, but it only supports a small subset of the
operations. When Cargo fully adopts `gix`, we should plan to do the
same.
- Removed Cargo's host key checking. We need to re-add this! I'll do it
shortly.
- Removed Cargo's progress bars. We should re-add this too, but we use
`indicatif` and Cargo had their own thing.

There are a few follow-ups to consider:

- Adding support in the installer.
- When we lock, we should write out the Git URL that includes the exact
SHA. This lets us cache in perpetuity and avoids dependencies changing
without re-locking.
- When we resolve, we should _always_ try to refresh Git dependencies.
(Right now, we skip if the wheel was already built.)

I'll work on the latter two in follow-up PRs.

Closes #202.
2023-11-02 15:14:55 +00:00
konsti
4adaa9a700
Wheel filename distribution package name (#278)
The normalized name abstractions were not consistently, this PR uses
them where they were previously missing:
* `WheelFilename::distribution`
* `Requirement::name`
* `Requirement::extras`
* `Metadata21::name`
* `Metadata21::provides_dist`

With `puffin-package` depending on `pep508_rs` this would be cyclical
crate dependency, so `puffin-normalize` gets split out from
`puffin-package`.

`DistInfoName` has the same task and semantics as `PackageName`, so it's
merged into the latter.

`PackageName` and `ExtraName` documentation is moved onto the type and
their constructors are called `new` instead of `normalize`. We now use
these constructors rarely enough the implicit allocation by
`to_string()` shouldn't matter anymore, while more actual cloning
becomes visible.
2023-11-02 11:15:27 +00:00
konsti
8a8b532330
Handle dist info casing mismatch in worker (#273)
The metadata name may be uppercase, while the wheel and dist info names
are lowercase, or the metadata name and the dist info name are
lowercase, while the wheel name is uppercase. Either way, we just search
the wheel for the name. See `find_dist_info`:
2652caa3e3/crates/install-wheel-rs/src/wheel.rs (L1024-L1057)

I tested this with `wrangler dev` and `bio_embeddings[all]`
2023-11-02 11:04:28 +00:00
konsti
9488804024
Add docker builder (#238)
This docker container provides isolation of source distribution builds,
whether [intended to be
helpful](https://pypi.org/project/nvidia-pyindex/) or other more or less
malicious forms of host system modification.

Fixes #194

---------

Co-authored-by: Zanie Blue <contact@zanie.dev>
2023-11-02 12:03:56 +01:00
Charlie Marsh
2ee555df7b
Use puffin_cache::digest in another site (#289) 2023-11-02 04:48:14 +00:00
Charlie Marsh
0c9e975f75
Rename distribution.rs to file.rs in puffin-resolver (#288) 2023-11-01 23:52:53 -04:00
Zanie Blue
b8ff32f6be
Respect markers on constraints (#282)
Closes #252
2023-11-01 20:20:32 -05:00
Charlie Marsh
8123e1a8f6
Add stable hash crate (#281)
This PR adds a `puffin-cache` crate that we can share across a variety of
other crates to generate stable hashes.
2023-11-01 23:41:45 +00:00
Zanie Blue
67e3e45839
Add support for --all-extras to pip-compile (#259)
Closes #244

Notable decision to error if `--all-extra` and `--extra <name>` are both
provided.
2023-11-01 13:39:49 -05:00
konsti
c6aa1cd7a3
Only fall back to copy when the first hard linking failed (#268)
Hard linking might not be supported but we (afaik) can't detect this
ahead of time, so we'll try hard linking the first file, if this
succeeds we'll know later hard linking errors are not due to lack of
os/fs support, if it fails we'll switch to copying for the rest of the
install. Follow up to
https://github.com/astral-sh/puffin/pull/237#discussion_r1376705137
2023-11-01 18:35:52 +01:00
konsti
b0678aa6fc
Show dev resolve output (#277)
Show the resolution in a concise format for puffin-dev. Note that this
doesn't affect the main puffin output, it's just more convenient for me
when developing.
2023-11-01 15:54:47 +00:00
konsti
d1af90163b
Improve client reqwest errors (#276)
I had to debug a failure involving these errors and had to improve their
output.
2023-11-01 16:52:58 +01:00
konsti
997228f4be
Add resolve from cli dev command (#272)
I don't want to create a new file for every requirement i test
2023-11-01 15:46:37 +00:00
Zanie Blue
3d5f8249ef
Add validation of extra names (#257)
Extends #254 

Adds validation of extra names provided by users in `pip-compile` e.g. 

```
error: invalid value 'foo!' for '--extra <EXTRA>': Extra names must start and end with a
letter or digit and may only contain -, _, ., and alphanumeric characters
```

We'll want to add something similar to `PackageName`. I'd be curious to
improve the AP, making the unvalidated nature of `::normalize` clear?
Perhaps worth pursuing later though as I don't have a better idea.
2023-11-01 10:40:43 -05:00
Charlie Marsh
2652caa3e3
Add support for URL dependencies (#251)
## Summary

This PR adds support for resolving and installing dependencies via
direct URLs, like:

```
werkzeug @ 960bb4017c/Werkzeug-2.0.0-py3-none-any.whl
```

These are fairly common (e.g., with `torch`), but you most often see
them as Git dependencies.

Broadly, structs like `RemoteDistribution` and friends are now enums
that can represent either registry-based dependencies or URL-based
dependencies:

```rust
/// A built distribution (wheel) that exists as a remote file (e.g., on `PyPI`).
#[derive(Debug, Clone)]
#[allow(clippy::large_enum_variant)]
pub enum RemoteDistribution {
    /// The distribution exists in a registry, like `PyPI`.
    Registry(PackageName, Version, File),
    /// The distribution exists at an arbitrary URL.
    Url(PackageName, Url),
}
```

In the resolver, we now allow packages to take on an extra, optional
`Url` field:

```rust
#[derive(Debug, Clone, Eq, Derivative)]
#[derivative(PartialEq, Hash)]
pub enum PubGrubPackage {
    Root,
    Package(
        PackageName,
        Option<DistInfoName>,
        #[derivative(PartialEq = "ignore")]
        #[derivative(PartialOrd = "ignore")]
        #[derivative(Hash = "ignore")]
        Option<Url>,
    ),
}
```

However, for the purpose of version satisfaction, we ignore the URL.
This allows for the URL dependency to satisfy the transitive request in
cases like:

```
flask==3.0.0
werkzeug @ 254c3e9b5f/werkzeug-3.0.1-py3-none-any.whl
```

There are a couple limitations in the current approach:

- The caching for remote URLs is done separately in the resolver vs. the
installer. I decided not to sweat this too much... We need to figure out
caching holistically.
- We don't support any sort of time-based cache for remote URLs -- they
just exist forever. This will be a problem for URL dependencies, where
we need some way to evict and refresh them. But I've deferred it for
now.
- I think I need to redo how this is modeled in the resolver, because
right now, we don't detect a variety of invalid cases, e.g., providing
two different URLs for a dependency, asking for a URL dependency and a
_different version_ of the same dependency in the list of first-party
dependencies, etc.
- (We don't yet support VCS dependencies.)
2023-11-01 09:21:44 -04:00
Zanie Blue
fa9f8df396
Fix test snapshot filter when runtime is greater than 1s (#267)
Tests would sometimes flake with this locally e.g. "1.50s" was not
filtered correctly.

Verified with

```diff
diff --git a/crates/puffin-cli/src/commands/pip_compile.rs b/crates/puffin-cli/src/commands/pip_compile.rs
index 0193216..2d6f8af 100644
--- a/crates/puffin-cli/src/commands/pip_compile.rs
+++ b/crates/puffin-cli/src/commands/pip_compile.rs
@@ -150,6 +150,8 @@ pub(crate) async fn pip_compile(
         result => result,
     }?;
 
+    std:🧵:sleep(std::time::Duration::from_secs(1));
+
     let s = if resolution.len() == 1 { "" } else { "s" };
     writeln!(
         printer,
```
2023-11-01 13:15:06 +00:00
Charlie Marsh
079b685c8c
Use distributions for Reporter signatures (#266) 2023-11-01 03:19:13 +00:00
Charlie Marsh
bee1b0f5ad
Avoid re-parsing wheel filename in source distribution tree (#265) 2023-10-31 21:02:09 +00:00
Charlie Marsh
aff26f2301
Reuse distribution structs in Resolver's source_distribution.rs (#264) 2023-10-31 20:50:34 +00:00
Zanie Blue
4be9ba483f
Remove implicit clone from ExtraName and document requirement in PackageName (#262)
per discussion in #137


1169000261
2023-10-31 15:24:27 -05:00
Zanie Blue
0dc7e6335e
Default to puffin venv path to .venv (#261)
Closes https://github.com/astral-sh/puffin/issues/236
2023-10-31 15:24:19 -05:00
Zanie Blue
e00d208318
Add documentation to PackageName::normalize (#263) 2023-10-31 15:24:08 -05:00
Charlie Marsh
89dad0c9ad
Move distribution abstraction in shared crate (#258)
This also allows us to get rid of `PinnedPackage` _and_ to remove some
`Result<...>` types due to needless conversions between
otherwise-identical types.
2023-10-31 15:30:06 -04:00
Zanie Blue
1ddb7d2827
Add error when user requests extras that do not exist (#254)
Extends #253 
Closes #241 

Adds `extras` to `RequirementsSpecification` to track extras used to
construct the requirements so we can throw an error when not all of the
requested extras are used.
2023-10-31 19:17:36 +00:00
Zanie Blue
322532d6f9
Normalize optional dependency group names in pyproject files (#253)
Going to add some tests.

Extends #239 
Closes #245 

Normalizes optional dependency group names found in pyproject files
before comparing them to the normalized user-requested extras.
2023-10-31 14:15:00 -05:00
Charlie Marsh
3312ce30f5
Upgrade crates and remove unused dependencies (#256) 2023-10-31 13:16:58 -04:00
Charlie Marsh
16aac834ee
Move PyPI-oriented types out of puffin-client crate (#255)
Just an internal change to avoid a dependency on `puffin-client` for
those crates that need access to PyPI-metadata types.
2023-10-31 17:10:23 +00:00
Zanie Blue
08f09e4743
Add support for pip-compile --extra <name> (#239)
Adds support for `pip-compile --extra <name> ...` which includes
optional dependencies in the specified group in the resolution.

Following precedent in `pip-compile`, if a given extra is not found,
there is no error. ~We could consider warning in this case.~ We should
probably add an error but it expands scope and will be considered
separately in #241
2023-10-31 11:59:40 -05:00
Charlie Marsh
9244404102
Resolve interpreter symlinks when creating virtual environments (#250)
Closes https://github.com/astral-sh/puffin/issues/249.
2023-10-31 08:22:52 -04:00
Charlie Marsh
2f38701008
Remove unused wheel cache argument from downloader (#248) 2023-10-31 02:23:50 +00:00
Charlie Marsh
ae203f998a
Rename Unzipper#download to Unzipper#unzip (#247) 2023-10-31 01:19:27 +00:00
Charlie Marsh
1f059b30dd
Remove Box<Pin<...>> from process_request (#242) 2023-10-30 16:34:57 -04:00
konsti
35d6bd761b
Fallback to copy if hardlinking failed (#237) 2023-10-30 19:10:01 +00:00
konstin
1529def563 Implement mixed PEP 517 and setup.py build
There are packages such as DTLSSocket 0.1.16 that say
```toml
[build-system]
requires = ["Cython<3", "setuptools", "wheel"]
```
In this case we need to install requires PEP 517 style but then call setup.py in the
legacy way

Part of making home-assistant work
2023-10-30 19:11:52 +01:00
konsti
29bd0a4ed8
Fix musl compilation (#234)
musl (which we already use in ruff) allows statically linked binaries on
linux. This PR switches to rustls and vendors and fixes the glibc
detection. Using static musl builds makes it easier to avoid glibc
errors in docker and we'll need it later for alpine users anyway.

An alternative is using vendored openssl.
2023-10-30 18:10:17 +01:00
konsti
d47dc64974
Ignore self requirements (#233)
gps3 0.33.3 depends on itself, which we can ignore. I've also added the
home assistant requirements since it occurred when testing with this.
2023-10-30 17:13:52 +01:00
Charlie Marsh
0be20a41a4
Make version selection wheel-vs.-sdist-agnostic (#232)
Closes https://github.com/astral-sh/puffin/issues/231.
2023-10-30 11:21:10 -04:00
Charlie Marsh
8d992dca3f
Fail gracefully when invalid markers are stored (#230) 2023-10-30 04:02:51 +00:00