By default, we will build source distributions for both resolving and
installing, running arbitrary code. `--no-build` adds an option to ban
this and only install from wheels, no source distributions or git builds
allowed. We also don't fetch these and instead report immediately.
I've heard from users for whom this is a requirement, i'm implementing
it now because it's helpful for testing.
I'm thinking about adding a shared `PuffinSharedArgs` struct so we don't
have to repeat each option everywhere.
Closes https://github.com/astral-sh/puffin/issues/356.
The example from the issue now renders as:
```
❯ cargo run --bin puffin-dev -q -- resolve-cli tensorflow-cpu-aws
puffin-dev failed
Caused by: No solution found when resolving build dependencies for source distribution:
Caused by: Because there is no available version for tensorflow-cpu-aws and root depends on tensorflow-cpu-aws, version solving failed.
```
This was dumb of me. We pass out indexes when adding progress bars, but
were then removing entries on completion, so any outstanding indexes
were now _invalid_. We just shouldn't remove them. The `MultiProgress`
retains a reference anyway, IIUC.
Closes https://github.com/astral-sh/puffin/issues/360.
Rejigger Linux platform detection
This change makes some very small improvements to the Linux platform
detection logic. In particular, the existing logic did not work on my
Archlinux machine since /lib64/ld-linux-x86-64.so.2 isn't a symlink. In
that case, the detection logic should have fallen back to the slower
`ldd --version` technique, but `read_link` fails outright when its
argument isn't a symbolic link. So we tweak the logic to allow it to
fail, and if it does, we still try the `ldd --version` approach instead
of giving up completely.
I also made some cosmetic improvements to the regex matching, as well as
ensuring that the regexes are only compiled exactly once.
It looks like Cargo, notice the bold green lines at the top (which
appear during the resolution, to indicate Git fetches and source
distribution builds):
<img width="868" alt="Screen Shot 2023-11-06 at 11 28 47 PM"
src="9647a480-7be7-41e9-b1d3-69faefd054ae">
<img width="868" alt="Screen Shot 2023-11-06 at 11 28 51 PM"
src="6bc491aa-5b51-4b37-9ee1-257f1bc1c049">
Closes https://github.com/astral-sh/puffin/issues/287 although we can do
a lot more here.
We now write the `direct_url.json` when installing, and _skip_
installing if we find a package installed via the direct URL that the
user is requesting.
A lot of TODOs, especially around cleaning up the `Source` abstraction
and its relationship to `DirectUrl`. I'm gonna keep working on these
today, but this works and makes the requirements clear.
Closes#332.
## Summary
This PR just adds the logic in `install-wheel-rs` to write
`direct_url.json`. We're not actually taking advantage of it yet (or
wiring it through) in Puffin.
Part of https://github.com/astral-sh/puffin/issues/332.
Ran `cargo upgrade --incompatible`, seems there are no changes required.
From cacache 0.12.0:
> BREAKING CHANGE: some signatures for copy have changed, and copy no
longer automatically reflinks
`which` 5.0.0 seems to have only error message changes.
## Summary
This is a first-pass at adding source distribution support to the
installer.
The previous installation flow was:
1. Come up with a plan.
1. Find a distribution (specific file) for every package that we'll need
to download.
1. Download those distributions.
1. Unzip them (since we assumed they were all wheels).
1. Install them into the virtual environment.
Now, Step (3) downloads both wheels and source distributions, and we
insert a step between Steps (3) and (4) to build any source
distributions into zipped wheels.
There are a bunch of TODOs, the most important (IMO) is that we
basically have two implementations of downloading and building, between
the stuff in `puffin_installer` and `puffin_resolver` (namely in
`crates/puffin-resolver/src/distribution`). I didn't attempt to clean
that up here -- it's already a problem, and it's related to the overall
problem we need to solve around unified caching and resource management.
Closes#243.
`PackageName` and `ExtraName` can now only be constructed from valid
names. They share the same rules, so i gave them the same
implementation. Constructors are split between `new` (owned) and
`from_str` (borrowed), with the owned version avoiding allocations.
Closes#279
---------
Co-authored-by: Zanie <contact@zanie.dev>
## Summary
This just enables the `DistributionFinder` (previously known as the
`WheelFinder`) to select source distributions when there are no matching
wheels for a given platform. As a reminder, the `DistributionFinder` is
a simple resolver that doesn't look at any dependencies: it just takes a
set of pinned packages, and finds a distribution to install to satisfy
each requirement.
In the resolver, our current model for solving URL dependencies requires
that we visit the URL dependency _before_ the registry-based dependency.
This PR encodes a strict requirement that all URL dependencies be
declared upfront, either as requirements or constraints.
I wrote more about how it works and why it's necessary in documentation
[here](https://github.com/astral-sh/puffin/pull/319/files#diff-2b1c4f36af0c62a2b7bebeae9473ae083588f2a6b18a3ec52393a24266adecbbR20).
I think we could relax this constraint over time, but it requires a more
sophisticated model -- and for now, I just want something that's (1)
correct, (2) easy for us to reason about, and (3) easy for users to
reason about.
As additional motivation... allowing arbitrary URL dependencies anywhere
in the tree creates some really confusing situations in which I'm not
even sure what the right answers are. For example, assume you declare a
direct dependency on `Werkzeug==2.0.0`. You then depend on a version of
Flask that depends on a version of `Werkzeug` from some arbitrary URL.
You build the source distribution at that arbitrary URL, and it turns
out it _does_ build to a declared version of 2.0.0. What should happen?
(And if it resolves to a version that _isn't_ 2.0.0, what should happen
_then_?) I suspect different tools handle this differently, but it must
lead to a lot of "silent" failures. In my testing of Poetry, it seems
like Poetry just ignores the URL dependency, which seems wrong, but is
also a behavior we could implement in the future.
Closes https://github.com/astral-sh/puffin/issues/303.
Closes https://github.com/astral-sh/puffin/issues/284.
Ensures that if we need to access the same Git repo twice in a
resolution, we only have one handler to that repo at a time. (Otherwise,
`git2` panics.)
This PR adds a mechanism by which we can ensure that we _always_ try to
refresh Git dependencies when resolving; further, we now write the fully
resolved SHA to the "lockfile". However, nothing in the code _assumes_
we do this, so the installer will remain agnostic to this behavior.
The specific approach taken here is minimally invasive. Specifically,
when we try to fetch a source distribution, we check if it's a Git
dependency; if it is, we fetch, and return the exact SHA, which we then
map back to a new URL. In the resolver, we keep track of URL
"redirects", and then we use the redirect (1) for the actual source
distribution building, and (2) when writing back out to the lockfile. As
such, none of the types outside of the resolver change at all, since
we're just mapping `RemoteDistribution` to `RemoteDistribution`, but
swapping out the internal URLs.
There are some inefficiencies here since, e.g., we do the Git fetch,
send back the "precise" URL, then a moment later, do a Git checkout of
that URL (which will be _mostly_ a no-op -- since we have a full SHA, we
don't have to fetch anything, but we _do_ check back on disk to see if
the SHA is still checked out). A more efficient approach would be to
return the path to the checked-out revision when we do this conversion
to a "precise" URL, since we'd then only interact with the Git repo
exactly once. But this runs the risk that the checked-out SHA changes
between the time we make the "precise" URL and the time we build the
source distribution.
Closes#286.
Extends #295Closes#214
Copies some of the implementations from `pubgrub::report` so we can
implement Puffin `PubGrubPackage` specific display when explaining
failed resolutions.
Here, we just drop the dummy version number if it's a
`PubGrubPackage::Root` package. In the future, we can further customize
reporting.
Part of https://github.com/astral-sh/puffin/issues/214
Adds a `project: Option<PackageName>` to the `Manifest`, `Resolver`, and
`RequirementsSpecification`.
To populate an optional `name` for `PubGubPackage::Root`.
I'll work on removing the version number next.
Should we consider using the parent directory name when a
`pyproject.toml` file is not present?
From
https://packaging.python.org/en/latest/specifications/recording-installed-packages/#recording-installed-packages
> This directory is named as {name}-{version}.dist-info, with name and
version fields corresponding to Core metadata specifications. Both
fields must be normalized (see Package name normalization and PEP 440
for the definition of normalization for each field respectively), and
replace dash (-) characters with underscore (_) characters, so the
.dist-info directory always has exactly one dash (-) character in its
stem, separating the name and version fields.
Follow up to #278
This PR makes the cache non-optional in most of Puffin, which simplifies
the code, allows us to reuse the cache within a single command (even
with `--no-cache`), and also allows us to use the cache for disk storage
across an invocation.
I left the cache as optional for the `Virtualenv` and `InterpreterInfo`
abstractions, since those are generic enough that it seems nice to have
a non-cached version, but it's kind of arbitrary.