Commit graph

66 commits

Author SHA1 Message Date
Charlie Marsh
f62458f600
Add explicit error message for URLs without package names (#669)
`pip` supports installing packages without names (e.g.,
`git+https://github.com/pallets/flask.git`), but it doesn't adhere to
the PEP grammar, and we don't yet support it (and may never) (#313).

This PR adds a dedicated error path for such cases, to ensure that we
can give meaningful feedback to the user:

```
error: Couldn't parse requirement in requirements.in position 0 to 18
  Caused by: URL requirement is missing a package name; expected: `package_name @ https://google.com`
https://google.com
^^^^^^^^^^^^^^^^^^
```

Closes https://github.com/astral-sh/puffin/issues/650.
2023-12-16 21:14:34 +00:00
Charlie Marsh
e4673a0c52
Modify PEP 508 cursor to use byte offsets (#668)
This enables us to remove a number of allocations (in particular,
`peek_while` and `take_while` no longer allocate). It also makes it
trivial to move the cursor to a new location, since you can just slice
and call `.chars()`. At present, moving to a new location would require
converting the iterator to a string, then back to a character iterator.
2023-12-15 22:05:28 +00:00
Charlie Marsh
875c9a635e
Rename CharIter to Cursor (#667)
This better aligns with the analogous struct that we have in Ruff.
2023-12-15 21:57:59 +00:00
konsti
620f73b38b
Speed up version parsing for a 1.27±0.03 speedup in transformers-extras with conservative changes (#660)
Two low-hanging fruits as optimizations for version parsing: A fast path
for release only versions and removing the regex from version specifiers
(still calling into version's parsing regex if required). This enables
optimizing the serde format since we now see the serde part instead of
only PEP 440 parsing. I intentionally didn't rewrite the full PEP 440 at
this step.

```console
$ hyperfine --warmup 5 --runs 50 "target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in" "target/profiling/main pip-compile scripts/requirements/transformers-extras.in"
  Benchmark 1: target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in
    Time (mean ± σ):     217.1 ms ±   3.2 ms    [User: 194.0 ms, System: 55.1 ms]
    Range (min … max):   211.0 ms … 228.1 ms    50 runs

  Benchmark 2: target/profiling/main pip-compile scripts/requirements/transformers-extras.in
    Time (mean ± σ):     276.7 ms ±   5.7 ms    [User: 252.4 ms, System: 54.6 ms]
    Range (min … max):   268.9 ms … 303.5 ms    50 runs

  Summary
    target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in ran
      1.27 ± 0.03 times faster than target/profiling/main pip-compile scripts/requirements/transformers-extras.in
```

---------

Co-authored-by: Andrew Gallant <andrew@astral.sh>
2023-12-15 14:03:35 -05:00
Charlie Marsh
22c7057b35
Expand environment variables in URLs (#640)
## Summary

This PR enables users to express relative dependencies via environment
variables. Like pip, PDM, Hatch, Rye, and others, we now allow users to
express dependencies like:

```text
flask @ file://${PROJECT_ROOT}/flask-3.0.0-py3-none-any.whl
```

In the compiled requirements file, we'll also preserve the unexpanded
environment variable.

Closes https://github.com/astral-sh/puffin/issues/592.
2023-12-14 15:09:12 +00:00
Charlie Marsh
ed8dfbfcf7
Preserve verbatim URLs (#639)
## Summary

This PR adds a `VerbatimUrl` struct to preserve verbatim URLs throughout
the resolution and installation pipeline. In short, alongside the parsed
`Url`, we also keep the URL as written by the user. This enables us to
display the URL exactly as written by the user, rather than the
serialized path that we use internally.

This will be especially useful once we start expanding environment
variables since, at that point, we'll be able to write the version of
the URL that includes the _unexpected_ environment variable to the
output file.
2023-12-14 15:03:39 +00:00
Andrew Gallant
ff4d079dc9
pep508-rs: remove \x20 trailing whitespace hack (#400)
... we just remove the trailing whitespace from the input and that
resolves things.

Thanks @konstin for pointing this out!

Ref https://github.com/astral-sh/puffin/pull/399#discussion_r1389854823
2023-11-10 15:11:29 -05:00
Andrew Gallant
63f7f65190
change global allocator to jemalloc (and mimalloc on Windows) (#399)
This copies the allocator configuration used in the Ruff project. In
particular, this gives us an instant 10% win when resolving the top 1K
PyPI packages:

    $ hyperfine \
"./target/profiling/puffin-dev-main resolve-many --cache-dir
cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2>
/dev/null" \
"./target/profiling/puffin-dev resolve-many --cache-dir
cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2>
/dev/null"
Benchmark 1: ./target/profiling/puffin-dev-main resolve-many --cache-dir
cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2>
/dev/null
Time (mean ± σ): 974.2 ms ± 26.4 ms [User: 17503.3 ms, System: 2205.3
ms]
      Range (min … max):   943.5 ms … 1015.9 ms    10 runs

Benchmark 2: ./target/profiling/puffin-dev resolve-many --cache-dir
cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2>
/dev/null
Time (mean ± σ): 883.1 ms ± 23.3 ms [User: 14626.1 ms, System: 2542.2
ms]
      Range (min … max):   849.5 ms … 916.9 ms    10 runs

    Summary
'./target/profiling/puffin-dev resolve-many --cache-dir
cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2>
/dev/null' ran
1.10 ± 0.04 times faster than './target/profiling/puffin-dev-main
resolve-many --cache-dir cache-docker-no-build --no-build
pypi_top_8k_flat.txt --limit 1000 2> /dev/null'

I was moved to do this because I noticed `malloc`/`free` taking up a
fairly sizeable percentage of time during light profiling.

As is becoming a pattern, it will be easier to review this
commit-by-commit.

Ref #396 (wouldn't call this issue fixed)

-----

I did also try adding a `smallvec` optimization to the
`Version::release` field, but it didn't bare any fruit. I still think
there is more to explore since the results I observed don't quite line
up with what I expect. (So probably either my mental model is off or my
measurement process is flawed.) You can see that attempt with a little
more explanation here:
f9528b4ecd

In the course of adding the `smallvec` optimization, I also shrunk the
`Version` fields from a `usize` to a `u32`. They should at least be a
fixed size integer since version numbers aren't used to index memory,
and I shrunk it to `u32` since it seems reasonable to assume that all
version numbers will be smaller than `2^32`.
2023-11-10 14:48:59 -05:00
konsti
81f380b10e
Validate package and extra name (#290)
`PackageName` and `ExtraName` can now only be constructed from valid
names. They share the same rules, so i gave them the same
implementation. Constructors are split between `new` (owned) and
`from_str` (borrowed), with the owned version avoiding allocations.

Closes #279

---------

Co-authored-by: Zanie <contact@zanie.dev>
2023-11-06 10:04:31 +00:00
konsti
4adaa9a700
Wheel filename distribution package name (#278)
The normalized name abstractions were not consistently, this PR uses
them where they were previously missing:
* `WheelFilename::distribution`
* `Requirement::name`
* `Requirement::extras`
* `Metadata21::name`
* `Metadata21::provides_dist`

With `puffin-package` depending on `pep508_rs` this would be cyclical
crate dependency, so `puffin-normalize` gets split out from
`puffin-package`.

`DistInfoName` has the same task and semantics as `PackageName`, so it's
merged into the latter.

`PackageName` and `ExtraName` documentation is moved onto the type and
their constructors are called `new` instead of `normalize`. We now use
these constructors rarely enough the implicit allocation by
`to_string()` shouldn't matter anymore, while more actual cloning
becomes visible.
2023-11-02 11:15:27 +00:00
konsti
1fbe328257
Build source distributions in the resolver (#138)
This is isn't ready, but it can resolve
`meine_stadt_transparent==0.2.14`.

The source distributions are currently being built serially one after
the other, i don't know if that is incidentally due to the resolution
order, because sdist building is blocking or because of something in the
resolver that could be improved.

It's a bit annoying that the thing that was supposed to do http requests
now suddenly also has to a whole download/unpack/resolve/install/build
routine, it messes up the type hierarchy. The much bigger problem though
is avoid recursive crate dependencies, it's the reason for the callback
and for splitting the builder into two crates (badly named atm)
2023-10-25 20:05:13 +00:00
Charlie Marsh
471a1d657d
Migrate resolver proof-of-concept to PubGrub (#97)
## Summary

This PR enables the proof-of-concept resolver to backtrack by way of
using the `pubgrub-rs` crate.

Rather than using PubGrub as a _framework_ (implementing the
`DependencyProvider` trait, letting PubGrub call us), I've instead
copied over PubGrub's primary solver hook (which is only ~100 lines or
so) and modified it for our purposes (e.g., made it async).

There's a lot to improve here, but it's a start that will let us
understand PubGrub's appropriateness for this problem space. A few
observations:

- In simple cases, the resolver is slower than our current (naive)
resolver. I think it's just that the pipelining isn't as efficient as in
the naive case, where we can just stream package and version fetches
concurrently without any bottlenecks.
- A lot of the code here relates to bridging PubGrub with our own
abstractions -- so we need a `PubGrubPackage`, a `PubGrubVersion`, etc.
2023-10-15 22:05:44 -04:00
Charlie Marsh
239b5893d8
Fix version satisfier for unpinned dependencies (#74) 2023-10-09 11:48:39 -04:00
Charlie Marsh
ba72950546
Avoid passing cached wheels to the resolver step (#70)
When we go to install a locked `requirements.txt`, if a wheel is already
available in the local cache, and matches the version specifiers, we can
just use it directly without fetching the package metadata. This speeds
up the no-op case by about 33%.

Closes https://github.com/astral-sh/puffin/issues/48.
2023-10-08 22:17:19 -04:00
Charlie Marsh
c8477991a9
Use local versions of PEP 440 and PEP 508 crates (#32)
This PR modifies the PEP 440 and PEP 508 crates to pass CI, primarily by
fixing all lint violations.

We're also now using these crates in the workspace via `path`.
(Previously, we were still fetching them from Cargo.)
2023-10-07 00:16:44 +00:00
Charlie Marsh
4fcdb3c045
Copy over pep508-rs crate (#31)
This PR copies over the `pep440-rs` crate at commit
`82aa5d4dcbe676b121dc931b0afa09a82de8e3d7` with no modifications.

It won't pass CI, but modifications will intentionally be confined to
later PRs.
2023-10-06 20:12:19 -04:00