Commit graph

92 commits

Author SHA1 Message Date
Charlie Marsh
5205165d42
Add PEP 714 support for JSON API client (#3698)
## Summary

Closes https://github.com/astral-sh/uv/issues/3689.

## Test Plan

Manually verified we pick up `core-metadata` from PyPI if I remove the
aliases.
2024-05-21 15:52:37 +00:00
Shantanu
5a07923eb4
Use Metadata10 to parse PKG-INFO of legacy editable (#3450)
It turns out setuptools often uses Metadata-Version 2.1 in their
PKG-INFO:
4e766834d7/setuptools/dist.py (L64)
`Metadata23` requires Metadata-Version of at least 2.2.

This means that uv doesn't actually recognise legacy editable
installations from the most common way you'd actually get legacy
editable installations (works great for most legacy editables I make at
work though!)

Anyway, over here we only need the version and don't care about anything
else. Rather than make a `Metadata21`, I just add a version field to
`Metadata10`. The one slightly tricky thing is that only
Metadata-Version 1.2 and greater guarantee that the [version number is
PEP 440 compatible](https://peps.python.org/pep-0345/#version), so I
store the version in `Metadata10` as a `String` and only parse to
`Version` at time of use.

Also did you know that back in 2004, paramiko had a pokemon based
versioning system?
2024-05-08 10:58:18 +02:00
konsti
55f6e4e66b
Make Requirement generic over url type (#3253)
This change allows switching out the url type for requirements. The
original idea was to allow different types for different requirement
origins, so that core metadata reads can ban non-pep 508 requirements
while we only allow them for requirements.txt. This didn't work out
because we expect `&Requirement`s from all sources to match.

I also tried to split `pep508_rs` into a PEP 508 compliant crate and
into our extensions, but they are to tightly coupled.

I think this change is an improvement still as it reduces the hardcoded
dependence on `VerbatimUrl`.
2024-05-07 16:45:49 +02:00
konsti
4f87edbe66
Add basic tool.uv.sources support (#3263)
## Introduction

PEP 621 is limited. Specifically, it lacks
* Relative path support
* Editable support
* Workspace support
* Index pinning or any sort of index specification

The semantics of urls are a custom extension, PEP 440 does not specify
how to use git references or subdirectories, instead pip has a custom
stringly format. We need to somehow support these while still stying
compatible with PEP 621.

## `tool.uv.source`

Drawing inspiration from cargo, poetry and rye, we add `tool.uv.sources`
or (for now stub only) `tool.uv.workspace`:

```toml
[project]
name = "albatross"
version = "0.1.0"
dependencies = [
  "tqdm >=4.66.2,<5",
  "torch ==2.2.2",
  "transformers[torch] >=4.39.3,<5",
  "importlib_metadata >=7.1.0,<8; python_version < '3.10'",
  "mollymawk ==0.1.0"
]

[tool.uv.sources]
tqdm = { git = "https://github.com/tqdm/tqdm", rev = "cc372d09dcd5a5eabdc6ed4cf365bdb0be004d44" }
importlib_metadata = { url = "https://github.com/python/importlib_metadata/archive/refs/tags/v7.1.0.zip" }
torch = { index = "torch-cu118" }
mollymawk = { workspace = true }

[tool.uv.workspace]
include = [
  "packages/mollymawk"
]

[tool.uv.indexes]
torch-cu118 = "https://download.pytorch.org/whl/cu118"
```

See `docs/specifying_dependencies.md` for a detailed explanation of the
format. The basic gist is that `project.dependencies` is what ends up on
pypi, while `tool.uv.sources` are your non-published additions. We do
support the full range or PEP 508, we just hide it in the docs and
prefer the exploded table for easier readability and less confusing with
actual url parts.

This format should eventually be able to subsume requirements.txt's
current use cases. While we will continue to support the legacy `uv pip`
interface, this is a piece of the uv's own top level interface. Together
with `uv run` and a lockfile format, you should only need to write
`pyproject.toml` and do `uv run`, which generates/uses/updates your
lockfile behind the scenes, no more pip-style requirements involved. It
also lays the groundwork for implementing index pinning.

## Changes

This PR implements:
* Reading and lowering `project.dependencies`,
`project.optional-dependencies` and `tool.uv.sources` into a new
requirements format, including:
  * Git dependencies
  * Url dependencies
  * Path dependencies, including relative and editable
* `pip install` integration
* Error reporting for invalid `tool.uv.sources`
* Json schema integration (works in pycharm, see below)
* Draft user-level docs (see `docs/specifying_dependencies.md`)

It does not implement:
* No `pip compile` testing, deprioritizing towards our own lockfile
* Index pinning (stub definitions only)
* Development dependencies
* Workspace support (stub definitions only)
* Overrides in pyproject.toml
* Patching/replacing dependencies

One technically breaking change is that we now require user provided
pyproject.toml to be valid wrt to PEP 621. Included files still fall
back to PEP 517. That means `pip install -r requirements.txt` requires
it to be valid while `pip install -r requirements.txt` with `-e .` as
content falls back to PEP 517 as before.

## Implementation

The `pep508` requirement is replaced by a new `UvRequirement` (name up
for bikeshedding, not particularly attached to the uv prefix). The still
existing `pep508_rs::Requirement` type is a url format copied from pip's
requirements.txt and doesn't appropriately capture all features we
want/need to support. The bulk of the diff is changing the requirement
type throughout the codebase.

We still use `VerbatimUrl` in many places, where we would expect a
parsed/decomposed url type, specifically:
* Reading core metadata except top level pyproject.toml files, we fail a
step later instead if the url isn't supported.
* Allowed `Urls`.
* `PackageId` with a custom `CanonicalUrl` comparison, instead of
canonicalizing urls eagerly.
* `PubGrubPackage`: We eventually convert the `VerbatimUrl` back to a
`Dist` (`Dist::from_url`), instead of remembering the url.
* Source dist types: We use verbatim url even though we know and require
that these are supported urls we can and have parsed.

I tried to make improve the situation be replacing `VerbatimUrl`, but
these changes would require massive invasive changes (see e.g.
https://github.com/astral-sh/uv/pull/3253). A main problem is the ref
`VersionOrUrl` and applying overrides, which assume the same
requirement/url type everywhere. In its current form, this PR increases
this tech debt.

I've tried to split off PRs and commits, but the main refactoring is
still a single monolith commit to make it compile and the tests pass.

## Demo

Adding
d1ae3b85d5/pyproject.json
as json schema (v7) to pycharm for `pyproject.toml`, you can try the IDE
support already:


![pycharm](599082c7-6be5-41c1-a3cd-516092382f8d)


[dove.webm](c293c272-c80b-459d-8c95-8c46a8d198a1)
2024-05-03 21:10:50 +00:00
Andrew Gallant
1089abda3f
require serde and rkyv everywhere; remove optional serde and rkyv features (#3345)
In *some* places in our crates, `serde` (and `rkyv`) are optional
dependencies. I believe this was done out of reasons of "good sense,"
that is, it follows a Rust ecosystem pattern where serde integration
tends to be an opt-in crate feature. (And similarly for `rkyv`.)

However, ultimately, `uv` itself requires `serde` and `rkyv` to
function. Since our crates are strictly internal, there are limited
consumers for our crates without `serde` (and `rkyv`) enabled. I think
one possibility is that optional `serde` (and `rkyv`) integration means
that someone can do this:

    cargo test -p pep440_rs

And this will run tests _without_ `serde` or `rkyv` enabled. That in
turn could lead to faster iteration time by reducing compile times. But,
I'm not sure this is worth supporting. The iterative compilation times
of
individual crates are probably fast enough in debug mode, even with
`serde` and `rkyv` enabled. Namely, `serde` and `rkyv` themselves
shouldn't need to be re-compiled in most cases. On `main`:

```
from-scratch: `cargo test -p pep440_rs --lib` 0.685
incremental: `cargo test -p pep440_rs --lib` 0.278s
from-scratch: `cargo test -p pep440_rs --features serde,rkyv --lib` 3.948s
incremental: `cargo test -p pep440_rs --features serde,rkyv --lib` 0.321s
```

So while a from-scratch build does take significantly longer, an
incremental build is about the same.

The benefit of doing this change is two-fold:

1. It brings out crates into alignment with "reality." In particular,
   some crates were _implicitly_ relying on `serde` being enabled
   without explicitly declaring it. This technically means that our
   `Cargo.toml`s were wrong in some cases, but it is hard to observe it
   because of feature unification in a Cargo workspace.
2. We no longer need to deal with the cognitive burden of writing
   `#[cfg_attr(feature = "serde", ...)]` everywhere.
2024-05-03 10:21:03 -04:00
konsti
2e27abd34a
Remove Into::into (#3337)
Motivated by
https://github.com/astral-sh/uv/pull/3263#discussion_r1585896159

I don't think we can lint againt this since we do want to allow
`.into()`, just not the bare `Into::into` call.
2024-05-02 10:26:42 +00:00
Ibraheem Ahmed
fa53de9223
Avoid Removing Quotes From Requirement Markers (#3214)
## Summary

Avoid removing quotes from markers, e.g. `numpy (>=1.19) ;
python_version >= "3.7"` should not be rewritten. Fixes
https://github.com/astral-sh/uv/issues/2551.

This PR also makes fixups a bit more flexible internally for fixes that
aren't simple to implement with a pure regex replacement, like this one.
https://github.com/astral-sh/uv/pull/1529 fixed a similar problem but
the current regex is still not smart enough to avoid all markers
completely (like `python_version`).

## Test Plan

Added a few unit tests.
2024-04-23 14:07:51 -04:00
Charlie Marsh
3df8df656b
Replace unwrap with ? in hash generation (#3003)
And add tests to catch it.
2024-04-12 00:41:08 +00:00
Charlie Marsh
13ae5ac8dc
Replace PyPI-internal Hashes representation with flat vector (#2925)
## Summary

Right now, we have a `Hashes` representation that looks like:

```rust
/// A dictionary mapping a hash name to a hex encoded digest of the file.
///
/// PEP 691 says multiple hashes can be included and the interpretation is left to the client.
#[derive(Debug, Clone, Eq, PartialEq, Default, Deserialize)]
pub struct Hashes {
    pub md5: Option<Box<str>>,
    pub sha256: Option<Box<str>>,
    pub sha384: Option<Box<str>>,
    pub sha512: Option<Box<str>>,
}
```

It stems from the PyPI API, which returns a dictionary of hashes.

We tend to pass these around as a vector of `Vec<Hashes>`. But it's a
bit strange because each entry in that vector could contain multiple
hashes. And it makes it difficult to ask questions like "Is
`sha256:ab21378ca980a8` in the set of hashes"?

This PR instead treats `Hashes` as the PyPI-internal type, and uses a
new `Vec<HashDigest>` everywhere in our own APIs.
2024-04-09 16:56:16 +00:00
Charlie Marsh
cc3c5700e1
Use scheme parsing to determine absolute vs. relative URLs (#2904)
## Summary

We have a heuristic in `File` that attempts to detect whether a URL is
absolute or relative. However, `contains("://")` is prone to false
positive. In the linked issues, the URLs look like:

```
/packages/5a/d8/4d75d1e4287ad9d051aab793c68f902c9c55c4397636b5ee540ebd15aedf/pytz-2005k.tar.bz2?hash=597b596dc1c2c130cd0a57a043459c3bd6477c640c07ac34ca3ce8eed7e6f30c&remote=4d75d1e428/pytz-2005k.tar.bz2 (sha256)=597b596dc1c2c130cd0a57a043459c3bd6477c640c07ac34ca3ce8eed7e6f30c
```

Which is relative, but includes `://`.

Instead, we should determine whether the URL has a _scheme_ which
matches the `Url` crate internally.

Closes https://github.com/astral-sh/uv/issues/2899.
2024-04-08 17:04:27 -04:00
Charlie Marsh
a0b8d1a994
Clean up Error enum in Metadata23 (#2835)
## Summary

Rename, and remove a bunch of unused variants.
2024-04-05 14:40:33 +00:00
Charlie Marsh
365c292525
Read package metadata from pyproject.toml when statically defined (#2676)
## Summary

Now that we're resolving metadata more aggressively for local sources,
it's worth doing this. We now pull metadata from the `pyproject.toml`
directly if it's statically-defined.

Closes https://github.com/astral-sh/uv/issues/2629.
2024-03-27 14:34:18 +00:00
konsti
48bd02b8a8
Update miette v7, pubgrub and small Cargo.toml cleanup (#2610)
I was going through the output of `cargo tree --duplicate -p uv`, not
much success except these small cleanups.
2024-03-22 10:42:48 +00:00
Charlie Marsh
e5b0cf7f89
Add support for unnamed local directory requirements (#2571)
## Summary

For example: `cargo run pip install .`

The strategy taken here is to attempt to extract the package name from
the distribution without executing the PEP 517 build steps. We could
choose to do that in the future if this proves lacking, but it adds
complexity.

Part of: https://github.com/astral-sh/uv/issues/313.
2024-03-21 03:46:11 +00:00
konsti
70e0967dbd
Avoid repeating paths of workspace packages (#2573)
Scott schafer got me the idea: We can avoid repeating the path for
workspaces dependencies everywhere if we declare them in the virtual
package once and treat them as workspace dependencies from there on.
2024-03-20 16:16:02 -04:00
Charlie Marsh
c4107f9c40
Re-test validity after every lenient parsing change (#2550)
## Summary

We had the right fixup for `torchsde`, but a subsequent fixup was making
it invalid. In general, we should apply as few of these as we can, so
lets stop as soon as we succeed.

Closes https://github.com/astral-sh/uv/issues/2546.

## Test Plan

`cargo run pip install torchsde==0.2.5 --verbose --reinstall -n
--verbose`
2024-03-19 15:41:49 -04:00
Micha Reiser
acbee166c0
Remove unused dependencies (#2543)
## Summary

I tried out `cargo shear` to see if there are any unused dependencies
that `cargo udeps` isn't reporting. It turned out, there are a few. This
PR removes those dependencies.

## Test Plan

`cargo build`
2024-03-19 13:10:10 -04:00
Charlie Marsh
659e00c4c1
Use Box<str> in Hashes to reduce size (#2536) 2024-03-19 02:51:46 +00:00
Charlie Marsh
80aa03dcba
Add SHA384 and SHA512 hash algorithms (#2534)
Closes #2533.
2024-03-19 02:23:16 +00:00
Charlie Marsh
d0789fc078
Preserve hashes for pinned packages (#2532)
## Summary

When a user runs with `--output-file` and `--generate-hashes`, we should
_only_ update the hashes if the pinned version itself changes.

Closes https://github.com/astral-sh/uv/issues/1530.
2024-03-19 01:02:18 +00:00
Charlie Marsh
f1aec3e779
Add in-URL credentials to store prior to creating requests (#2446)
## Summary

The authentication middleware extracts in-URL credentials from URLs that
pass through it; however, by the time a request reaches the store, the
credentials will have already been removed, and relocated to the header.
So we were never propagating in-URL credentials.

This PR adds an explicit pass wherein we pass in-URL credentials to the
store prior to doing any work.

Closes https://github.com/astral-sh/uv/issues/2444.

## Test Plan

`cargo run pip install` against an authenticated AWS registry.
2024-03-14 03:46:33 +00:00
Zanie Blue
10c4effbd3
Refactor incompatiblity tracking for distributions (#1298)
Extends the "compatibility" types introduced in #1293 to apply to source
distributions as well as wheels.

- We now track the most-relevant incompatible source distribution
- Exclude newer, Python requirements, and yanked versions are all
tracked as incompatibilities in the new model (this lets us remove
`DistMetadata`!)
2024-03-08 11:02:31 -06:00
Charlie Marsh
2e9678e5d3
Add support for Metadata 2.2 (#2293)
## Summary

PyPI now supports Metadata 2.2, which means distributions with Metadata
2.2-compliant metadata will start to appear. The upside is that if a
source distribution includes a `PKG-INFO` file with (1) a metadata
version of 2.2 or greater, and (2) no dynamic fields (at least, of the
fields we rely on), we can read the metadata from the `PKG-INFO` file
directly rather than running _any_ of the PEP 517 build hooks.

Closes https://github.com/astral-sh/uv/issues/2009.
2024-03-08 16:02:32 +00:00
Charlie Marsh
8a807094e9
Encapsulate header parsing for metadata files (#2295) 2024-03-08 03:59:53 +00:00
Charlie Marsh
0f6fc117c1
Query interpreter to determine correct virtualenv paths (#2188)
## Summary

This PR migrates our virtualenv creation from a setup that assumes prior
knowledge of the correct paths, to a technique borrowed from
`virtualenv` whereby we use `sysconfig` and `distutils` to determine the
paths. The general trick is to grab the expected paths with `sysconfig`,
then make them all relative, then make them absolute for a given
directory.

Closes #2095.
Closes #2153.
2024-03-05 16:13:24 -05:00
dependabot[bot]
e66afa8767
Bump insta from 1.35.1 to 1.36.1 (#2180) 2024-03-04 23:01:49 +00:00
Charlie Marsh
5fed1f6259
Use simpler pip-like Scheme for install paths (#2173)
## Summary

This will make it easier to use the paths returned by `distutils.py`
(for some cases). No code or behavior changes; just removing some fields
we don't need.
2024-03-04 15:50:13 -05:00
dependabot[bot]
6678d545fb
Bump serde_json from 1.0.113 to 1.0.114 (#1996) 2024-02-26 23:12:54 +00:00
Jonathon Belotti
c80d5c6ffb
fix 'uv pip install' handling of gzip'd response and PEP 691 (#1978)
Thank you for writing `uv`! We're already using it internally on some
container image builds and finding that it's noticeably faster 💯

## Summary

I was attempting to use `uv` alongside [modal](https://modal.com/)'s
internal PyPi mirror and ran into some issues. The first issue was the
following error:

```
error: Failed to download: nltk==3.8.1
  Caused by: content-length header is missing from response
```

This error was coming from within
`RegistryClient::wheel_metadata_no_pep658`. By logging requests on the
client (uv) and server (internal mirror) sides I've concluded that it's
occurring because `uv` is sending a header suggesting that it can accept
a gzip'd response, but decompressing the gzip'd response strips the
`content-length` header:
https://github.com/seanmonstar/reqwest/issues/294.

**Logged request, client-side:**

```
0.981664s   0ms  INFO uv_client::registry_client JONO, REQ: Request { method: HEAD, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(172.21.0.1)), port: Some(5555), path: "/simple/joblib/joblib-1.3.2-py3-none-any.whl", query: None, fragment: None }, headers: {} }
```

No headers set explicitly by `uv`.

**Logged request, server-side:**

```
2024-02-26T03:45:08.598272Z DEBUG pypi_mirror: origin request = Request { method: HEAD, uri: /simple/joblib/joblib-1.3.2-py3-none-any.whl, version: HTTP/1.1, headers: {"accept": "*/*", "user-agent": "uv", "accept-encoding": "gzip, br", "host": "172.21.0.1:5555"}, body: Body(Empty) }
```

Server receives `"accept-encoding": "gzip, br",`. 

My change adding the header to the request fixed this issue. But our
internal mirror is just passing through PyPI responses and PyPI
responses do contain PEP 658 data, and so `wheel_metadata_no_pep658`
shouldn't execute.

The issue there is that the PyPi response field has _dashes_ not
_underscores_ (https://peps.python.org/pep-0691/).

<img width="1261" alt="image"
src="35230f27-441a-457a-827b-870a1a16c16a">

After changing the `alias` the PEP 658 codepath now runs correctly :)

## Test Plan

I tested by installing against both our mirror and against PyPi: 

```
RUST_LOG="uv=trace" UV_NO_CACHE=true UV_INDEX_URL="http://172.21.0.1:5555/simple" target/release/uv pip install -v nltk
RUST_LOG="uv=trace" UV_NO_CACHE=true UV_INDEX_URL="http://localhost:5555/simple" target/release/uv pip uninstall -v nltk
```

```
target/release/uv pip install -v nltk
target/release/uv pip uninstall -v nltk
```
2024-02-25 23:28:22 -05:00
danieleades
8d721830db
Clippy pedantic (#1963)
Address a few pedantic lints

lints are separated into separate commits so they can be reviewed
individually.

I've not added enforcement for any of these lints, but that could be
added if desirable.
2024-02-25 14:04:05 -05:00
dependabot[bot]
019e2fd1b5
Bump insta from 1.34.0 to 1.35.1 (#1942) 2024-02-23 21:00:35 +00:00
Zanie Blue
8a12b2ebf9
Ensure authentication is passed from the index url to distribution files (#1886)
Closes https://github.com/astral-sh/uv/issues/1709
Closes https://github.com/astral-sh/uv/issues/1371

Tested with the reproduction provided in #1709 which gets past the HTTP
401.

Reuses the same copying logic we introduced in
https://github.com/astral-sh/uv/pull/1874 to ensure authentication is
attached to file URLs with a realm that matches that of the index. I had
to move the authentication logic into a new crate so it could be used in
`distribution-types`.

We will want to something more robust in the future, like track all
realms with authentication in a central store and perform lookups there.
That's what `pip` does and it allows consolidation of logic like netrc
lookups. That refactor feels significant though, and I'd like to get
this fixed ASAP so this is a minimal fix.
2024-02-22 18:10:17 -06:00
Charlie Marsh
3a34918480
Add fixup for prefect<1.0.0 (#1825)
Closes https://github.com/astral-sh/uv/issues/1798.
2024-02-21 19:47:34 +00:00
Charlie Marsh
a2a1b2fb0f
Avoid enforcing URL correctness for installed distributions (#1793)
## Summary

Allows the corresponding `pypi_types` struct to use any URL, since other
installers can put those into the environment, and Poetry seems to write
invalid URLs.

If we see a distribution with an invalid URL, we just treat it as a
registry distribution, which isn't ideal, but is better than (1)
erroring, and (2) changing `Url` to `String` everywhere internally. (I'm
torn on this second option.)

Closes https://github.com/astral-sh/uv/issues/1744.

## Test Plan

- Added `flask = { git = "git@github.com:pallets/flask.git", rev =
"b90a4f1f4a370e92054b9cc9db0efcb864f87ebe" }` to
`scripts/editable-installs/poetry_editable/pyproject.toml`.
- Ran `poetry install`.
- Ran `cargo pip freeze`. Verified that it errored on `main`, but passed
here.
- Ran `cargo run pip install "flask==3.0.0"`. Verified that it
uninstalled the existing Flask, and installed a new version from the
registry.
2024-02-21 09:06:31 -05:00
Charlie Marsh
d5a2a5fed3
Add support for >dev specifier (#1776)
## Summary

Not a fan of this one but we don't have a clear policy on it yet so
feels weird to discriminate.

Closes https://github.com/astral-sh/uv/issues/1686.
2024-02-20 20:27:30 +00:00
Charlie Marsh
a5372d4e4d
Ignore invalid extras from PyPI (#1731)
## Summary

We don't control these, so it seems preferable _not_ to fail on them,
but rather, to just ignore them entirely. (I considered adding a long
allow-list, but then questioned the point of it? We'd end up having to
extend it if more invalid extras were published in the future.)

Closes https://github.com/astral-sh/uv/issues/1633.
2024-02-19 22:26:29 -05:00
Charlie Marsh
c1eb6130e1
Support MD5 hashes (#1556)
## Summary

We can add other hashes if necessary, but I don't know that they're
really used in practice.

Closes https://github.com/astral-sh/uv/issues/1547.
2024-02-17 00:25:16 +00:00
Andrew Gallant
a97c207674
pypi-types: fix lenient requirement parsing (#1529)
This fixes a bug where `uv pip install` failed to install `polars`:

```
$ uv pip install polars==0.14.0
error: Failed to download: polars==0.14.0
  Caused by: Couldn't parse metadata of polars-0.14.0-cp37-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.whl from 749022b096/polars-0.14.0-cp37-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
  Caused by: Operator >= cannot be used with a wildcard version specifier
pyarrow>=4.0.*; extra == 'pyarrow'
       ^^^^^^^
```

Since `pyarrow>=4.0.*; extra == 'pyarrow'` is invalid *and* it comes
from the metadata of a dependency (that isn't under the control of the
end user), we actually attempt to "fix" it. Namely, wildcard
dependency specifications are only allowed with `==` and `!=`, as per
the [Version Specifiers spec]. (They aren't explicitly forbidden in
these cases, but instead only have specified behavior for the `==` and
`!=` operators.)

This is all fine, but it turns out that when we fix the `>=4.0.*`
component, we also strip the quotes around `pyarrow`. (Because some
dependency specifications include stray quotes.) We fix this by making
our quote stripping a bit more selective. (We require that it appear
adjacent to a digit or a `*`.)

Note that #1477 also reports this error:

```
$ uv pip install 'requests>=2.30.*'
error: Failed to parse `requests>=2.30.*`
  Caused by: Operator >= cannot be used with a wildcard version specifier
requests>=2.30.*
```

However, we specifically keep that error message since it's something
under the end user's control. And similarly for a dependency
specification in a `requirements.txt` file.

Fixes #1477

[Version Specifiers spec]:
https://packaging.python.org/en/latest/specifications/version-specifiers/
2024-02-16 15:52:44 -05:00
Charlie Marsh
958e88ddbf
Ignore invalid extra named .none (#1428)
## Summary

Some packages erroneously include an extra named `.none`. It turns out
that certain versions of `flit` included this by accident:
https://github.com/pypa/flit/issues/228/.

This PR adds leniency for that specific extra name.

Closes https://github.com/astral-sh/uv/issues/1363.

Closes https://github.com/astral-sh/uv/issues/1399.
2024-02-16 05:01:21 +00:00
Zanie Blue
0bfce353fb
Fix broken URLs parsed from relative paths in registries (#1413)
Closes https://github.com/astral-sh/uv/issues/1388

Fixes incorrect handling of relative paths returned by indexes without
an explicit `<base>`.

`Url.join` will drop the last segment in an url e.g. `http://foo/bar` ->
`http://foo/baz` if there is not a trailing slash but what we want is
`http://foo/bar/baz`. We don't add the trailing `/` in
`base_url_join_relative` because flat indexes are `http://foo/bar.html`
and we _want_ `bar.html` to be replaced.
2024-02-15 22:37:09 -06:00
Charlie Marsh
1837641138
Add fix-up for invalid star comparison with major-only version (#1410)
## Summary

Closes https://github.com/astral-sh/uv/issues/1402.

## Test Plan

Ran `cargo run pip install junos-eznc==2.6.5`, which still fails for me,
but fails identically to `pip` (and not on the `requires-python`):

```
/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmp7mxT9L/built-wheels-v0/pypi/ncclient/0.6.13/4vvPwmDC_CL2OUXd68Zqb/ncclient-0.6.13.tar.gz/versioneer.py:421: SyntaxWarning: invalid escape sequence '\s'
  LONG_VERSION_PY['git'] = '''
Traceback (most recent call last):
  File "<string>", line 10, in <module>
  File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmplD5mMO/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 366, in prepare_metadata_for_build_wheel
    self.run_setup()
  File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmplD5mMO/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 480, in run_setup
    super().run_setup(setup_script=setup_script)
  File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmplD5mMO/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 311, in run_setup
    exec(code, locals())
  File "<string>", line 45, in <module>
  File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmp7mxT9L/built-wheels-v0/pypi/ncclient/0.6.13/4vvPwmDC_CL2OUXd68Zqb/ncclient-0.6.13.tar.gz/versioneer.py", line 1480, in get_version
    return get_versions()["version"]
           ^^^^^^^^^^^^^^
  File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmp7mxT9L/built-wheels-v0/pypi/ncclient/0.6.13/4vvPwmDC_CL2OUXd68Zqb/ncclient-0.6.13.tar.gz/versioneer.py", line 1412, in get_versions
    cfg = get_config_from_root(root)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmp7mxT9L/built-wheels-v0/pypi/ncclient/0.6.13/4vvPwmDC_CL2OUXd68Zqb/ncclient-0.6.13.tar.gz/versioneer.py", line 342, in get_config_from_root
    parser = configparser.SafeConfigParser()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'configparser' has no attribute 'SafeConfigParser'. Did you mean: 'RawConfigParser'?
```
2024-02-16 02:12:10 +00:00
Charlie Marsh
7994b68654
Add fix-up for trailing comma with trailing space (#1409)
## Summary

Closes https://github.com/astral-sh/uv/issues/1361.

## Test Plan

```text
Resolved 3 packages in 243ms
Downloaded 3 packages in 193ms
Installed 3 packages in 6ms
 + et-xmlfile==1.1.0
 + jdcal==1.4.1
 + openpyxl==3.0.5
```
2024-02-16 02:08:05 +00:00
Zanie Blue
2586f655bb
Rename to uv (#1302)
First, replace all usages in files in-place. I used my editor for this.
If someone wants to add a one-liner that'd be fun.

Then, update directory and file names:

```
# Run twice for nested directories
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
find . -type d -print0 | xargs -0 rename s/puffin/uv/g

# Update files
find . -type f -print0 | xargs -0 rename s/puffin/uv/g
```

Then add all the files again

```
# Add all the files again
git add crates
git add python/uv

# This one needs a force-add
git add -f crates/uv-trampoline
```
2024-02-15 11:19:46 -06:00
Zanie Blue
b5dd8b7de2
Track yanked versions as incompatibilities (#1290)
Moves yanked version filtering from `VersionMap::from_metadata` to the
resolver and tracks it as a PubGrub unavailable incompatibility so
yanked versions are reflected in error messages.

e.g. before
```
╰─▶ Because only albatross<=0.1.0 is available and you require albatross>0.1.0, 
       we can conclude that the requirements are unsatisfiable.
```

after

```
╰─▶ Because only the following versions of albatross are available:
            albatross<=0.1.0
            albatross==1.0.0
      and albatross==1.0.0 is unusable because it was yanked, we can conclude that albatross>0.1.0 cannot be used.
      And because you require albatross>0.1.0, we can conclude that the requirements are unsatisfiable.
```
2024-02-12 22:01:17 -06:00
Andrew Gallant
5219d37250
add initial rkyv support (#1135)
This PR adds initial support for [rkyv] to puffin. In particular,
the main aim here is to make puffin-client's `SimpleMetadata` type
possible to deserialize from a `&[u8]` without doing any copies. This
PR **stops short of actuallying doing that zero-copy deserialization**.
Instead, this PR is about adding the necessary trait impls to a variety
of types, along with a smattering of small refactorings to make rkyv
possible to use.

For those unfamiliar, rkyv works via the interplay of three traits:
`Archive`, `Serialize` and `Deserialize`. The usual flow of things is
this:

* Make a type `T` implement `Archive`, `Serialize` and `Deserialize`.
rkyv
helpfully provides `derive` macros to make this pretty painless in most
  cases.
* The process of implementing `Archive` for `T` *usually* creates an
entirely
new distinct type within the same namespace. One can refer to this type
without naming it explicitly via `Archived<T>` (where `Archived` is a
clever
  type alias defined by rkyv).
* Serialization happens from `T` to (conceptually) a `Vec<u8>`. The
serialization format is specifically designed to reflect the in-memory
layout
  of `Archived<T>`. Notably, *not* `T`. But `Archived<T>`.
* One can then get an `Archived<T>` with no copying (albeit, we will
likely
need to incur some cost for validation) from the previously created
`&[u8]`.
This is quite literally [implemented as a pointer cast][rkyv-ptr-cast].
* The problem with an `Archived<T>` is that it isn't your `T`. It's
something
  else. And while there is limited interoperability between a `T` and an
`Archived<T>`, the main issue is that the surrounding code generally
demands
a `T` and not an `Archived<T>`. **This is at the heart of the tension
for
  introducing zero-copy deserialization, and this is mostly an intrinsic
problem to the technique and not an rkyv-specific issue.** For this
reason,
  given an `Archived<T>`, one can get a `T` back via an explicit
deserialization step. This step is like any other kind of
deserialization,
although generally faster since no real "parsing" is required. But it
will
  allocate and create all necessary objects.

This PR largely proceeds by deriving the three aforementioned traits
for `SimpleMetadata`. And, of course, all of its type dependencies. But
we stop there for now.

The main issue with carrying this work forward so that rkyv is actually
used to deserialize a `SimpleMetadata` is figuring out how to deal
with `DataWithCachePolicy` inside of the cached client. Ideally, this
type would itself have rkyv support, but adding it is difficult. The
main difficulty lay in the fact that its `CachePolicy` type is opaque,
not easily constructable and is internally the tip of the iceberg of
a rat's nest of types found in more crates such as `http`. While one
"dumb"-but-annoying approach would be to fork both of those crates
and add rkyv trait impls to all necessary types, it is my belief that
this is the wrong approach. What we'd *like* to do is not just use
rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to
get an `Archived<DataWithCachePolicy>` and make actual decisions used
the archived type directly. Doing that will require some work to make
`Archived<DataWithCachePolicy>` directly useful.

My suspicion is that, after doing the above, we may want to mush
forward with a similar approach for `SimpleMetadata`. That is, we want
`Archived<SimpleMetadata>` to be as useful as possible. But right
now, the structure of the code demands an eager conversion (and thus
deserialization) into a `SimpleMetadata` and then into a `VersionMap`.
Getting rid of that eagerness is, I think, the next step after dealing
with `DataWithCachePolicy` to unlock bigger wins here.

There are many commits in this PR, but most are tiny. I still encourage
review to happen commit-by-commit.

[rkyv]: https://rkyv.org/
[rkyv-ptr-cast]:
https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68
2024-01-28 12:14:59 -05:00
Charlie Marsh
036b7e5f43
Use parse_headers rather than parsing body (#1090)
Looking at the internals, this should make almost no difference in
performance, but anyway...
2024-01-25 09:41:21 +01:00
Andrew Gallant
eebc2f340a
make some things guaranteed to be deterministic (#1065)
This PR replaces a few uses of hash maps/sets with btree maps/sets and
index maps/sets. This has the benefit of guaranteeing a deterministic
order of iteration.

I made these changes as part of looking into a flaky test.
Unfortunately, I'm not optimistic that anything here will actually fix
the flaky test, since I don't believe anything was actually dependent
on the order of iteration.
2024-01-23 20:30:33 -05:00
Charlie Marsh
96a61fb351
Remove RFC2047 decoder (#967)
## Summary

- This was inherited from
d719988323/src/metadata.rs (LL78C2-L91C26)
- ...which introduced this code here:
9cd1d43f7c
- ...with the originating issue here:
https://github.com/PyO3/maturin/issues/612
- ...and the upstream issue here:
https://github.com/staktrace/mailparse/issues/50

It seems like the goal was to support Unicode in certain header fields,
but I don't think this is necessary for us. We only use
`get_first_value` for `Requires-Python`, which has to be ASCII, doesn't
it?

In my testing, it seems like the `charset` hack can also be removed. The
tests I copied over actually work without it, which makes me a bit
skeptical.

The main benefit here is that we get to a remove a _big_ dependency
stack, including Chumsky and Stacker and psm which have limited
cross-platform support.
2024-01-18 15:09:45 -05:00
Charlie Marsh
a0420114c3
Avoid storing absolute URLs for files (#944)
## Summary

It turns out that storing an absolute URL for every file caused a
significant performance regression. This PR attempts to address the
regression with two changes.

The first is that we now store the raw string if the URL is an absolute
URL. If the URL is relative, we store the base URL alongside the raw
relative string. As such, we avoid serializing and deserializing URLs
until we need them (later on), except for the base URL.

The second is that we now use the internal `Url` crate methods for
serializing and deserializing. If you look inside `Url`, its standard
serializer and deserialization actually convert it to a string, then
parse the string. But the crate exposes some other methods for faster
serialization and deserialization (with fewer guarantees). I think this
is totally fine since the cache is entirely internal.

If we _just_ change the `Url` serialization (and no other code -- so
continue to store URLs for every file), then the regression goes down to
about 5%:

```shell
❯ python -m scripts.bench \
        --puffin-path ./target/release/main \
        --puffin-path ./target/release/relative --puffin-path ./target/release/puffin \
        scripts/requirements/home-assistant.in --benchmark resolve-warm
Benchmark 1: ./target/release/main (resolve-warm)
  Time (mean ± σ):     496.3 ms ±   4.3 ms    [User: 452.4 ms, System: 175.5 ms]
  Range (min … max):   487.3 ms … 502.4 ms    10 runs

Benchmark 2: ./target/release/relative (resolve-warm)
  Time (mean ± σ):     284.8 ms ±   2.1 ms    [User: 245.8 ms, System: 165.6 ms]
  Range (min … max):   280.3 ms … 288.0 ms    10 runs

Benchmark 3: ./target/release/puffin (resolve-warm)
  Time (mean ± σ):     300.4 ms ±   3.2 ms    [User: 255.5 ms, System: 178.1 ms]
  Range (min … max):   295.4 ms … 305.1 ms    10 runs

Summary
  './target/release/relative (resolve-warm)' ran
    1.05 ± 0.01 times faster than './target/release/puffin (resolve-warm)'
    1.74 ± 0.02 times faster than './target/release/main (resolve-warm)'
```

So I considered _just_ making that change. But 5% is kind of
borderline...

With both of these changes, the regression is down to 1-2%:

```
Benchmark 1: ./target/release/relative (resolve-warm)
  Time (mean ± σ):     282.6 ms ±   7.4 ms    [User: 244.6 ms, System: 181.3 ms]
  Range (min … max):   275.1 ms … 318.5 ms    30 runs

Benchmark 2: ./target/release/puffin (resolve-warm)
  Time (mean ± σ):     286.8 ms ±   2.2 ms    [User: 247.0 ms, System: 169.1 ms]
  Range (min … max):   282.3 ms … 290.7 ms    30 runs

Summary
  './target/release/relative (resolve-warm)' ran
    1.01 ± 0.03 times faster than './target/release/puffin (resolve-warm)'
```

It's consistently ~2%-ish, but at this point it's unclear if that's due
to the URL change or something other change between now and then.

Closes #943.
2024-01-17 09:15:21 -05:00
konsti
5ffbfadf66
Make hashes optional (#910)
There is no guarantee that indexes provide hashes at all or the sha256
we support specifically. [PEP
503](https://peps.python.org/pep-0503/#specification):

> The URL SHOULD include a hash in the form of a URL fragment with the
following syntax: #<hashname>=<hashvalue>, where <hashname> is the
lowercase name of the hash function (such as sha256) and <hashvalue> is
the hex encoded digest.

We instead use the url as input to generate a hash when caching.
2024-01-14 16:32:55 -05:00