mirrors/uv

1072 commits 490 branches 198 tags 138 MiB

Author	SHA1	Message	Date
Zanie Blue	2586f655bb	Rename to `uv` (#1302) First, replace all usages in files in-place. I used my editor for this. If someone wants to add a one-liner that'd be fun. Then, update directory and file names: ``` # Run twice for nested directories find . -type d -print0 | xargs -0 rename s/puffin/uv/g find . -type d -print0 | xargs -0 rename s/puffin/uv/g # Update files find . -type f -print0 | xargs -0 rename s/puffin/uv/g ``` Then add all the files again ``` # Add all the files again git add crates git add python/uv # This one needs a force-add git add -f crates/uv-trampoline ```	2024-02-15 11:19:46 -06:00
Andrew Gallant	5219d37250	add initial rkyv support (#1135) This PR adds initial support for [rkyv] to puffin. In particular, the main aim here is to make puffin-client's `SimpleMetadata` type possible to deserialize from a `&[u8]` without doing any copies. This PR stops short of actually doing that zero-copy deserialization. Instead, this PR is about adding the necessary trait impls to a variety of types, along with a smattering of small refactorings to make rkyv possible to use. For those unfamiliar, rkyv works via the interplay of three traits: `Archive`, `Serialize` and `Deserialize`. The usual flow of things is this: * Make a type `T` implement `Archive`, `Serialize` and `Deserialize`. rkyv helpfully provides `derive` macros to make this pretty painless in most cases. * The process of implementing `Archive` for `T` usually creates an entirely new distinct type within the same namespace. One can refer to this type without naming it explicitly via `Archived<T>` (where `Archived` is a clever type alias defined by rkyv). * Serialization happens from `T` to (conceptually) a `Vec<u8>`. The serialization format is specifically designed to reflect the in-memory layout of `Archived<T>`. Notably, not `T`. But `Archived<T>`. * One can then get an `Archived<T>` with no copying (albeit, we will likely need to incur some cost for validation) from the previously created `&[u8]`. This is quite literally [implemented as a pointer cast][rkyv-ptr-cast]. * The problem with an `Archived<T>` is that it isn't your `T`. It's something else. And while there is limited interoperability between a `T` and an `Archived<T>`, the main issue is that the surrounding code generally demands a `T` and not an `Archived<T>`. This is at the heart of the tension for introducing zero-copy deserialization, and this is mostly an intrinsic problem to the technique and not an rkyv-specific issue. For this reason, given an `Archived<T>`, one can get a `T` back via an explicit deserialization step. 
This step is like any other kind of deserialization, although generally faster since no real "parsing" is required. But it will allocate and create all necessary objects. This PR largely proceeds by deriving the three aforementioned traits for `SimpleMetadata`. And, of course, all of its type dependencies. But we stop there for now. The main issue with carrying this work forward so that rkyv is actually used to deserialize a `SimpleMetadata` is figuring out how to deal with `DataWithCachePolicy` inside of the cached client. Ideally, this type would itself have rkyv support, but adding it is difficult. The main difficulty lay in the fact that its `CachePolicy` type is opaque, not easily constructable and is internally the tip of the iceberg of a rat's nest of types found in more crates such as `http`. While one "dumb"-but-annoying approach would be to fork both of those crates and add rkyv trait impls to all necessary types, it is my belief that this is the wrong approach. What we'd like to do is not just use rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to get an `Archived<DataWithCachePolicy>` and make actual decisions using the archived type directly. Doing that will require some work to make `Archived<DataWithCachePolicy>` directly useful. My suspicion is that, after doing the above, we may want to mush forward with a similar approach for `SimpleMetadata`. That is, we want `Archived<SimpleMetadata>` to be as useful as possible. But right now, the structure of the code demands an eager conversion (and thus deserialization) into a `SimpleMetadata` and then into a `VersionMap`. Getting rid of that eagerness is, I think, the next step after dealing with `DataWithCachePolicy` to unlock bigger wins here. There are many commits in this PR, but most are tiny. I still encourage review to happen commit-by-commit. [rkyv]: https://rkyv.org/ [rkyv-ptr-cast]: https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68	2024-01-28 12:14:59 -05:00
konsti	e9b6b6fa36	Implement `--find-links` as flat indexes (directories in pip-compile) (#912) Add directory `--find-links` support for local paths to pip-compile. It seems that pip joins all sources and then picks the best package. We explicitly give find links packages precedence if the same exists on an index and locally by prefilling the `VersionMap`, otherwise they are added as another index and the existing rules of precedence apply. Internally, the feature is called _flat index_, which is more meaningful than _find links_: We're not looking for links, we're picking up local directories, and (TBD) support another index format that's just a flat list of files instead of a nested index. `RegistryBuiltDist` and `RegistrySourceDist` now use `WheelFilename` and `SourceDistFilename` respectively. The `File` inside `RegistryBuiltDist` and `RegistrySourceDist` gained the ability to represent both a url and a path so that `--find-links` with a url and with a path works the same, both being locked as `<package_name>@<version>` instead of `<package_name> @ <url>`. (This is more of a detail, this PR in general still works if we strip that and have directory find links represented as `<package_name> @ file:///path/to/file.ext`) `PrioritizedDistribution` and `FlatIndex` have been moved to locations where we can use them in the upstack PR. I added a `scripts/wheels` directory with stripped down wheels to use for testing. We're lacking tests for correct tag priority precedence with flat indexes, I only confirmed this manually since it is not covered in the pip-compile or pip-sync output. Closes #876	2024-01-15 02:04:10 +00:00
konsti	4d8bfd7f61	Split source dist error type into error and kind (#872) It's a better, less redundant error type. It will come in handy when adding a second parse function.	2024-01-10 17:42:54 +00:00
Andrew Gallant	6c98ae9d77	pep440: rewrite the parser and make version comparisons cheaper (#789) This PR builds on #780 by making version parsing faster and, perhaps more importantly, making version comparisons much faster. Overall, these changes result in a considerable improvement for the `boto3.in` workload. Here's the status quo: ``` $ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in Resolved 31 packages in 34.56s real 34.579 user 34.004 sys 0.413 maxmem 2867 MB faults 0 ``` And now with this PR: ``` $ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in Resolved 31 packages in 9.20s real 9.218 user 8.919 sys 0.165 maxmem 463 MB faults 0 ``` This particular workload gets stuck in pubgrub doing resolution, and thus benefits mightily from a faster `Version::cmp` routine. With that said, this change does also help a fair bit with "normal" runs: ``` $ hyperfine -w10 \ "puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in" \ "puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in" Benchmark 1: puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in Time (mean ± σ): 337.5 ms ± 3.9 ms [User: 310.5 ms, System: 73.2 ms] Range (min … max): 333.6 ms … 343.4 ms 10 runs Benchmark 2: puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in Time (mean ± σ): 189.8 ms ± 3.0 ms [User: 168.1 ms, System: 78.4 ms] Range (min … max): 185.0 ms … 196.2 ms 15 runs Summary puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in ran 1.78 ± 0.03 times faster than puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in ``` There is perhaps some future work here 
(detailed in the commit messages), but I suspect it would be more fruitful to explore ways of making resolution itself and/or deserialization faster. Fixes #373, Closes #396	2024-01-05 11:57:32 -05:00
Charlie Marsh	5bce699ee1	Add support for HTML indexes (#719) ## Summary This PR adds support for HTML index responses (as with `--index-url=https://download.pytorch.org/whl`). Closes https://github.com/astral-sh/puffin/issues/412.	2023-12-24 16:04:00 +00:00
Zanie Blue	ef7be9103c	Parse `SimpleJson` into categorized data in the client (#522) Extends #517 with a suggestion from @konstin to parse the `SimpleJson` into an intermediate type `SimpleMetadata(BTreeMap<Version, VersionFiles>)` before converting to a `VersionMap`. This reduces the number of times we need to parse the response. Additionally, we cache the parsed response now instead of `SimpleJson`. `VersionFiles` stores two vectors with `WheelFilename`/`SourceDistFilename` and `File` tuples. These can be iterated over together or separately. A new enum `DistFilename` was added to capture the `SourceDistFilename` and `WheelFilename` variants allowing iteration over both vectors.	2023-12-07 11:04:47 -06:00
Charlie Marsh	6a15950cb5	Rename `Distribution` to `Dist` in all structs and traits (#384) We tend to avoid abbreviations, but this one is just so long and absolutely ubiquitous.	2023-11-10 14:55:11 +00:00
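
The `Parse SimpleJson` commit in the log above describes `VersionFiles` as two vectors of filename/`File` tuples, plus a `DistFilename` enum that lets both be iterated together. A minimal sketch of that shape might look like the following. The newtype stand-ins, the field names `wheels` and `source_dists`, and the method `all` are assumptions made for illustration; the real types carry parsed fields and may differ.

```rust
// Hypothetical stand-ins: the real `WheelFilename`, `SourceDistFilename`,
// and `File` hold parsed data, not a single string.
#[derive(Debug, Clone, PartialEq)]
struct WheelFilename(String);
#[derive(Debug, Clone, PartialEq)]
struct SourceDistFilename(String);
#[derive(Debug, Clone, PartialEq)]
struct File(String);

/// One enum covering both filename kinds, as the commit message describes.
#[derive(Debug, Clone, PartialEq)]
enum DistFilename {
    SourceDistFilename(SourceDistFilename),
    WheelFilename(WheelFilename),
}

/// Two separate vectors, so each kind can also be iterated on its own.
/// Field names are assumptions.
#[derive(Debug, Default)]
struct VersionFiles {
    wheels: Vec<(WheelFilename, File)>,
    source_dists: Vec<(SourceDistFilename, File)>,
}

impl VersionFiles {
    /// Iterate over wheels first, then source distributions, wrapping each
    /// filename in `DistFilename` so both vectors share one item type.
    fn all(&self) -> impl Iterator<Item = (DistFilename, &File)> + '_ {
        self.wheels
            .iter()
            .map(|(name, file)| (DistFilename::WheelFilename(name.clone()), file))
            .chain(self.source_dists.iter().map(|(name, file)| {
                (DistFilename::SourceDistFilename(name.clone()), file)
            }))
    }
}
```

Callers that only care about one kind can read `wheels` or `source_dists` directly; `all` is for code that treats every file for a version uniformly.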

Renamed from crates/distribution-filename/src/source_distribution.rs
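
The log above includes a commit that splits the source dist error type into an error and a kind. As a rough illustration of that general pattern (not the crate's actual definitions), the sketch below keeps the shared context, here the offending filename, in one struct and the specific failure in an enum, so a second parse function could reuse both. All names, variants, and the simplified parsing rule are hypothetical.

```rust
use std::fmt;

/// What went wrong. Variants are illustrative assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ParseErrorKind {
    UnknownExtension,
    MissingVersion,
}

/// Shared context stored once, with the specific failure in `kind`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ParseError {
    filename: String,
    kind: ParseErrorKind,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let reason = match self.kind {
            ParseErrorKind::UnknownExtension => "unknown extension",
            ParseErrorKind::MissingVersion => "missing version",
        };
        write!(
            f,
            "failed to parse source distribution filename `{}`: {}",
            self.filename, reason
        )
    }
}

impl std::error::Error for ParseError {}

/// Simplified parse: strip a known extension, then split `name-version`
/// at the last hyphen. Real filename parsing is more involved.
fn parse_stem(filename: &str) -> Result<(&str, &str), ParseError> {
    let err = |kind| ParseError { filename: filename.to_string(), kind };
    let stem = filename
        .strip_suffix(".tar.gz")
        .or_else(|| filename.strip_suffix(".zip"))
        .ok_or_else(|| err(ParseErrorKind::UnknownExtension))?;
    let (name, version) = stem
        .rsplit_once('-')
        .ok_or_else(|| err(ParseErrorKind::MissingVersion))?;
    if name.is_empty() || version.is_empty() {
        return Err(err(ParseErrorKind::MissingVersion));
    }
    Ok((name, version))
}
```

Because the message prefix lives in one `Display` impl, each kind only contributes its own reason, which is one way such a split can reduce redundancy.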

8 commits
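
The `--find-links` commit in the log above says flat-index packages get precedence by prefilling the `VersionMap` before index results are added. One way to picture that idea, under heavy simplification, is a map that is filled from the flat index first and then only accepts index entries for versions it does not already hold. The function name, the use of plain strings for versions and file locations, and the `BTreeMap` representation are all assumptions for illustration, not the project's actual types.

```rust
use std::collections::BTreeMap;

/// Build a version -> file-location map in which entries from the flat
/// index (local `--find-links` sources) win over index entries for the
/// same version. Versions are plain strings here, so ordering is
/// lexicographic rather than PEP 440 ordering (an assumption of this sketch).
fn merge_with_flat_precedence(
    flat: &[(&str, &str)],
    index: &[(&str, &str)],
) -> BTreeMap<String, String> {
    let mut map = BTreeMap::new();
    // Prefill with flat-index entries.
    for &(version, location) in flat {
        map.insert(version.to_string(), location.to_string());
    }
    // Index entries only fill versions that are still missing.
    for &(version, location) in index {
        map.entry(version.to_string())
            .or_insert_with(|| location.to_string());
    }
    map
}
```

With a flat entry for `1.0` and index entries for `1.0` and `2.0`, the result keeps the flat location for `1.0` and takes `2.0` from the index.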