
mirrors/uv

mirror of https://github.com/astral-sh/uv.git synced 2025-08-04 10:58:28 +00:00

Author	SHA1	Message	Date
konsti	d54e780843	Source dist metadata refactor (#468 ) ## Summary and motivation For a given source dist, we store the metadata of each wheel built through it in `built-wheel-metadata-v0/pypi/<source dist filename>/metadata.json`. During resolution, we check the cache status of the source dist. If it is fresh, we check `metadata.json` for a matching wheel. If there is one, we use that metadata; if there isn't, we build one. If the source is stale, we build a wheel and override `metadata.json` with that single wheel. This PR thereby ties the local built wheel metadata cache to the freshness of the remote source dist. This functionality is available through `SourceDistCachedBuilder`. `puffin_installer::Builder`, `puffin_installer::Downloader` and `Fetcher` are removed; instead there is now `FetchAndBuild`, which calls into the also new `SourceDistCachedBuilder`. `FetchAndBuild` is the new main high-level abstraction: it spawns parallel fetching/building; for wheel metadata it calls into the registry client, for wheel files it fetches them, and for source dists it calls `SourceDistCachedBuilder`. It handles locks around builds and, newly added, inter-process file locking for git operations. Fetching and building source distributions now happen in parallel in `pip-sync`, i.e. we don't have to wait for the largest wheel to be downloaded to start building source distributions. In a follow-up PR, I'll also clear built wheels when they've become stale. Another effect is that in a fully cached resolution, we need neither zip reading nor email parsing. 
Closes #473 ## Source dist cache structure Entries by supported sources: * `<build wheel metadata cache>/pypi/foo-1.0.0.zip/metadata.json` * `<build wheel metadata cache>/<sha256(index-url)>/foo-1.0.0.zip/metadata.json` * `<build wheel metadata cache>/url/<sha256(url)>/foo-1.0.0.zip/metadata.json` But the url filename does not need to be a valid source dist filename (<https://github.com/search?q=path%3A*%2Frequirements.txt+master.zip&type=code>), so it could also be the following and we have to take any string as filename: `<build wheel metadata cache>/url/<sha256(url)>/master.zip/metadata.json` Example: ```text # git source dist pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git # pypi source dist django_allauth==0.51.0 # url source dist werkzeug @ `ff1904eb5e/werkzeug-3.0.1.tar.gz` ``` will be stored as ```text built-wheel-metadata-v0 ├── git │ └── 5c56bc1c58c34c11 │ └── 843b753e9e8cb74e83cac55598719b39a4d5ef1f │ └── metadata.json ├── pypi │ └── django-allauth-0.51.0.tar.gz │ └── metadata.json └── url └── 6781bd6440ae72c2 └── werkzeug-3.0.1.tar.gz └── metadata.json ``` The inside of a `metadata.json`: ```json { "data": { "django_allauth-0.51.0-py3-none-any.whl": { "metadata-version": "2.1", "name": "django-allauth", "version": "0.51.0", ... } } } ```	2023-11-24 17:47:58 +00:00
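The freshness-driven lookup described in this commit can be sketched in Python as follows. This is only an illustration under stated assumptions, not puffin's Rust implementation: `lookup_or_build`, the `is_fresh` flag and the `build_wheel` callback are hypothetical names, and appending a newly built wheel to a fresh `metadata.json` is one reading of the description.

```python
import json
from pathlib import Path
from typing import Callable


def lookup_or_build(
    metadata_path: Path,
    wheel_filename: str,
    is_fresh: bool,
    build_wheel: Callable[[], dict],
) -> dict:
    """Return wheel metadata, building only when the cache cannot answer."""
    # Start from an empty document; it is only replaced by the cached one
    # when the source dist is fresh, so a stale source dist discards old entries.
    cached = {"data": {}}
    if is_fresh and metadata_path.exists():
        cached = json.loads(metadata_path.read_text())
        hit = cached["data"].get(wheel_filename)
        if hit is not None:
            return hit  # fully cached: no build needed

    # Hypothetical callback standing in for the actual wheel build.
    metadata = build_wheel()
    cached["data"][wheel_filename] = metadata
    metadata_path.parent.mkdir(parents=True, exist_ok=True)
    metadata_path.write_text(json.dumps(cached))
    return metadata
```

With a stale source dist, the written file contains only the wheel that was just built, which mirrors the "override `metadata.json` with that single wheel" behaviour.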
konsti	f0841cdb6e	Wheel metadata refactor (#462 ) A consistent cache structure for remote wheel metadata: * `<wheel metadata cache>/pypi/foo-1.0.0-py3-none-any.json` * `<wheel metadata cache>/<digest(index-url)>/foo-1.0.0-py3-none-any.json` * `<wheel metadata cache>/url/<digest(url)>/foo-1.0.0-py3-none-any.json` The source dist caching will use a similar structure (#468).	2023-11-20 17:26:36 +01:00
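As a rough sketch of the three path shapes listed above, the following Python function maps a wheel filename and its origin to a cache entry. The function names are hypothetical, and the digest (a truncated SHA-256 hex string) is an assumption, since the commit message only says `digest(...)`.

```python
import hashlib
from pathlib import Path
from typing import Optional


def digest(value: str) -> str:
    # Assumption: truncated SHA-256 hex digest; the real digest may differ.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


def wheel_metadata_path(
    cache_root: Path,
    wheel_filename: str,
    *,
    index_url: Optional[str] = None,
    url: Optional[str] = None,
) -> Path:
    """Map a remote wheel to its metadata cache entry."""
    stem = wheel_filename.removesuffix(".whl")
    if url is not None:
        bucket = Path("url") / digest(url)  # direct URL dependency
    elif index_url is not None:
        bucket = Path(digest(index_url))  # custom index
    else:
        bucket = Path("pypi")  # default index
    return cache_root / bucket / f"{stem}.json"
```

Keying custom indexes and direct URLs by a digest keeps arbitrary URLs out of directory names while still giving each origin its own bucket.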
konsti	9db6644be6	Test requirements script (#382 ) This script can compare different requirements between pip(-compile) and puffin across python versions, with debug and release builds. Examples: ```shell scripts/compare_with_pip/compare_with_pip.py scripts/compare_with_pip/compare_with_pip.py -p 3.10 scripts/compare_with_pip/compare_with_pip.py --release -p 3.9 --target 'transformers[deepspeed-testing,dev-tensorflow]' ``` It found a bunch of fixed bugs, e.g. the lack of yanked package handling and source dist handling, as well as #423, which is currently most of the output. Example output: https://gist.github.com/konstin/9ccf8dc7c2dcca737bf705429ced4892 #443 should be merged first	2023-11-17 18:26:55 +00:00
Andrew Gallant	63f7f65190	change global allocator to jemalloc (and mimalloc on Windows) (#399 ) This copies the allocator configuration used in the Ruff project. In particular, this gives us an instant 10% win when resolving the top 1K PyPI packages: $ hyperfine \ "./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null" \ "./target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null" Benchmark 1: ./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null Time (mean ± σ): 974.2 ms ± 26.4 ms [User: 17503.3 ms, System: 2205.3 ms] Range (min … max): 943.5 ms … 1015.9 ms 10 runs Benchmark 2: ./target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null Time (mean ± σ): 883.1 ms ± 23.3 ms [User: 14626.1 ms, System: 2542.2 ms] Range (min … max): 849.5 ms … 916.9 ms 10 runs Summary './target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null' ran 1.10 ± 0.04 times faster than './target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null' I was moved to do this because I noticed `malloc`/`free` taking up a fairly sizeable percentage of time during light profiling. As is becoming a pattern, it will be easier to review this commit-by-commit. Ref #396 (wouldn't call this issue fixed) ----- I did also try adding a `smallvec` optimization to the `Version::release` field, but it didn't bear any fruit. I still think there is more to explore since the results I observed don't quite line up with what I expect. (So probably either my mental model is off or my measurement process is flawed.) 
You can see that attempt with a little more explanation here: `f9528b4ecd` In the course of adding the `smallvec` optimization, I also shrunk the `Version` fields from a `usize` to a `u32`. They should at least be a fixed size integer since version numbers aren't used to index memory, and I shrunk it to `u32` since it seems reasonable to assume that all version numbers will be smaller than `2^32`.	2023-11-10 14:48:59 -05:00
konsti	5cef40d87a	Add proper caching for pypi metadata fetching kinds (#368 ) I intend this to become the main form of caching for puffin: you make HTTP requests, you transform the data into what you really need, you have control over the cache key, and the cache is always JSON (or any much faster format we want to replace it with, as long as it's serde!)	2023-11-10 11:03:40 +00:00
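The caching pattern described here (fetch, transform, store under a caller-chosen key as JSON) can be sketched in Python like this. `get_cached` and its parameters are hypothetical names chosen for illustration, and the sketch ignores freshness checks and HTTP cache headers that a real implementation would need.

```python
import json
from pathlib import Path
from typing import Any, Callable


def get_cached(
    cache_dir: Path,
    cache_key: str,
    fetch: Callable[[], Any],
    transform: Callable[[Any], Any],
) -> Any:
    """Return the transformed response, reading it from the JSON cache if present."""
    entry = cache_dir / f"{cache_key}.json"
    if entry.exists():
        return json.loads(entry.read_text())

    # Cache miss: perform the (hypothetical) request, keep only what we need.
    value = transform(fetch())
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry.write_text(json.dumps(value))
    return value
```

Because the transformed value is what gets stored, a cache hit skips both the request and the parsing step.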
konsti	91d0fdbbdf	Add script to compare with pip(-tools) (#335 ) Add a script to compare with pip-tools, and a pydantic input to compare it with. Below is the output for `pydantic.in`, created from pydantic's pyproject.toml, which I added for that purpose: ```console $ scripts/compare_with_pip.sh scripts/benchmarks/requirements/pydantic.in Finished dev [unoptimized + debuginfo] target(s) in 0.08s Running `target/debug/puffin pip-compile scripts/benchmarks/requirements/pydantic.in` Resolved 85 packages in 1.61s real 0m1,733s user 0m1,714s sys 0m0,048s real 0m10,843s user 0m4,811s sys 0m0,399s --- /tmp/tmp.Y3FzvQ2xxo/pip-compile.txt 2023-11-06 15:47:29.221834123 +0100 +++ /tmp/tmp.Y3FzvQ2xxo/puffin.txt 2023-11-06 15:47:18.377408860 +0100 @@ -31,7 +31,7 @@ mdurl==0.1.2 memray==1.10.0 mergedeep==1.3.4 -mike @ git+https://github.com/jimporter/mike.git +mike @ git+https://github.com/jimporter/mike.git@076a4af3270a448f6aeb880c9c6c2fc0d80f603f mkdocs==1.5.3 mkdocs-autorefs==0.5.0 mkdocs-embed-external-markdown==3.0.1 @@ -52,7 +52,7 @@ py-cpuinfo==9.0.0 pydantic==2.4.2 pydantic-core==2.10.1 -pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git@main +pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git@a973b7942112df731e2618336e55e3343a2e1c32 pydantic-settings==2.0.3 pyflakes==3.1.0 pygments==2.16.1 @@ -61,7 +61,7 @@ pytest==7.4.3 pytest-benchmark==4.0.0 pytest-examples==0.0.10 -pytest-memray==1.5.0 ; platform_system != "Windows" +pytest-memray==1.5.0 pytest-mock==3.12.0 pytest-pretty==1.2.0 python-dateutil==2.8.2 ```	2023-11-07 16:32:12 +01:00
Charlie Marsh	b0286a8939	Add user feedback when building source distributions in the resolver (#347 ) It looks like Cargo, notice the bold green lines at the top (which appear during the resolution, to indicate Git fetches and source distribution builds): <img width="868" alt="Screen Shot 2023-11-06 at 11 28 47 PM" src="`9647a480`-7be7-41e9-b1d3-69faefd054ae"> <img width="868" alt="Screen Shot 2023-11-06 at 11 28 51 PM" src="`6bc491aa`-5b51-4b37-9ee1-257f1bc1c049"> Closes https://github.com/astral-sh/puffin/issues/287 although we can do a lot more here.	2023-11-07 14:17:31 +00:00
konsti	c9e0f4986f	Add requirements from PDM issue (#326 )	2023-11-06 11:07:31 +00:00
konstin	1529def563	Implement mixed PEP 517 and setup.py build There are packages such as DTLSSocket 0.1.16 that say ```toml [build-system] requires = ["Cython<3", "setuptools", "wheel"] ``` In this case we need to install `requires` PEP 517 style, but then call setup.py in the legacy way. Part of making home-assistant work.	2023-10-30 19:11:52 +01:00
konsti	d47dc64974	Ignore self requirements (#233 ) gps3 0.33.3 depends on itself, which we can ignore. I've also added the home assistant requirements since it occurred when testing with this.	2023-10-30 17:13:52 +01:00
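Ignoring self requirements amounts to filtering out any requirement whose normalized name equals the package's own name. The Python sketch below assumes PEP 503 name normalization and a deliberately simplistic name extraction; the helper names are hypothetical and this is not the resolver's actual code.

```python
import re


def normalize(name: str) -> str:
    # PEP 503 normalization: runs of "-", "_" and "." become a single "-".
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement: str) -> str:
    # Simplistic: take the leading name and ignore extras, specifiers, markers.
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    return match.group(1)


def drop_self_requirements(package: str, requirements: list[str]) -> list[str]:
    """Remove requirements that point back at the package itself."""
    own = normalize(package)
    return [r for r in requirements if normalize(requirement_name(r)) != own]
```

For a package such as gps3, `drop_self_requirements("gps3", ["gps3", "requests>=2"])` keeps only the `requests` entry.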
Charlie Marsh	1c5cdcd70a	Prioritize packages in visited order (#222 )	2023-10-30 00:48:36 +00:00
Charlie Marsh	2ba85bf80e	Add PubGrub's priority queue (#221 ) Pulls in https://github.com/pubgrub-rs/pubgrub/pull/104.	2023-10-29 21:16:02 +00:00
konsti	5ad58474ca	Add script to check the top 8k pypi packages (#198 ) To check the top 1k (current state): ```bash scripts/resolve/get_pypi_top_8k.sh cargo run --bin puffin-dev -- resolve-many scripts/resolve/pypi_top_8k_flat.txt --limit 1000 ``` Results: ``` Errors: pywin32, geoip2, maxminddb, pypika, dirac Success: 995, Error: 5 ``` pywin32 has no solution for the build environment, 3 have no `[build-system]` entry in pyproject.toml, `dirac` is missing cmake	2023-10-26 12:03:59 +00:00
konsti	862c1654a0	Select most recent wheel, most recent sdist (#190 ) Select a compatible wheel for a version, even if we already found a source distribution previously. If no wheel is found, select the most recent source distribution, not the oldest compatible one. This fixes the resolution of `mst.in`, which I added.	2023-10-26 08:15:26 +00:00
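One possible reading of this selection rule, sketched in Python: prefer the most recent compatible wheel, and only if there is none fall back to the most recent source distribution. The `Candidate` type, its fields and the use of version tuples are hypothetical simplifications; the actual resolver may order and compare files differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    version: tuple[int, ...]  # simplified stand-in for a parsed version
    filename: str
    is_wheel: bool
    compatible: bool = True  # only meaningful for wheels in this sketch


def select(candidates: list[Candidate]) -> Optional[Candidate]:
    """Prefer the newest compatible wheel, else the newest source dist."""
    wheels = [c for c in candidates if c.is_wheel and c.compatible]
    if wheels:
        return max(wheels, key=lambda c: c.version)
    sdists = [c for c in candidates if not c.is_wheel]
    return max(sdists, key=lambda c: c.version, default=None)
```

Note that a wheel wins here even when a source distribution was seen earlier in the list, which is the behaviour change the commit describes.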
Charlie Marsh	21bb9c29cc	Add an additional requirements fixup (#174 ) Also checking in a variety of different requirements inputs.	2023-10-23 19:50:39 -04:00
Charlie Marsh	bd01fb490e	Remove packages when syncing (#135 ) `pip-sync` will now uninstall any packages that aren't necessary. Closes https://github.com/astral-sh/puffin/issues/128.	2023-10-19 00:14:20 -04:00
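The removal step can be pictured as a set difference between what is installed and what the requirements call for. The Python sketch below is illustrative only; `extraneous_packages` is a hypothetical name and package names are assumed to be already normalized.

```python
def extraneous_packages(installed: set[str], required: set[str]) -> list[str]:
    """Packages present in the environment but absent from the requirements."""
    return sorted(installed - required)
```

For example, with `{"flask", "requests", "six"}` installed and `{"flask", "requests"}` required, only `six` would be uninstalled.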
Charlie Marsh	e15b99b911	Rename commands to `pip-sync` and `pip-compile` (#123 ) To free up the rest of the interface.	2023-10-18 21:15:20 +00:00
konsti	8cc4fe0d44	Install source distribution requirements with puffin itself instead of pip (#122 ) This is also a lot faster. Unfortunately it copies a lot of code from the sync cli since the `Printer` is private. The first commit contains some refactorings I made when I thought about how I could reuse the existing code.	2023-10-18 19:11:17 +00:00
Charlie Marsh	471a1d657d	Migrate resolver proof-of-concept to PubGrub (#97 ) ## Summary This PR enables the proof-of-concept resolver to backtrack by way of using the `pubgrub-rs` crate. Rather than using PubGrub as a _framework_ (implementing the `DependencyProvider` trait, letting PubGrub call us), I've instead copied over PubGrub's primary solver hook (which is only ~100 lines or so) and modified it for our purposes (e.g., made it async). There's a lot to improve here, but it's a start that will let us understand PubGrub's appropriateness for this problem space. A few observations: - In simple cases, the resolver is slower than our current (naive) resolver. I think it's just that the pipelining isn't as efficient as in the naive case, where we can just stream package and version fetches concurrently without any bottlenecks. - A lot of the code here relates to bridging PubGrub with our own abstractions -- so we need a `PubGrubPackage`, a `PubGrubVersion`, etc.	2023-10-15 22:05:44 -04:00
Charlie Marsh	f03c605bde	Add a script to benchmark uninstalls (#78 )	2023-10-09 16:59:15 -04:00
Charlie Marsh	75cb7a0178	Add benchmark scripts (#69 ) Moving these out of the README and into proper scripts.	2023-10-08 23:37:38 +00:00

21 commits