mirror of
https://github.com/astral-sh/ruff.git
synced 2025-07-19 11:05:24 +00:00
![]() ## Summary This changes the caching design from one cache file per source file, to one cache file per package. This greatly reduces the amount of cache files that are opened and written, while maintaining roughly the same (combined) size as bincode is very compact. Below are some very much not scientific performance tests. It uses projects/sources to check: * small.py: single, 31 bytes Python file with 2 errors. * test.py: single, 43k Python file with 8 errors. * fastapi: FastAPI repo, 1134 files checked, 0 errors. Source | Before # files | After # files | Before size | After size -------|-------|-------|-------|------- small.py | 1 | 1 | 20 K | 20 K test.py | 1 | 1 | 60 K | 60 K fastapi | 1134 | 518 | 4.5 M | 2.3 M One question that might come up is why fastapi still has 518 cache files and not 1? That is because this is using the existing package resolution, which sees examples, docs, etc. as separate from the "main" source code (in the fastapi directory in the repo). In this future it might be worth consider switching to a one cache file per repo strategy. This new design is not perfect and does have a number of known issues. First, like the old design it doesn't remove the cache for a source file that has been (re)moved until `ruff clean` is called. Second, this currently uses a large mutex around the mutation of the package cache (e.g. inserting result). This could be (or become) a bottleneck. It's future work to test and improve this (if needed). Third, currently the packages and opened and stored in a sequential loop, this could be done parallel. This is also future work. ## Test Plan Run `ruff check` (with caching enabled) twice on any Python source code and it should produce the same results. |
||
---|---|---|
.. | ||
src | ||
tests | ||
Cargo.toml |