## Summary
For formatter instabilities, the messages we get look something like
this:
```text
Unstable formatting /home/konsti/ruff/target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/slurm.py
@@ -47,9 +47,9 @@
- script_header_dict["slurm_partition_line"] = (
- NOT_YET_IMPLEMENTED_ExprJoinedStr
- )
+ script_header_dict[
+ "slurm_partition_line"
+ ] = NOT_YET_IMPLEMENTED_ExprJoinedStr
Unstable formatting /home/konsti/ruff/target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/pbs.py
@@ -26,9 +26,9 @@
- pbs_script_header_dict["select_node_line"] += (
- NOT_YET_IMPLEMENTED_ExprJoinedStr
- )
+ pbs_script_header_dict[
+ "select_node_line"
+ ] += NOT_YET_IMPLEMENTED_ExprJoinedStr
```
For ruff crashes, you don't even get that, only the file that caused the
crash. To extract the actual bug, you'd need to manually remove parts of
the file, rerun to see if the bug still occurs (and revert if it
doesn't) until you have a minimal example.
With this script, you run
```shell
cargo run --bin ruff_shrinking -- target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/slurm.py target/minirepo/code.py "Unstable formatting" "target/debug/ruff_dev format-dev --stability-check target/minirepo"
```
and get
```python
class Slurm():
    def gen_script_header(self, job):
        if resources.queue_name != "":
            script_header_dict["slurm_partition_line"] = f"#SBATCH --partition {resources.queue_name}"
```
which is a nice minimal example.
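The manual procedure described above (remove a part, rerun, revert if the bug disappears) can be sketched as a small greedy loop. This is only a line-based illustration and not how `ruff_shrinking` itself is implemented; `shrink_lines` and `still_fails` are hypothetical names, and `still_fails` stands in for "run the command and look for the marker string in its output".

```python
from typing import Callable, List


def shrink_lines(lines: List[str], still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop chunks of lines while `still_fails` keeps returning True."""
    assert still_fails(lines), "the starting input must reproduce the bug"
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        index = 0
        while index < len(lines):
            candidate = lines[:index] + lines[index + chunk:]
            if candidate and still_fails(candidate):
                # The bug survives without this chunk: keep the removal
                # and retry at the same index.
                lines = candidate
            else:
                # The bug disappeared (or nothing would be left): revert
                # by keeping `lines` as is and move on.
                index += chunk
        chunk //= 2
    return lines


# Hypothetical check: the "bug" is present whenever the line "BUG" is.
minimal = shrink_lines(["import os", "x = 1", "BUG", "y = 2"], lambda ls: "BUG" in ls)
```

A syntax-aware tool can remove whole statements or expressions instead of raw lines, which usually keeps more candidates parseable.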
I've been using this script and it would be easier for me if this were
part of main. The main disadvantage to merging is that it adds
additional dependencies.
## Test Plan
I've been using this for a number of minimizations. This is an internal
helper script you only run manually. I could add a test that minimizes a
rule violation if required.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
Comparing repos with black requires that we use the same settings as
black, notably line length and magic trailing comma behaviour. Excludes
and preserving quotes (vs. a preference for either quote style) are not
yet implemented because they weren't needed for the test projects.
In the other two commits I fixed the output when the progress bar is
hidden (this way is recommended in the indicatif docs), added a
`scratch.pyi` file to gitignore because black formats stub files
differently, and updated the ecosystem readme with the projects json
without forks.
## Test Plan
I added a `line-length` vs `line_length` test. Otherwise only my
personal usage atm; a PR to integrate the script into the CI to check
some projects will follow.
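To illustrate what the `line-length` vs `line_length` test is about: black accepts both spellings of an option key in `[tool.black]`, so a reader of those settings has to normalize them. A minimal sketch in Python, assuming the table has already been parsed into a dict; `normalize_black_options` and `line_length` are hypothetical helpers, not the names used in `ruff_dev`.

```python
from typing import Any, Dict


def normalize_black_options(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map keys such as `line-length` and `line_length` to one spelling."""
    return {key.replace("-", "_"): value for key, value in raw.items()}


def line_length(raw: Dict[str, Any], default: int = 88) -> int:
    """Look up the line length, falling back to black's default of 88."""
    return int(normalize_black_options(raw).get("line_length", default))
```

Both `line_length({"line-length": 100})` and `line_length({"line_length": 100})` then resolve to the same setting.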
## Summary
Document all `ruff_dev` subcommands and document the `format_dev` flags
in the formatter readme.
CC @zanieb please flag everything that isn't clear or missing
## Test Plan
n/a
## Summary
The similarity index, the fraction of unchanged lines, is easier to
understand than the Jaccard index, the ratio of intersection to
union.
## Test Plan
I ran this on django and got a 0.945 index, meaning 5.5% of lines are
currently reformatted when compared to black.
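The difference between the two metrics can be shown on a small example. This is a sketch using Python's `difflib`, not the code in `ruff_dev`; in particular, dividing by the number of input lines and treating the Jaccard inputs as sets of lines are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import List


def similarity_index(before: List[str], after: List[str]) -> float:
    """Fraction of the input lines that are unchanged in the output."""
    if not before:
        return 1.0
    matcher = SequenceMatcher(a=before, b=after, autojunk=False)
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return unchanged / len(before)


def jaccard_index(before: List[str], after: List[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    a, b = set(before), set(after)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


before = ["a", "b", "c", "d"]
after = ["a", "b", "x", "d"]
print(similarity_index(before, after))  # 0.75: three of four lines unchanged
print(jaccard_index(before, after))     # 0.6: 3 shared lines, 5 distinct lines
```

One changed line out of four reads directly as "25% reformatted" with the similarity index, while the Jaccard value of 0.6 has no such direct reading.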
## Summary
This PR reworks the `upstream_categories` mechanism that is only used
for documentation purposes to make it easier to generate docs using
`all_rules()`. The new implementation also relies on "tribal knowledge"
about rule codes, so it's not the best implementation, but gets us
forward.
Another option would be to change the rule-defining proc macros to allow
configuring an optional `RuleCategory`, but that seems more heavy-handed
and possibly unnecessary in the long run...
Draft since this builds on #5439.
cc @charliermarsh :)
## Summary
We now always render the icons, but very faintly if inactive, and always
right-align. This ensures consistent alignment as you scroll down the
page:
<img width="1792" alt="Screen Shot 2023-07-09 at 10 45 50 PM"
src="da47ac0e-d646-49e1-bbe1-9f43adf94bb4">
## Summary
This changes the docs to show a nursery icon (🌅) for rules in the
nursery.
It currently doesn't do that for the rules that are in sub-categories
(Pylint, Pycodestyle) because there is no `all_rules()` for the
`RuleCodePrefix` that's returned by `UpstreamCategory` iteration (and as
mentioned on Discord, I think `UpstreamCategory` maybe shouldn't be a
thing). (That would be enabled by #5591.)
## Test Plan
Generated docs to see new icons (with the caveat above).
## Summary
This extends the `ruff_dev` formatter script util. Instead of only doing
stability checks, you can now choose different compatible options on the
CLI and get statistics.
* It adds an option that formats all files that ruff would check to allow
looking at an entire black-formatted repository with `git diff`
* It computes the [Jaccard
index](https://en.wikipedia.org/wiki/Jaccard_index) as a measure of
deviation between input and output, which is useful as a single-number
metric for assessing our current deviations from black.
* It adds progress bars to both the single-project and the
multi-project mode.
* It adds an option to write the multi-project output to a file
Sample usage:
```
$ cargo run --bin ruff_dev -- format-dev --stability-check crates/ruff/resources/test/cpython
$ cargo run --bin ruff_dev -- format-dev --stability-check /home/konsti/projects/django
Syntax error in /home/konsti/projects/django/tests/test_runner_apps/tagged/tests_syntax_error.py: source contains syntax errors (parser error): BaseError { error: UnrecognizedToken(Name { name: "syntax_error" }, None), offset: 131, source_path: "<filename>" }
Found 0 stability errors in 2755 files (jaccard index 0.911) in 9.75s
$ cargo run --bin ruff_dev -- format-dev --write /home/konsti/projects/django
```
Options:
```
Several utils related to the formatter which can be run on one or more repositories. The selected set of files in a repository is the same as for `ruff check`.
* Check formatter stability: Format a repository twice and ensure that it looks that the first and second formatting look the same.
* Format: Format the files in a repository to be able to check them with `git diff`
* Statistics: The subcommand the Jaccard index between the (assumed to be black formatted) input and the ruff formatted output
Usage: ruff_dev format-dev [OPTIONS] [FILES]...
Arguments:
[FILES]...
Like `ruff check`'s files. See `--multi-project` if you want to format an ecosystem checkout
Options:
--stability-check
Check stability
We want to ensure that once formatted content stays the same when formatted again, which is known as formatter stability or formatter idempotency, and that the formatter prints syntactically valid code. As our test cases cover only a limited amount of code, this allows checking entire repositories.
--write
Format the files. Without this flag, the python files are not modified
--format <FORMAT>
Control the verbosity of the output
[default: default]
Possible values:
- minimal: Filenames only
- default: Filenames and reduced diff
- full: Full diff and invalid code
-x, --exit-first-error
Print only the first error and exit, `-x` is same as pytest
--multi-project
Checks each project inside a directory, useful e.g. if you want to check all of the ecosystem checkouts
--error-file <ERROR_FILE>
Write all errors to this file in addition to stdout. Only used in multi-project mode
```
## Test Plan
I ran this on django (2755 files, jaccard index 0.911) and discovered a
magic trailing comma problem and that we really needed to implement
import formatting. I ran the script on cpython to identify
https://github.com/astral-sh/ruff/pull/5558.
## Summary
This PR adds a separate configuration file to enable us to turn on
[Insiders-only
plugins](https://squidfunk.github.io/mkdocs-material/insiders/getting-started/#built-in-plugins).
I've turned on the `typeset` plugin which ensures that the settings on
the left-hand navigation pane render as code:
<img width="1792" alt="Screen Shot 2023-07-05 at 6 27 20 PM"
src="c93676dd-bb48-417a-9d3b-528bf001e9b7">
## Summary
This makes the output of `check-formatter-stability` more concise by
removing extraneous newlines. It also adds a `--error-file` option to
that script that allows creating a file with just the errors (without
the status messages) to share with others.
## Test Plan
I ran it over CPython and looked at the output. I then added the
`--error-file` option and looked at the contents of the file
## Summary
As discussed on ~IRC~ Discord, this will make it easier for e.g. the
docs generation stuff to get all rules for a linter (using
`all_rules()`) instead of just non-nursery ones, and it also makes it
more Explicit Is Better Than Implicit to iterate over linter rules.
Grepping for `Item = Rule` reveals some remaining implicit
`IntoIterator`s that I didn't feel were necessarily in scope for this
(and honestly, iterating over a `RuleSet` makes sense).
Support for `let…else` formatting was just merged to nightly
(rust-lang/rust#113225). Rerun `cargo fmt` with Rust nightly 2023-07-02
to pick this up. Followup to #939.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
## Summary
This PR uses rayon to parallelize the stability check by scheduling each project as its own task.
## Test Plan
I ran the ecosystem check. It now makes use of all cores (except at the end, there are some large projects).
## Performance
The check now completes in minutes where it took about 30 minutes before.
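The scheduling idea, one task per project handed to a worker pool, can be sketched in Python with `concurrent.futures` as a rough analogue of the rayon-based change. `check_projects` and `check_one` are hypothetical names; in Python, threads only help here if `check_one` spends its time outside the interpreter, for example in a subprocess.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def check_projects(projects: Iterable[str], check_one: Callable[[str], int]) -> Dict[str, int]:
    """Run `check_one` for every project on a worker pool, one task per project."""
    names = list(projects)
    with ThreadPoolExecutor() as pool:
        # `map` keeps the input order, so results line up with `names`.
        results = pool.map(check_one, names)
        return dict(zip(names, results))
```

Because each project is a single task, a few very large projects can still dominate the tail of the run, which matches the observation in the test plan.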
`ruff_dev repeat` recently broke (I think with the cargo update?):
> thread 'main' panicked at 'Command repeat: Short option names must be
unique for each argument, but '-n' is in use by both 'no_cache' and
'repeat''
This fixes this by removing the short argument.
## Summary
Experimental release for Jupyter Notebook integration.
Currently, this requires a user to explicitly opt-in using the
[include](https://beta.ruff.rs/docs/settings/#include) configuration:
```toml
[tool.ruff]
include = ["*.py", "*.pyi", "**/pyproject.toml", "*.ipynb"]
```
Or, a user can pass in the file directly:
```sh
ruff check path/to/notebook.ipynb
```
For known limitations, please refer to #5188.
## Test Plan
The following command should work without the `--all-features` flag:
```sh
cargo dev round-trip /path/to/notebook.ipynb
```
The following command should work with the above config file along with
`select = ["ALL"]`:
```sh
cargo run --bin ruff -- check --no-cache --config=../test-repos/openai-cookbook/pyproject.toml --fix ../test-repos/openai-cookbook/
```
Passing the Jupyter notebook directly:
```sh
cargo run --bin ruff -- check --no-cache --isolated --select=ALL --fix ../test-repos/openai-cookbook/examples/Classification_using_embeddings.ipynb
```
## Summary
This PR adds a new `PyFormatOptions` struct that stores the python formatter options.
The new options aren't used yet, with the exception of magical trailing commas and the options passed to the printer.
I'll follow up with more PRs that use the new options (e.g. `QuoteStyle`).
## Test Plan
`cargo test`. I'll follow up with a new PR that adds support for overriding the options in our fixture tests.
## Summary
This contains three changes:
* repos in `check_ecosystem.py` are stored as `org:name` instead of
`org/name` to create a flat directory layout
* `check_ecosystem.py` runs at most 50 jobs in parallel to avoid
consuming too much RAM
* `check-formatter-stability` gets a new option `--multi-project` so
it's possible to do `cargo run --bin ruff_dev --
check-formatter-stability --multi-project target/checkouts`
With these three changes it becomes easy to check the formatter
stability over a larger number of repositories. This is part of
integrating formatter regression checks into the ecosystem checks.
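The first two changes can be sketched as follows. This is an illustration under assumptions, not the code in `check_ecosystem.py`: `checkout_dir` and `run_bounded` are hypothetical names, and a semaphore is one common way to cap concurrent jobs in asyncio.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable, List


def checkout_dir(root: Path, org: str, name: str) -> Path:
    """Flat layout: `root/org:name` instead of the nested `root/org/name`."""
    return root / f"{org}:{name}"


async def run_bounded(jobs: List[Callable[[], Awaitable[int]]], limit: int = 50) -> List[int]:
    """Run all jobs, but never more than `limit` of them at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job: Callable[[], Awaitable[int]]) -> int:
        async with semaphore:
            return await job()

    return list(await asyncio.gather(*(guarded(job) for job in jobs)))
```

With the flat layout, every direct child of the checkouts directory is exactly one project, which is what a `--multi-project` style option can iterate over.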
## Test Plan
```shell
python scripts/check_ecosystem.py --checkouts target/checkouts --projects github_search.jsonl -v $(which true) $(which true)
cargo run --bin ruff_dev -- check-formatter-stability --multi-project target/checkouts
```
## Summary
We want to ensure that once formatted content stays the same when
formatted again, which is known as formatter stability or formatter
idempotency, and that the formatter prints syntactically valid code. As
our test cases cover only a limited amount of code, this allows checking
entire repositories.
This adds a new subcommand to `ruff_dev` which can be invoked as `cargo
run --bin ruff_dev -- check-formatter-stability <repo>`. While initially
only intended to check stability, it has also found cases where the
formatter printed invalid syntax or panicked.
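The check itself is simple to state. A minimal sketch in Python, assuming a hypothetical `format_source` callable that stands in for the formatter; the real subcommand is implemented in Rust inside `ruff_dev`, and the error strings here are made up for illustration.

```python
import ast
from typing import Callable, Optional


def check_format(source: str, format_source: Callable[[str], str]) -> Optional[str]:
    """Return None if formatting is valid and stable, else a short description."""
    first = format_source(source)
    try:
        # The formatted output must still be syntactically valid Python.
        ast.parse(first)
    except SyntaxError as err:
        return f"invalid syntax after formatting: {err.msg}"
    # Formatting already formatted code must not change it again.
    second = format_source(first)
    if second != first:
        return "unstable formatting"
    return None
```

For example, a toy formatter such as `lambda s: s.strip() + "\n"` passes, while `lambda s: s + "\n"` is reported as unstable because every pass appends another newline.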
## Test Plan
Running this on cpython is already identifying bugs
(https://github.com/astral-sh/ruff/pull/5089)
## Summary
This adds a new subcommand that can be used as
```shell
cargo build --bin ruff_dev --profile=release-debug
perf record -g -F 999 target/release-debug/ruff_dev repeat --repeat 30 --exit-zero --no-cache path/to/cpython > /dev/null
flamegraph --perfdata perf.data
```
## Test Plan
This is a ruff internal script. I successfully used it to profile
cpython with the instructions above
This tackles three problems:
* pre-commit was slow because it ran cargo commands
* Improve the clarity on what you need to run to get your PR to pass on
CI (and make those commands fast)
* You had to compile and run `cargo dev generate-all` separately, which
was slow
The first change is to remove all cargo commands except running ruff
itself from pre-commit. With `cargo run --bin ruff` already compiled it
takes about 7s on my machine. It would make sense to also use the ruff
pre-commit action here even if we're then lagging a release behind for
checking ruff on ruff.
The contributing guide is now clear about what you need to run:
```shell
cargo clippy --workspace --all-targets --all-features -- -D warnings # Linting...
RUFF_UPDATE_SCHEMA=1 cargo test # Testing and updating ruff.schema.json
pre-commit run --all-files # rust and python formatting, markdown and python linting, etc.
```
Example timings from my machine:
`cargo clippy --workspace --all-targets --all-features -- -D warnings`:
23s
`RUFF_UPDATE_SCHEMA=1 cargo test`: 2min (recompiling), 1min (no code
changes, this is mainly doc tests)
`pre-commit run --all-files`: 7s
The exact numbers don't matter so much as the approximate experience (6s
is easier to just wait out than 1min, esp. if you need to fix and rerun).
The biggest remaining block seems to be doc tests; I'm surprised I didn't
find any solution to speeding them up (nextest simply doesn't run them
at all). Also note that the formatter has its own tests which are much
faster since they avoid linking ruff (`cargo test
ruff_python_formatter`).
The third change is to enable `cargo test` to update the schema. Similar
to `INSTA_UPDATE=always`, I've added `RUFF_UPDATE_SCHEMA=1` (name open
to bikeshedding), so `RUFF_UPDATE_SCHEMA=1 cargo test` updates the
schema, while `cargo test` still fails as expected if the repo isn't
up-to-date.
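The update-or-fail pattern behind `RUFF_UPDATE_SCHEMA=1` can be sketched like this. It is an illustration in Python, not the Rust test in the repository; `check_generated_file` is a hypothetical helper and only the environment variable name comes from the description above.

```python
import os
from pathlib import Path


def check_generated_file(path: Path, expected: str, update_var: str = "RUFF_UPDATE_SCHEMA") -> None:
    """Pass if `path` matches `expected`; rewrite it only when `update_var` is 1."""
    current = path.read_text() if path.exists() else None
    if current == expected:
        return
    if os.environ.get(update_var) == "1":
        # Opt-in update mode: regenerate the file instead of failing.
        path.write_text(expected)
        return
    raise AssertionError(f"{path} is out of date; rerun with {update_var}=1 to update it")
```

The default stays strict, so CI still catches a stale file, while a contributor can regenerate it in the same command that runs the tests.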
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
This PR moves `Diagnostic`, `DiagnosticKind`, and `Fix` into their own crate, which will enable us to further split up Ruff, since sub-linter crates (which need to implement functions that return `Diagnostic`) can now depend on `ruff_diagnostics` rather than Ruff.
This PR productionizes @MichaReiser's suggestion in https://github.com/charliermarsh/ruff/issues/1820#issuecomment-1440204423, by creating a separate crate for the `ast` module (`ruff_python_ast`). This will enable us to further split up the `ruff` crate, as we'll be able to create (e.g.) separate sub-linter crates that have access to these common AST utilities.
This was mostly a straightforward copy (with adjustments to module imports), as the few dependencies that _did_ require modifications were handled in #3366, #3367, and #3368.