Commit graph

23 commits

Author SHA1 Message Date
Charlie Marsh
5849a75223
Rename ruff crate to ruff_linter (#7529) 2023-09-20 08:38:27 +02:00
Dhruv Manilawala
ee0f1270cf
Add NotebookIndex to the cache (#6863)
## Summary

This PR updates the `FileCache` to include an optional `NotebookIndex`
to support caching for Jupyter Notebooks.

We only require the index to compute the diagnostics and thus we don't
really need to store the entire `Notebook` on the `Diagnostics` struct.
This means we only need the index to be stored in the cache to
reconstruct the `Diagnostics`.

## Test Plan

Update an existing test case to run over the fixtures under
`ruff_notebook` crate where there are multiple Jupyter Notebook.

Locally, the following commands were run in order:
1. Remove the cache: `rm -rf .ruff_cache`
2. Run without cache: `cargo run --bin ruff -- check --isolated
crates/ruff_notebook/resources/test/fixtures/jupyter/unused_variable.ipynb
--no-cache`
3. Run with cache: `cargo run --bin ruff -- check --isolated
crates/ruff_notebook/resources/test/fixtures/jupyter/unused_variable.ipynb`
4. Check whether the `.ruff_cache` directory was created or not
5. Run with cache again and verify: `cargo run --bin ruff -- check
--isolated
crates/ruff_notebook/resources/test/fixtures/jupyter/unused_variable.ipynb`

## Benchmarks

https://github.com/astral-sh/ruff/pull/6863#issuecomment-1715675186

fixes: #6671
2023-09-12 18:29:03 +05:30
Charlie Marsh
afcd00da56
Create ruff_notebook crate (#7039)
## Summary

This PR moves `ruff/jupyter` into its own `ruff_notebook` crate. Beyond
the move itself, there were a few challenges:

1. `ruff_notebook` relies on the source map abstraction. I've moved the
source map into `ruff_diagnostics`, since it doesn't have any
dependencies on its own and is used alongside diagnostics.
2. `ruff_notebook` has a couple tests for end-to-end linting and
autofixing. I had to leave these tests in `ruff` itself.
3. We had code in `ruff/jupyter` that relied on Python lexing, in order
to provide a more targeted error message in the event that a user saves
a `.py` file with a `.ipynb` extension. I removed this in order to avoid
a dependency on the parser, it felt like it wasn't worth retaining just
for that dependency.

## Test Plan

`cargo test`
2023-09-01 13:56:44 +00:00
Charlie Marsh
059757a8c8
Implement Ranged on more structs (#6921)
Now that it's in `ruff_text_size`, we can use it in a few places that we
couldn't before.
2023-08-27 19:03:08 +00:00
Micha Reiser
ea72d5feba
Refactor SourceKind to store file content (#6640) 2023-08-18 13:45:38 +00:00
konsti
1df7e9831b
Replace .map_or(false, $closure) with .is_some_and(closure) (#6244)
**Summary**
[Option::is_some_and](https://doc.rust-lang.org/stable/std/option/enum.Option.html#method.is_some_and)
and
[Result::is_ok_and](https://doc.rust-lang.org/std/result/enum.Result.html#method.is_ok_and)
are new methods is rust 1.70. I find them way more readable than
`.map_or(false, ...)`.

The changes are `s/.map_or(false,/.is_some_and(/g`, then manually
switching to `is_ok_and` where the value is a Result rather than an
Option.

**Test Plan** n/a^
2023-08-01 19:29:42 +02:00
Micha Reiser
2cf00fee96
Remove parser dependency from ruff-python-ast (#6096) 2023-07-26 17:47:22 +02:00
Charlie Marsh
716cab2f19
Run rustfmt on nightly to clean up erroneous comments (#5106)
## Summary

This PR runs `rustfmt` with a few nightly options as a one-time fix to
catch some malformatted comments. I ended up just running with:

```toml
condense_wildcard_suffixes = true
edition = "2021"
max_width = 100
normalize_comments = true
normalize_doc_attributes = true
reorder_impl_items = true
unstable_features = true
use_field_init_shorthand = true
```

Since these all seem like reasonable things to fix, so may as well while
I'm here.
2023-06-15 00:19:05 +00:00
Aarni Koskela
7b4dde0c6c
Add JSON Lines (NDJSON) message serialization (#5048)
## Summary

This adds `json-lines` (https://jsonlines.org/ or http://ndjson.org/) as
an output format.

I'm sure you already know, but

* JSONL is more greppable (each record is a single line) than the pretty
JSON
* JSONL is faster to ingest piecewise (and/or in parallel) than JSON

## Test Plan

Snapshot test in the new module :)
2023-06-13 14:15:55 +00:00
Dhruv Manilawala
d8f5d2d767
Add support for auto-fix in Jupyter notebooks (#4665)
## Summary

Add support for applying auto-fixes in Jupyter Notebook.

### Solution

Cell offsets are the boundaries for each cell in the concatenated source
code. They are represented using `TextSize`. It includes the start and
end offset as well, thus creating a range for each cell. These offsets
are updated using the `SourceMap` markers.

### SourceMap

`SourceMap` contains markers constructed from each edits which tracks
the original source code position to the transformed positions. The
following drawing might make it clear:

![SourceMap visualization](3c94e591-70a7-4b57-bd32-0baa91cc7858)

The center column where the dotted lines are present are the markers
included in the `SourceMap`. The `Notebook` looks at these markers and
updates the cell offsets after each linter loop. If you notice closely,
the destination takes into account all of the markers before it.

The index is constructed only when required as it's only used to render
the diagnostics. So, a `OnceCell` is used for this purpose. The cell
offsets, cell content and the index will be updated after each iteration
of linting in the mentioned order. The order is important here as the
content is updated as per the new offsets and index is updated as per
the new content.

## Limitations

### 1

Styling rules such as the ones in `pycodestyle` will not be applicable
everywhere in Jupyter notebook, especially at the cell boundaries. Let's
take an example where a rule suggests to have 2 blank lines before a
function and the cells contains the following code:

```python
import something
# ---
def first():
	pass

def second():
	pass
```

(Again, the comment is only to visualize cell boundaries.)

In the concatenated source code, the 2 blank lines will be added but it
shouldn't actually be added when we look in terms of Jupyter notebook.
It's as if the function `first` is at the start of a file.

`nbqa` solves this by recording newlines before and after running
`autopep8`, then running the tool and restoring the newlines at the end
(refer https://github.com/nbQA-dev/nbQA/pull/807).

## Test Plan

Three commands were run in order with common flags (`--select=ALL
--no-cache --isolated`) to isolate which stage the problem is occurring:
1. Only diagnostics
2. Fix with diff (`--fix --diff`)
3. Fix (`--fix`)

### https://github.com/facebookresearch/segment-anything

```
-------------------------------------------------------------------------------
 Jupyter Notebooks       3            0            0            0            0
 |- Markdown             3           98            0           94            4
 |- Python               3          513          468            4           41
 (Total)                            611          468           98           45
-------------------------------------------------------------------------------
```

```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/segment-anything/**/*.ipynb --fix
...
Found 180 errors (89 fixed, 91 remaining).
```

### https://github.com/openai/openai-cookbook

```
-------------------------------------------------------------------------------
 Jupyter Notebooks      65            0            0            0            0
 |- Markdown            64         3475           12         2507          956
 |- Python              65         9700         7362         1101         1237
 (Total)                          13175         7374         3608         2193
===============================================================================
```

```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/openai-cookbook/**/*.ipynb --fix
error: Failed to parse /path/to/openai-cookbook/examples/vector_databases/Using_vector_databases_for_embeddings_search.ipynb:cell 4:29:18: unexpected token '-'
...
Found 4227 errors (2165 fixed, 2062 remaining).
```

### https://github.com/tensorflow/docs

```
-------------------------------------------------------------------------------
 Jupyter Notebooks     150            0            0            0            0
 |- Markdown             1           55            0           46            9
 |- Python               1          402          289           60           53
 (Total)                            457          289          106           62
-------------------------------------------------------------------------------
```

```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/tensorflow-docs/**/*.ipynb --fix
error: Failed to parse /path/to/tensorflow-docs/site/en/guide/extension_type.ipynb:cell 80:1:1: unexpected token Indent
error: Failed to parse /path/to/tensorflow-docs/site/en/r1/tutorials/eager/custom_layers.ipynb:cell 20:1:1: unexpected token Indent
error: Failed to parse /path/to/tensorflow-docs/site/en/guide/data.ipynb:cell 175:5:14: unindent does not match any outer indentation level
error: Failed to parse /path/to/tensorflow-docs/site/en/r1/tutorials/representation/unicode.ipynb:cell 30:1:1: unexpected token Indent
...
Found 12726 errors (5140 fixed, 7586 remaining).
```

### https://github.com/tensorflow/models

```
-------------------------------------------------------------------------------
 Jupyter Notebooks      46            0            0            0            0
 |- Markdown             1           11            0            6            5
 |- Python               1          328          249           19           60
 (Total)                            339          249           25           65
-------------------------------------------------------------------------------
```

```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/tensorflow-models/**/*.ipynb --fix
...
Found 4856 errors (2690 fixed, 2166 remaining).
```

resolves: #1218
fixes: #4556
2023-06-12 14:14:15 +00:00
Charlie Marsh
b030c70dda
Move unused imports rule into its own module (#4795) 2023-06-02 04:27:23 +00:00
Micha Reiser
85f094f592
Improve Message sorting performance (#4624) 2023-05-24 16:34:48 +02:00
Charlie Marsh
19c4b7bee6
Rename ruff_python_semantic's Context struct to SemanticModel (#4565) 2023-05-22 02:35:03 +00:00
Micha Reiser
ddbe5a1243
Add Fix::applicability to JSON output (#4341) 2023-05-10 14:34:53 +00:00
Micha Reiser
8969ad5879
Always generate fixes (#4239) 2023-05-10 07:06:14 +00:00
Zanie Adkins
cf7aa26aa4
Add Applicability to Fix (#4303)
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-05-10 08:42:46 +02:00
Zanie Adkins
0801f14046
Refactor Fix and Edit API (#4198) 2023-05-08 11:57:03 +02:00
Micha Reiser
cab65b25da
Replace row/column based Location with byte-offsets. (#3931) 2023-04-26 18:11:02 +00:00
Micha Reiser
e8aebee3f6
Pretty print Diagnostics in snapshot tests (#3906) 2023-04-11 09:03:00 +00:00
Micha Reiser
c33c9dc585
Introduce SourceFile to avoid cloning the message filename (#3904) 2023-04-11 08:28:55 +00:00
Micha Reiser
056c212975
Render code frame with context (#3901) 2023-04-11 10:22:11 +02:00
Micha Reiser
381203c084
Store source code on message (#3897) 2023-04-11 07:57:36 +00:00
Micha Reiser
9209e57c5a
Extract message emitters from Printer (#3895) 2023-04-11 07:24:25 +00:00