<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
I decided to disable the new
[`needless_continue`](https://rust-lang.github.io/rust-clippy/master/index.html#needless_continue)
rule because I often found the explicit `continue` more readable over an
empty block or having to invert the condition of an other branch.
## Test Plan
`cargo test`
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This PR adds a fallback logic for `is_python_notebook` to check the
`kernelspec.language` field.
Reference implementation in VS Code:
1c31e75898/extensions/ipynb/src/deserializers.ts (L20-L22)
It's also required for the kernel to provide the `language` they're
implementing based on
https://jupyter-client.readthedocs.io/en/stable/kernels.html#kernel-specs
reference although that's for the `kernel.json` file but is also
included in the notebook metadata.
Closes: #12281
## Test Plan
Add a test case for `is_python_notebook` and include the test notebook
for round trip validation.
The test notebook contains two cells, one is JavaScript (denoted via the
`vscode.languageId` metadata) and the other is Python (no metadata). The
notebook metadata only contains `kernelspec` and the `language_info` is
absent.
I also verified that this is a valid notebook by opening it in Jupyter
Lab, VS Code and using `nbformat` validator.
## Summary
This PR adds support for VS Code specific cell metadata to consider when
collecting valid code cells.
For context, Ruff only runs on valid code cells. These are the code
cells that doesn't contain cell magics. Previously, Ruff only used the
notebook's metadata to determine whether it's a Python notebook. But, in
VS Code, a notebook's preferred language might be Python but it could
still contain code cells for other languages. This can be determined
with the `metadata.vscode.languageId` field.
### References:
* https://code.visualstudio.com/docs/languages/identifiers
* e6c009a3d4/extensions/ipynb/src/serializers.ts (L104-L107)
*
e6c009a3d4/extensions/ipynb/src/serializers.ts (L117-L122)
This brings us one step closer to fixing #12281.
## Test Plan
Add test cases for `is_valid_python_code_cell` and an integration test
case which showcase running it end to end. The test notebook contains a
JavaScript code cell and a Python code cell.
## Summary
Closes#11914.
This PR introduces a snapshot test that replays the LSP requests made
during a document formatting request, and confirms that the notebook
document is updated in the expected way.
## Summary
Closes https://github.com/astral-sh/ruff/issues/10858.
`ruff server` now supports `*.ipynb` (aka Jupyter Notebook) files.
Extensive internal changes have been made to facilitate this, which I've
done some work to contextualize with documentation and an pre-review
that highlights notable sections of the code.
`*.ipynb` cells should behave similarly to `*.py` documents, with one
major exception. The format command `ruff.applyFormat` will only apply
to the currently selected notebook cell - if you want to format an
entire notebook document, use `Format Notebook` from the VS Code context
menu.
## Test Plan
The VS Code extension does not yet have Jupyter Notebook support
enabled, so you'll first need to enable it manually. To do this,
checkout the `pre-release` branch and modify `src/common/server.ts` as
follows:
Before:

After:

I recommend testing this PR with large, complicated notebook files. I
used notebook files from [this popular
repository](https://github.com/jakevdp/PythonDataScienceHandbook/tree/master/notebooks)
in my preliminary testing.
The main thing to test is ensuring that notebook cells behave the same
as Python documents, besides the aforementioned issue with
`ruff.applyFormat`. You should also test adding and deleting cells (in
particular, deleting all the code cells and ensure that doesn't break
anything), changing the kind of a cell (i.e. from markup -> code or vice
versa), and creating a new notebook file from scratch. Finally, you
should also test that source actions work as expected (and across the
entire notebook).
Note: `ruff.applyAutofix` and `ruff.applyOrganizeImports` are currently
broken for notebook files, and I suspect it has something to do with
https://github.com/astral-sh/ruff/issues/11248. Once this is fixed, I
will update the test plan accordingly.
---------
Co-authored-by: nolan <nolan.king90@gmail.com>
## Summary
I used `cargo-shear` (see
[tweet](https://twitter.com/boshen_c/status/1770106165923586395)) to
remove some unused dependencies that `cargo udeps` wasn't reporting.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test`
## Summary
I used `codespell` and `gramma` to identify mispellings and grammar
errors throughout the codebase and fixed them. I tried not to make any
controversial changes, but feel free to revert as you see fit.
## Summary
Given a statement like `colors = 6`, we currently treat the cell as an
automagic (since `colors` is an automagic) -- i.e., we assume it's
equivalent to `%colors = 6`. This PR adds some additional detection
whereby if the statement is an _assignment_, we avoid treating it as
such. I audited the list of automagics, and I believe this is safe for
all of them.
Closes https://github.com/astral-sh/ruff/issues/8526.
Closes https://github.com/astral-sh/ruff/issues/9648.
## Test Plan
`cargo test`
When formatting notebooks, we populate the `id` field for cells that do
not have one. Previously, we generated a UUID v4 which resulted in
non-deterministic formatting. Here, we generate the UUID from a seeded
random number generator instead of using true randomness. For example,
here are the first five ids it would generate:
```
7fb27b94-1602-401d-9154-2211134fc71a
acae54e3-7e7d-407b-bb7b-55eff062a284
9a63283c-baf0-4dbc-ab1f-6479b197f3a8
8dd0d809-2fe7-4a7c-9628-1538738b07e2
72eea511-9410-473a-a328-ad9291626812
```
We also add a check that an id is not present in another cell to prevent
accidental introduction of duplicate ids.
The specification is lax, and we could just use incrementing integers
e.g. `0`, `1`, ... but I have a minor preference for retaining the UUID
format. Some discussion
[here](https://github.com/astral-sh/ruff/pull/9359#discussion_r1439607121)
— I'm happy to go either way though.
Discovered via #9293
## Summary
This PR modifies our `Cargo.toml` files to use workspace dependencies
for _all_ dependencies, rather than the status quo of sporadically
trying to use workspace dependencies for those dependencies that are
used across multiple crates. I find the current situation more confusing
and harder to manage, since we have a mix of workspace and crate-local
dependencies, whereas this setup consistently uses the same approach for
all dependencies.
The example below used to panic because we tried to split at 2 bytes in
the 4-bytes character `转`.
```python
def sample_func(xx):
"""
转置 (transpose)
"""
return xx.T
```
Fixes#9145
Fixes https://github.com/astral-sh/ruff-vscode/issues/362
The second commit is a small test refactoring.
## Summary
This PR updates the logic for `is_magic_cell` to include certain cell
magics. These cell magics would contain Python code following the line
defining the command. The code could define a variable which can then be
referenced in other cells. Currently, we would ignore the cell
completely leading to undefined-name violation.
As discussed in
https://github.com/astral-sh/ruff/issues/8354#issuecomment-1832221009
## Test Plan
Add new test case to validate this scenario.
## Summary
This PR updates the `E402` rule to work at cell level for Jupyter
notebooks. This is enabled only in preview to gather feedback.
The implementation basically resets the import boundary flag on the
semantic model when we encounter the first statement in a cell.
Another potential solution is to introduce `E403` rule that is
specifically for notebooks that works at cell level while `E402` will be
disabled for notebooks.
## Test Plan
Add a notebook with imports in multiple cells and verify that the rule
works as expected.
resolves: #8669
## Summary
This PR updates `B015` and `B018` to ignore last top-level expressions
in each cell of a Jupyter Notebook.
Part of #8669
## Test Plan
Add test cases for both rules and update the snapshots.
Update to [Rust
1.74](https://blog.rust-lang.org/2023/11/16/Rust-1.74.0.html) and use
the new clippy lints table.
The update itself introduced a new clippy lint about superfluous hashes
in raw strings, which got removed.
I moved our lint config from `rustflags` to the newly stabilized
[workspace.lints](https://doc.rust-lang.org/stable/cargo/reference/workspaces.html#the-lints-table).
One consequence is that we have to `unsafe_code = "warn"` instead of
"forbid" because the latter now actually bans unsafe code:
```
error[E0453]: allow(unsafe_code) incompatible with previous forbid
--> crates/ruff_source_file/src/newlines.rs:62:17
|
62 | #[allow(unsafe_code)]
| ^^^^^^^^^^^ overruled by previous forbid
|
= note: `forbid` lint level was set on command line
```
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
LangChain is attempting to use Ruff over their Jupyter notebooks
(https://github.com/langchain-ai/langchain/pull/12677/files), but
running into a bunch of syntax errors, the majority of which come from
our inability to recognize automagic.
If you run this in a cell:
```jupyter
pip install requests
```
Jupyter will automatically treat that as:
```jupyter
%pip install requests
```
We need to ignore cells that use these automagics, since the parser
doesn't understand them. (I guess we could support it in the parser, but
that seems much harder?). The good news is that AFAICT Jupyter doesn't
let you mix automagics with code, so by skipping these cells, we don't
miss out on analyzing any Python code.
## Test Plan
1. `cargo test`
2. Ran over LangChain and verified that there are no more errors
relating to `pip install` automagics.
## Summary
This PR adds a new `cell` field to the JSON output format which
indicates the Notebook cell this diagnostic (and fix) belongs to. It
also updates the location for the diagnostic and fixes as per the
`NotebookIndex`. It will be used in the VSCode extension to display the
diagnostic in the correct cell.
The diagnostic and edit start and end source locations are translated
for the notebook as per the `NotebookIndex`. The end source location for
an edit needs some special handling.
### Edit end location
To understand this, the following context is required:
1. Visible lines in Jupyter Notebook vs JSON array strings: The newline
is part of the string in the JSON format. This means that if there are 3
visible lines in a cell where the last line is empty then the JSON would
contain 2 strings in the source array, both ending with a newline:
**JSON format:**
```json
[
"# first line\n",
"# second line\n",
]
```
**Notebook view:**
```python
1 # first line
2 # second line
3
```
2. If an edit needs to remove an entire line including the newline, then
the end location would be the start of the next row.
To remove a statement in the following code:
```python
import os
```
The edit would be:
```
start: row 1, col 1
end: row 2, col 1
```
Now, here's where the problem lies. The notebook index doesn't have any
information for row 2 because it doesn't exists in the actual notebook.
The newline was added by Ruff to concatenate the source code and it's
removed before writing back. But, the edit is computed looking at that
newline.
This means that while translating the end location for an edit belong to
a Notebook, we need to check if both the start and end location belongs
to the same cell. If not, then the end location should be the first
character of the next row and if so, translate that back to the last
character of the previous row. Taking the above example, the translated
location for Notebook would be:
```
start: row 1, col 1
end: row 1, col 10
```
## Test Plan
Add test cases for notebook output in the JSON format and update
existing snapshots.
## Summary
This PR refactors the `NotebookIndex` struct to use `OneIndexed` to make
the
intent of the code clearer.
## Test Plan
Update the existing test case and run `cargo test` to verify the change.
- [x] Verify `--diff` output
- [x] Verify the diagnostics output
- [x] Verify `--show-source` output
## Summary
When writing back notebooks via `stdout`, we need to write back the
entire JSON content, not _just_ the fixed source code. Otherwise,
writing the output _back_ to the file will yield an invalid notebook.
Closes https://github.com/astral-sh/ruff/issues/7747
## Test Plan
`cargo test`
## Summary
This PR fixes the bug where the `NotebookIndex` was not being computed
when
using stdin as the input source.
## Test Plan
On `main`, the diagnostic output won't include the cell number when
using stdin
while it'll be included after this fix.
### `main`
```console
$ cat ~/playground/ruff/notebooks/test.ipynb | cargo run --bin ruff -- check --isolated --no-cache - --stdin-filename ~/playground/ruff/notebooks/test.ipynb
/Users/dhruv/playground/ruff/notebooks/test.ipynb:2:8: F401 [*] `math` imported but unused
/Users/dhruv/playground/ruff/notebooks/test.ipynb:7:8: F811 Redefinition of unused `random` from line 1
/Users/dhruv/playground/ruff/notebooks/test.ipynb:8:8: F401 [*] `pprint` imported but unused
/Users/dhruv/playground/ruff/notebooks/test.ipynb:12:4: F632 [*] Use `==` to compare constant literals
/Users/dhruv/playground/ruff/notebooks/test.ipynb:13:38: F632 [*] Use `==` to compare constant literals
Found 5 errors.
[*] 4 potentially fixable with the --fix option.
```
### `dhruv/notebook-index-stdin`
```console
$ cat ~/playground/ruff/notebooks/test.ipynb | cargo run --bin ruff -- check --isolated --no-cache - --stdin-filename ~/playground/ruff/notebooks/test.ipynb
/Users/dhruv/playground/ruff/notebooks/test.ipynb:cell 3:2:8: F401 [*] `math` imported but unused
/Users/dhruv/playground/ruff/notebooks/test.ipynb:cell 5:1:8: F811 Redefinition of unused `random` from line 1
/Users/dhruv/playground/ruff/notebooks/test.ipynb:cell 5:2:8: F401 [*] `pprint` imported but unused
/Users/dhruv/playground/ruff/notebooks/test.ipynb:cell 6:2:4: F632 [*] Use `==` to compare constant literals
/Users/dhruv/playground/ruff/notebooks/test.ipynb:cell 6:3:38: F632 [*] Use `==` to compare constant literals
Found 5 errors.
[*] 4 potentially fixable with the --fix option.
```
## Summary
This PR bumps the pyproject-toml crate to 0.7.0. The only difference is that it now depends on indexmap 2. I reviewed the indexmap 2 changes and they don't seem relevant to us.
I used this opportunity to remove the default features from `serde_with` which removes our indexmap 1 dependency (and some other unused dependencies)
## Test Plan
`cargo test`
## Summary
This PR updates the `FileCache` to include an optional `NotebookIndex`
to support caching for Jupyter Notebooks.
We only require the index to compute the diagnostics and thus we don't
really need to store the entire `Notebook` on the `Diagnostics` struct.
This means we only need the index to be stored in the cache to
reconstruct the `Diagnostics`.
## Test Plan
Update an existing test case to run over the fixtures under
`ruff_notebook` crate where there are multiple Jupyter Notebook.
Locally, the following commands were run in order:
1. Remove the cache: `rm -rf .ruff_cache`
2. Run without cache: `cargo run --bin ruff -- check --isolated
crates/ruff_notebook/resources/test/fixtures/jupyter/unused_variable.ipynb
--no-cache`
3. Run with cache: `cargo run --bin ruff -- check --isolated
crates/ruff_notebook/resources/test/fixtures/jupyter/unused_variable.ipynb`
4. Check whether the `.ruff_cache` directory was created or not
5. Run with cache again and verify: `cargo run --bin ruff -- check
--isolated
crates/ruff_notebook/resources/test/fixtures/jupyter/unused_variable.ipynb`
## Benchmarks
https://github.com/astral-sh/ruff/pull/6863#issuecomment-1715675186fixes: #6671
## Summary
This PR moves `ruff/jupyter` into its own `ruff_notebook` crate. Beyond
the move itself, there were a few challenges:
1. `ruff_notebook` relies on the source map abstraction. I've moved the
source map into `ruff_diagnostics`, since it doesn't have any
dependencies on its own and is used alongside diagnostics.
2. `ruff_notebook` has a couple tests for end-to-end linting and
autofixing. I had to leave these tests in `ruff` itself.
3. We had code in `ruff/jupyter` that relied on Python lexing, in order
to provide a more targeted error message in the event that a user saves
a `.py` file with a `.ipynb` extension. I removed this in order to avoid
a dependency on the parser, it felt like it wasn't worth retaining just
for that dependency.
## Test Plan
`cargo test`