## Summary
I think it makes sense for `PythonVersion::default()` to return our
minimum-supported non-EOL version.
## Test Plan
`cargo test`
---------
Co-authored-by: Zanie <contact@zanie.dev>
## Summary
Enable using the new `Mode::Jupyter` for the tokenizer/parser to parse
Jupyter line magic tokens.
The individual call to the lexer i.e., `lex_starts_at` done by various
rules should consider the context of the source code (is this content
from a Jupyter Notebook?). Thus, a new field `source_type` (of type
`PySourceType`) is added to `Checker` which is being passed around as an
argument to the relevant functions. This is then used to determine the
`Mode` for the lexer.
## Test Plan
Add new test cases to make sure that the magic statement is considered
while generating the diagnostic and autofix:
* For `I001`, if there's a magic statement in between two import blocks,
they should be sorted independently
fixes: #6090
**Summary** This adds the information whether we're in a .py python
source file or in a .pyi stub file to enable people working on #5822 and
related issues.
I'm not completely happy with `Default` for something that depends on
the input.
**Test Plan** None, this is currently unused, i'm leaving this to first
implementation of stub file specific formatting.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
Right now, if we have two fixes that have an overlapping edit, but not
an _identical_ set of edits, they'll conflict, causing us to do another
linter traversal. Here, I've enabled the fixer to support partially
overlapping edits, which (as an example) let's us greatly reduce the
number of iterations required in the test suite.
The most common case here is that in which a bunch of edits need to
import some symbol, and then use that symbol, but in different ways. In
that case, all edits will have a common fix (to import the symbol), but
deviate in some way. With this change, we can do all of those edits in
one pass.
Note that the simplest way to enable this was to store sorted edits on
`Fix`. We don't allow modifying the edits on `Fix` once it's
constructed, so this is an easy change, and allows us to avoid a bunch
of clones and traversals later on.
Closes#5800.
## Summary
This PR adds a `logger-objects` setting that allows users to mark
specific symbols a `logging.Logger` objects. Currently, if a `logger` is
imported, we only flagged it as a `logging.Logger` if it comes exactly
from the `logging` module or is `flask.current_app.logger`.
This PR allows users to mark specific loggers, like
`logging_setup.logger`, to ensure that they're covered by the
`flake8-logging-format` rules and others.
For example, if you have a module `logging_setup.py` with the following
contents:
```python
import logging
logger = logging.getLogger(__name__)
```
Adding `"logging_setup.logger"` to `logger-objects` will ensure that
`logging_setup.logger` is treated as a `logging.Logger` object when
imported from other modules (e.g., `from logging_setup import logger`).
Closes https://github.com/astral-sh/ruff/issues/5694.
## Summary
Previously, `StmtIf` was defined recursively as
```rust
pub struct StmtIf {
pub range: TextRange,
pub test: Box<Expr>,
pub body: Vec<Stmt>,
pub orelse: Vec<Stmt>,
}
```
Every `elif` was represented as an `orelse` with a single `StmtIf`. This
means that this representation couldn't differentiate between
```python
if cond1:
x = 1
else:
if cond2:
x = 2
```
and
```python
if cond1:
x = 1
elif cond2:
x = 2
```
It also makes many checks harder than they need to be because we have to
recurse just to iterate over an entire if-elif-else and because we're
lacking nodes and ranges on the `elif` and `else` branches.
We change the representation to a flat
```rust
pub struct StmtIf {
pub range: TextRange,
pub test: Box<Expr>,
pub body: Vec<Stmt>,
pub elif_else_clauses: Vec<ElifElseClause>,
}
pub struct ElifElseClause {
pub range: TextRange,
pub test: Option<Expr>,
pub body: Vec<Stmt>,
}
```
where `test: Some(_)` represents an `elif` and `test: None` an else.
This representation is different tradeoff, e.g. we need to allocate the
`Vec<ElifElseClause>`, the `elif`s are now different than the `if`s
(which matters in rules where want to check both `if`s and `elif`s) and
the type system doesn't guarantee that the `test: None` else is actually
last. We're also now a bit more inconsistent since all other `else`,
those from `for`, `while` and `try`, still don't have nodes. With the
new representation some things became easier, e.g. finding the `elif`
token (we can use the start of the `ElifElseClause`) and formatting
comments for if-elif-else (no more dangling comments splitting, we only
have to insert the dangling comment after the colon manually and set
`leading_alternate_branch_comments`, everything else is taken of by
having nodes for each branch and the usual placement.rs fixups).
## Merge Plan
This PR requires coordination between the parser repo and the main ruff
repo. I've split the ruff part, into two stacked PRs which have to be
merged together (only the second one fixes all tests), the first for the
formatter to be reviewed by @michareiser and the second for the linter
to be reviewed by @charliermarsh.
* MH: Review and merge
https://github.com/astral-sh/RustPython-Parser/pull/20
* MH: Review and merge or move later in stack
https://github.com/astral-sh/RustPython-Parser/pull/21
* MH: Review and approve
https://github.com/astral-sh/RustPython-Parser/pull/22
* MH: Review and approve formatter PR
https://github.com/astral-sh/ruff/pull/5459
* CM: Review and approve linter PR
https://github.com/astral-sh/ruff/pull/5460
* Merge linter PR in formatter PR, fix ecosystem checks (ecosystem
checks can't run on the formatter PR and won't run on the linter PR, so
we need to merge them first)
* Merge https://github.com/astral-sh/RustPython-Parser/pull/22
* Create tag in the parser, update linter+formatter PR
* Merge linter+formatter PR https://github.com/astral-sh/ruff/pull/5459
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
These had just bitrotted over time -- we were no longer passing along
the row-and-column indices, etc.
## Test Plan

## Summary
`v0.0.275` in the top-right was showing `v0.0.0` at all times.
## Test Plan

## Summary
I initially wanted this category to be more general and decoupled from
the plugin, but I got some feedback that the titling felt inconsistent
with others.
## Summary
Add support for applying auto-fixes in Jupyter Notebook.
### Solution
Cell offsets are the boundaries for each cell in the concatenated source
code. They are represented using `TextSize`. It includes the start and
end offset as well, thus creating a range for each cell. These offsets
are updated using the `SourceMap` markers.
### SourceMap
`SourceMap` contains markers constructed from each edits which tracks
the original source code position to the transformed positions. The
following drawing might make it clear:

The center column where the dotted lines are present are the markers
included in the `SourceMap`. The `Notebook` looks at these markers and
updates the cell offsets after each linter loop. If you notice closely,
the destination takes into account all of the markers before it.
The index is constructed only when required as it's only used to render
the diagnostics. So, a `OnceCell` is used for this purpose. The cell
offsets, cell content and the index will be updated after each iteration
of linting in the mentioned order. The order is important here as the
content is updated as per the new offsets and index is updated as per
the new content.
## Limitations
### 1
Styling rules such as the ones in `pycodestyle` will not be applicable
everywhere in Jupyter notebook, especially at the cell boundaries. Let's
take an example where a rule suggests to have 2 blank lines before a
function and the cells contains the following code:
```python
import something
# ---
def first():
pass
def second():
pass
```
(Again, the comment is only to visualize cell boundaries.)
In the concatenated source code, the 2 blank lines will be added but it
shouldn't actually be added when we look in terms of Jupyter notebook.
It's as if the function `first` is at the start of a file.
`nbqa` solves this by recording newlines before and after running
`autopep8`, then running the tool and restoring the newlines at the end
(refer https://github.com/nbQA-dev/nbQA/pull/807).
## Test Plan
Three commands were run in order with common flags (`--select=ALL
--no-cache --isolated`) to isolate which stage the problem is occurring:
1. Only diagnostics
2. Fix with diff (`--fix --diff`)
3. Fix (`--fix`)
### https://github.com/facebookresearch/segment-anything
```
-------------------------------------------------------------------------------
Jupyter Notebooks 3 0 0 0 0
|- Markdown 3 98 0 94 4
|- Python 3 513 468 4 41
(Total) 611 468 98 45
-------------------------------------------------------------------------------
```
```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/segment-anything/**/*.ipynb --fix
...
Found 180 errors (89 fixed, 91 remaining).
```
### https://github.com/openai/openai-cookbook
```
-------------------------------------------------------------------------------
Jupyter Notebooks 65 0 0 0 0
|- Markdown 64 3475 12 2507 956
|- Python 65 9700 7362 1101 1237
(Total) 13175 7374 3608 2193
===============================================================================
```
```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/openai-cookbook/**/*.ipynb --fix
error: Failed to parse /path/to/openai-cookbook/examples/vector_databases/Using_vector_databases_for_embeddings_search.ipynb:cell 4:29:18: unexpected token '-'
...
Found 4227 errors (2165 fixed, 2062 remaining).
```
### https://github.com/tensorflow/docs
```
-------------------------------------------------------------------------------
Jupyter Notebooks 150 0 0 0 0
|- Markdown 1 55 0 46 9
|- Python 1 402 289 60 53
(Total) 457 289 106 62
-------------------------------------------------------------------------------
```
```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/tensorflow-docs/**/*.ipynb --fix
error: Failed to parse /path/to/tensorflow-docs/site/en/guide/extension_type.ipynb:cell 80:1:1: unexpected token Indent
error: Failed to parse /path/to/tensorflow-docs/site/en/r1/tutorials/eager/custom_layers.ipynb:cell 20:1:1: unexpected token Indent
error: Failed to parse /path/to/tensorflow-docs/site/en/guide/data.ipynb:cell 175:5:14: unindent does not match any outer indentation level
error: Failed to parse /path/to/tensorflow-docs/site/en/r1/tutorials/representation/unicode.ipynb:cell 30:1:1: unexpected token Indent
...
Found 12726 errors (5140 fixed, 7586 remaining).
```
### https://github.com/tensorflow/models
```
-------------------------------------------------------------------------------
Jupyter Notebooks 46 0 0 0 0
|- Markdown 1 11 0 6 5
|- Python 1 328 249 19 60
(Total) 339 249 25 65
-------------------------------------------------------------------------------
```
```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/tensorflow-models/**/*.ipynb --fix
...
Found 4856 errors (2690 fixed, 2166 remaining).
```
resolves: #1218fixes: #4556
## Summary
Add copyright notice detection to enforce the presence of copyright
headers in Python files.
Configurable settings include: the relevant regular expression, the
author name, and the minimum file size, similar to
[flake8-copyright](https://github.com/savoirfairelinux/flake8-copyright).
Closes https://github.com/charliermarsh/ruff/issues/3579
---------
Signed-off-by: ryan <ryang@waabi.ai>
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>