## Summary
Adds a rule to detect unions that include `typing.NoReturn` or
`typing.Never`. In such cases, the use of the bottom type is redundant.
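As a rough illustration of why the bottom type adds nothing to a union, the sketch below inspects a runtime annotation with the standard `typing` helpers. The `redundant_bottom_members` function is a hypothetical name used only for this example; it is not part of Ruff, which performs the check statically on the AST.

```python
import typing
from typing import NoReturn, Union, get_args, get_origin

# `typing.Never` only exists on Python 3.11+, so fall back to `NoReturn`.
BOTTOM_TYPES = (NoReturn, getattr(typing, "Never", NoReturn))


def redundant_bottom_members(annotation):
    """Return the bottom-type members of a `Union` annotation, if any."""
    if get_origin(annotation) is not Union:
        return []
    return [arg for arg in get_args(annotation) if arg in BOTTOM_TYPES]


# `Union[int, NoReturn]` accepts exactly the same values as `int`.
flagged = redundant_bottom_members(Union[int, NoReturn])
```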
Closes https://github.com/astral-sh/ruff/issues/9113.
## Test Plan
`cargo test`
## Summary
Given a function like:
```python
def func(x: int):
    if not x:
        raise ValueError
    else:
        raise TypeError
```
We now correctly use `NoReturn` as the return type, rather than `None`.
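The idea can be sketched with Python's own `ast` module. This is a simplified, conservative approximation under my own assumptions (it only understands `raise`, `return`, and `if`/`else`), and `always_raises` is a hypothetical helper, not Ruff's implementation.

```python
import ast


def always_raises(body):
    """Return True if every path through `body` is known to end in a `raise`."""
    for stmt in body:
        if isinstance(stmt, ast.Raise):
            return True
        if isinstance(stmt, ast.If) and stmt.orelse:
            if always_raises(stmt.body) and always_raises(stmt.orelse):
                return True
        # Be conservative: any `return` inside this statement means the
        # function may return normally.
        if any(isinstance(node, ast.Return) for node in ast.walk(stmt)):
            return False
    return False


SOURCE = '''
def func(x: int):
    if not x:
        raise ValueError
    else:
        raise TypeError
'''

func = ast.parse(SOURCE).body[0]
annotation = "NoReturn" if always_raises(func.body) else "None"
```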
Closes https://github.com/astral-sh/ruff/issues/9201.
This PR adds an `as_slice` method to all the string nodes which returns
all the parts of the node as a slice. This will be useful in the next
PR, which splits the string formatting, to extract the _single node_ or
_implicitly concatenated nodes_.
## Summary
This PR introduces a new `StringLike` enum which is a narrow type to
indicate string-like nodes. These include the string literals, bytes
literals, and the literal parts of f-strings.
The main motivation behind this is to avoid repetition of rule calls
in the AST checker. We add a new `analyze::string_like` function which
takes in the enum and calls all the respective rule functions which
expect at least 2 of the variants of this enum.
I'm open to discarding this if others think it's not that useful at this
stage as currently only 3 rules require these nodes.
As suggested
[here](https://github.com/astral-sh/ruff/pull/8835#discussion_r1414746934)
and
[here](https://github.com/astral-sh/ruff/pull/8835#discussion_r1414750204).
## Test Plan
`cargo test`
Rebase of #6365 authored by @davidszotten.
## Summary
This PR updates the AST structure for f-string elements.
The main **motivation** behind this change is to have a dedicated node
for the string part of an f-string. Previously, the existing
`ExprStringLiteral` node was used for this purpose which isn't exactly
correct. The `ExprStringLiteral` node should include the quotes as well
in the range but the f-string literal element doesn't include the quote
as it's a specific part within an f-string. For example,
```python
f"foo {x}"
# ^^^^
# This is the literal part of an f-string
```
The newly introduced `FStringElement` enum represents either the
literal part or the expression part of an f-string.
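For comparison only, CPython's own `ast` module makes a similar split between the literal and expression parts of an f-string. The snippet below shows CPython's node names (`JoinedStr`, `Constant`, `FormattedValue`), which are not Ruff's; it is meant as an analogy for the literal/expression distinction rather than a description of the new Ruff nodes.

```python
import ast

# Parse a single f-string expression with CPython's parser.
joined = ast.parse('f"foo {x}"', mode="eval").body

# One element per part: the literal text, then the `{x}` expression.
kinds = [type(value).__name__ for value in joined.values]
literal_text = joined.values[0].value
```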
### Rule Updates
This means that there'll be two nodes representing a string depending on
the context. One for a normal string literal while the other is a string
literal within an f-string. The AST checker is updated to accommodate
this change. The rules which work on string literal are updated to check
on the literal part of f-string as well.
#### Notes
1. The `Expr::is_literal_expr` method would check for
`ExprStringLiteral` and return true if so. Now that we don't represent
the literal part of an f-string using that node, the method's behavior
improves: it is confined to actual expressions. We do have the
`FStringElement::is_literal` method for the f-string case.
2. We avoid checking if we're in an f-string context before adding to
`string_type_definitions` because the f-string literal is now a
dedicated node and not part of `Expr`.
3. Annotations cannot use f-strings, so we avoid changing any rules which
work on annotations and check for `ExprStringLiteral`.
## Test Plan
- All references of `Expr::StringLiteral` were checked to see if any of
the rules require updating to account for the f-string literal element
node.
- New test cases are added for rules which check against the literal
part of an f-string.
- Check the ecosystem results and ensure they remain unchanged.
## Performance
There's a performance penalty in the parser. The reason for this remains
unknown as it seems that the generated assembly code is now different
for the `__reduce154` function. The reduce function body is just popping
the `ParenthesizedExpr` on top of the stack and pushing it with the new
location.
- The size of `FStringElement` enum is the same as `Expr` which is what
it replaces in `FString::format_spec`
- The size of `FStringExpressionElement` is the same as
`ExprFormattedValue` which is what it replaces
I tried reducing the `Expr` enum from 80 bytes to 72 bytes but it hardly
resulted in any performance gain. The difference can be seen here:
- Original profile: https://share.firefox.dev/3Taa7ES
- Profile after boxing some node fields:
https://share.firefox.dev/3GsNXpD
### Backtracking
I tried backtracking through the changes to see if any isolated change
produced this regression. The problem here is that the overall change is
so small that there's only a single checkpoint where I can backtrack, and
that checkpoint results in the same regression. This checkpoint is to
revert to using `Expr` for the `FString::format_spec` field. After this
point, the change would revert back to the original implementation.
## Review process
The review process is similar to #7927. The first set of commits update
the node structure, parser, and related AST files. Then, further commits
update the linter and formatter part to account for the AST change.
---------
Co-authored-by: David Szotten <davidszotten@gmail.com>
## Summary
Adds detection for branches without a `return` or `raise`, so that we
can properly wrap the return types in `Optional`. I'd like to remove this
and replace it with our code graph analysis from the `unreachable.rs`
rule, but it at least fixes the worst offenders.
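For instance, in a function like the one below (a hypothetical example, not taken from the test fixtures), the `x <= 0` path falls off the end of the function. An inferred annotation of `int` would be wrong there, while `Optional[int]` reflects the implicit `None`.

```python
from typing import Optional


def func(x: int) -> Optional[int]:
    if x > 0:
        return 1
    # No `return` or `raise` on this path, so the function falls through
    # and implicitly returns `None`.
```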
Closes #8942.
## Summary
This PR updates the `E402` rule to work at cell level for Jupyter
notebooks. This is enabled only in preview to gather feedback.
The implementation basically resets the import boundary flag on the
semantic model when we encounter the first statement in a cell.
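A minimal sketch of the per-cell reset, assuming each cell is available as a separate source string. The `late_imports` helper is hypothetical and much simpler than the real rule (for example, it does not exempt docstrings or conditional imports).

```python
import ast


def late_imports(cells):
    """Return `(cell_index, lineno)` for imports that follow other code in a cell."""
    violations = []
    for index, source in enumerate(cells):
        # The "import boundary" flag starts fresh for every cell.
        seen_non_import = False
        for stmt in ast.parse(source).body:
            if isinstance(stmt, (ast.Import, ast.ImportFrom)):
                if seen_non_import:
                    violations.append((index, stmt.lineno))
            else:
                seen_non_import = True
    return violations


cells = [
    "import os\nx = 1",
    "import sys\ny = 2\nimport json",
]
found = late_imports(cells)
```

Without the reset, `import sys` at the top of the second cell would also be reported, because `x = 1` in the first cell already crossed the boundary.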
Another potential solution is to introduce an `E403` rule specifically
for notebooks that works at cell level, while `E402` is disabled for
notebooks.
## Test Plan
Add a notebook with imports in multiple cells and verify that the rule
works as expected.
resolves: #8669
## Summary
This PR is a follow-up to the AST refactor which does the following:
- Remove `Deref` implementation on `StringLiteralValue` and use explicit
`as_str` calls instead. The `Deref` implementation would implicitly
perform allocations in case of implicitly concatenated strings. This is
to make sure the allocation is explicit.
- Certain methods can now be implemented without any allocations. The
ones implemented in this PR are:
- `is_empty`
- `len`
- `chars`
- Custom `PartialEq` implementation to compare each character
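The same idea can be sketched in Python as an analogy: answer the queries by walking the parts instead of first joining them into a new string. `ConcatenatedString` is a hypothetical stand-in for illustration and does not mirror the Rust API beyond the method names listed above.

```python
from itertools import chain, zip_longest


class ConcatenatedString:
    """Hypothetical stand-in for a string made of implicitly concatenated parts."""

    def __init__(self, *parts):
        self.parts = parts

    def is_empty(self):
        # Empty only if every part is empty; no joined copy is built.
        return not any(self.parts)

    def __len__(self):
        return sum(len(part) for part in self.parts)

    def chars(self):
        # Lazily iterate over the characters of each part in order.
        return chain.from_iterable(self.parts)

    def __eq__(self, other):
        if not isinstance(other, str):
            return NotImplemented
        # Compare character by character; the sentinel handles unequal lengths.
        missing = object()
        pairs = zip_longest(self.chars(), other, fillvalue=missing)
        return all(left == right for left, right in pairs)


value = ConcatenatedString("foo", "bar")
```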
## Test Plan
Run the linter test suite and make sure all tests pass.
## Summary
This PR updates the string nodes (`ExprStringLiteral`,
`ExprBytesLiteral`, and `ExprFString`) to account for implicit string
concatenation.
### Motivation
In Python, implicitly concatenated strings are joined while parsing
because the interpreter doesn't require the information for each part.
While that's feasible for an interpreter, it falls short for a static
analysis tool where having such information is more useful. Currently,
various parts of the code use the lexer to get the individual string
parts.
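CPython itself shows the same split between parser and lexer. In the sketch below, which uses only the standard library, the parsed AST holds one joined constant while the token stream still exposes the individual parts.

```python
import ast
import io
import tokenize

SOURCE = '"foo" "bar"\n'

# The parser joins the parts into a single constant.
joined = ast.parse(SOURCE, mode="eval").body.value

# The tokenizer still reports each string part separately.
parts = [
    token.string
    for token in tokenize.generate_tokens(io.StringIO(SOURCE).readline)
    if token.type == tokenize.STRING
]
```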
One of the main challenges this solves is that of string formatting.
Currently, the formatter relies on the lexer to get the individual
string parts, and formats them including the comments accordingly. But,
with PEP 701, f-strings can also contain comments. Without this change,
it becomes very difficult to add support for f-string formatting.
### Implementation
The initial proposal was made in this discussion:
https://github.com/astral-sh/ruff/discussions/6183#discussioncomment-6591993.
There were various AST designs which were explored for this task which
are available in the linked internal document[^1].
The selected variant was the one where the nodes were kept as they are,
except that the `implicit_concatenated` field was removed and a new
value struct was added to each `Expr*` struct instead. This is a private
struct that contains the actual implementation of how the AST is
designed for both single and implicitly concatenated strings.
This implementation is achieved through an enum with two variants:
`Single` and `Concatenated` to avoid allocating a vector even for single
strings. There are various public methods available on the value struct
to query certain information regarding the node.
The nodes are structured in the following way:
```
ExprStringLiteral - "foo" "bar"
|- StringLiteral - "foo"
|- StringLiteral - "bar"

ExprBytesLiteral - b"foo" b"bar"
|- BytesLiteral - b"foo"
|- BytesLiteral - b"bar"

ExprFString - "foo" f"bar {x}"
|- FStringPart::Literal - "foo"
|- FStringPart::FString - f"bar {x}"
  |- StringLiteral - "bar "
  |- FormattedValue - "x"
```
[^1]: Internal document:
https://www.notion.so/astral-sh/Implicit-String-Concatenation-e036345dc48943f89e416c087bf6f6d9?pvs=4
#### Visitor
The way the nodes are structured is that the entire string, including
all the parts that are implicitly concatenated, is a single node
containing individual nodes for the parts. The previous section has a
representation of that tree for all the string nodes. This means that
new visitor methods are added to visit the individual parts of string,
bytes, and f-strings for `Visitor`, `PreorderVisitor`, and
`Transformer`.
## Test Plan
- `cargo insta test --workspace --all-features --unreferenced reject`
- Verify that the ecosystem results are unchanged
Update to [Rust
1.74](https://blog.rust-lang.org/2023/11/16/Rust-1.74.0.html) and use
the new clippy lints table.
The update itself introduced a new clippy lint about superfluous hashes
in raw strings; the flagged hashes have been removed.
I moved our lint config from `rustflags` to the newly stabilized
[workspace.lints](https://doc.rust-lang.org/stable/cargo/reference/workspaces.html#the-lints-table).
One consequence is that we have to set `unsafe_code = "warn"` instead of
`"forbid"` because the latter now actually bans unsafe code:
```
error[E0453]: allow(unsafe_code) incompatible with previous forbid
  --> crates/ruff_source_file/src/newlines.rs:62:17
   |
62 |         #[allow(unsafe_code)]
   |                 ^^^^^^^^^^^ overruled by previous forbid
   |
   = note: `forbid` lint level was set on command line
```
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
This PR adds (unsafe) fixes to the flake8-annotations rules that enforce
missing return types, offering to automatically insert type annotations
for functions with literal return values. The logic is smart enough to
generate simplified unions (e.g., `float` instead of `int | float`) and
deal with implicit returns (`return` without a value).
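The union simplification can be sketched as follows. I am assuming it follows the numeric promotion described in PEP 484, where `int` is acceptable wherever `float` is expected; the `complex` branch is my own extension of that assumption, and `simplify_union` is a hypothetical helper rather than the function used in the PR.

```python
def simplify_union(type_names):
    """Join type names into a union, dropping members implied by wider numeric types."""
    # Deduplicate while preserving first-seen order.
    members = list(dict.fromkeys(type_names))
    if "complex" in members:
        # Assumption: `complex` also covers `float` and `int`.
        members = [name for name in members if name not in ("int", "float")]
    elif "float" in members:
        members = [name for name in members if name != "int"]
    return " | ".join(members)


annotation = simplify_union(["int", "float"])
```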
Closes https://github.com/astral-sh/ruff/issues/1640 (though we could
open a separate issue for inferring parameter types).
Closes https://github.com/astral-sh/ruff/issues/8213.
## Test Plan
`cargo test`
## Summary
This PR implements validation in the formatter tests to ensure that we
don't modify the AST during formatting. Black has similar logic.
In implementing this, I learned that Black actually _does_ modify the
AST, and their test infrastructure normalizes the AST to wipe away those
differences. Specifically, Black changes the indentation of docstrings,
which _does_ modify the AST; and it also inserts parentheses in `del`
statements, which changes the AST too.
Ruff also does both these things, so we _also_ implement the same
normalization using a new visitor that allows for modifying the AST.
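A minimal sketch of this kind of check, written against Python's `ast` module rather than Ruff's AST. It normalizes only docstring indentation (via `inspect.cleandoc`) and ignores the `del` parentheses case; `NormalizeDocstrings` and `same_ast` are hypothetical names for this example.

```python
import ast
import inspect


class NormalizeDocstrings(ast.NodeTransformer):
    """Rewrite docstrings so that indentation-only differences disappear."""

    def _normalize(self, node):
        self.generic_visit(node)
        docstring = ast.get_docstring(node, clean=False)
        if docstring is not None:
            # The docstring is the first statement: an `Expr` wrapping a `Constant`.
            node.body[0].value.value = inspect.cleandoc(docstring)
        return node

    visit_Module = _normalize
    visit_ClassDef = _normalize
    visit_FunctionDef = _normalize
    visit_AsyncFunctionDef = _normalize


def same_ast(before, after):
    """Return True if both sources have the same AST after normalization."""

    def normalized(source):
        return ast.dump(NormalizeDocstrings().visit(ast.parse(source)))

    return normalized(before) == normalized(after)


BEFORE = 'def f():\n  """Doc.\n\n      More.\n  """\n  return 1\n'
AFTER = 'def f():\n    """Doc.\n\n    More.\n    """\n    return 1\n'
unchanged = same_ast(BEFORE, AFTER)
```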
Closes https://github.com/astral-sh/ruff/issues/8184.
## Test Plan
`cargo test`
## Summary
Adds an extra check to F632 for any `is` comparisons to mutable
initializers.
Implements #8589 .
Example:
```python
named_var = {}
if named_var is {}:  # F632 (fix)
    pass
```
The `if` condition will always evaluate to `False` because it checks
identity, and no existing object can share its identity with a freshly
created, hard-coded list/set/dict initializer.
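The difference is easy to confirm at runtime. The snippet below contrasts the identity check with an equality check, which I am assuming is what the fix rewrites the comparison to.

```python
named_var = {}

# A dict display creates a brand-new object, so identity can never match.
identity_check = named_var is {}

# Equality compares contents instead.
equality_check = named_var == {}
```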
## Test Plan
Multiple test cases were added to ensure the rule works + doesn't flag
false positives + the fix works correctly.
## Summary
Adds `TRIO105` from the [flake8-trio
plugin](https://github.com/Zac-HD/flake8-trio). The `MethodName` logic
mirrors that of `TRIO100` to stay consistent within the plugin.
It is at 95% parity, with the exception that upstream also checks for a
slightly more complex scenario where a call to `start()` on a
`trio.Nursery` context should also be immediately awaited. The upstream
plugin appears to just check for anything named `nursery`, judging from
[the relevant issue](https://github.com/Zac-HD/flake8-trio/issues/56).
Unsure if we want to do something similar or, alternatively, if there
is some capability in ruff to check for calls made on this context some
other way.
## Test Plan
Added a new fixture, based on [the one from upstream
plugin](https://github.com/Zac-HD/flake8-trio/blob/main/tests/eval_files/trio105.py)
## Issue link
Refers: https://github.com/astral-sh/ruff/issues/8451
## Summary
This PR removes the `unicode` flag from the string literal in
`ComparableExpr`. This flag isn't required as all strings are unicode in
Python 3 so `"foo" == u"foo"`.
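CPython's AST keeps a comparable marker, the `kind` attribute on `ast.Constant`, which makes the point easy to check: the prefix is recorded, but the values are equal.

```python
import ast

plain = ast.parse('"foo"', mode="eval").body
prefixed = ast.parse('u"foo"', mode="eval").body

# The `u` prefix is recorded on the node, but the string values are identical.
same_value = plain.value == prefixed.value
kinds = (plain.kind, prefixed.kind)
```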
## Summary
This PR adds a new `LiteralExpressionRef` which wraps all of the literal
expression nodes in a single enum. This allows for a narrow type when
working exclusively with a literal node. Additionally, it also
implements an `Expr::as_literal_expr` method to return the new enum if
the expression is indeed a literal one.
A few rules have been updated to account for the new enum:
1. `redundant_literal_union`
2. `if_else_block_instead_of_dict_lookup`
3. `magic_value_comparison`
To account for the change in (2), a new `ComparableLiteral` has been
added which can be constructed from the new enum
(`ComparableLiteral::from(<LiteralExpressionRef>)`).
### Open Questions
1. The new `ComparableLiteral` can be exclusively used via the
`LiteralExpressionRef` enum. Should we remove all of the literal
variants from `ComparableExpr` and have a single
`ComparableExpr::Literal(ComparableLiteral)` variant instead?
## Test Plan
`cargo test`
## Summary
If the value of `shell` wasn't literally `True`, we now show a message
describing it as truthy, rather than the (misleading) `shell=True`
literal in the diagnostic.
Closes https://github.com/astral-sh/ruff/issues/8310.
## Summary
This PR adds `Default` for the following literal nodes:
* `StringLiteral`
* `BytesLiteral`
* `BooleanLiteral`
* `NoneLiteral`
* `EllipsisLiteral`
The implementation creates the zero value of the respective literal
nodes in terms of the Python language.
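In Python terms, the zero values are what the built-in constructors return when called without arguments. The mapping to node names below is my reading of the description, not something taken from the code; `None` and `...` each have only one possible value.

```python
# Zero values produced by calling the built-in types with no arguments.
zero_values = {
    "StringLiteral": str(),
    "BytesLiteral": bytes(),
    "BooleanLiteral": bool(),
}
```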
## Test Plan
`cargo test`
## Summary
This PR splits the `Constant` enum into individual literal nodes. It
introduces the following new nodes for each variant:
* `ExprStringLiteral`
* `ExprBytesLiteral`
* `ExprNumberLiteral`
* `ExprBooleanLiteral`
* `ExprNoneLiteral`
* `ExprEllipsisLiteral`
The main motivation behind this refactor is to introduce the new AST
node for implicit string concatenation in the coming PR. The elements of
that node will be either a string literal, bytes literal or an f-string
which can be implemented using an enum. This means that a string or
bytes literal cannot be represented by `Constant::Str` /
`Constant::Bytes` which creates an inconsistency.
This PR avoids that inconsistency by splitting the constant nodes into
their own literal nodes, literal being the more appropriate naming
convention from a static analysis tool perspective.
This also makes working with literals in the linter and formatter much
more ergonomic. For example, checking whether an expression is a string
literal can be done easily using `Expr::is_string_literal_expr` or by
matching against `Expr::StringLiteral`, as opposed to matching against
`ExprConstant` and the `Constant` enum. A few AST helper methods can be
simplified as well, which will be done in a follow-up PR.
This introduces a new `Expr::is_literal_expr` method which is the same
as `Expr::is_constant_expr`. There are also a small number of
intermediary changes related to implicit string concatenation. This is
done to avoid making this already huge PR even larger.
## Test Plan
1. Verify and update all of the existing snapshots (parser, visitor)
2. Verify that the ecosystem check output remains **unchanged** for both
the linter and formatter
### Formatter ecosystem check
#### `main`
| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |
#### `dhruv/constant-to-literal`
| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |
## Summary
This PR adds a new `Singleton` enum for the `PatternMatchSingleton`
node.
Earlier the node was using the `Constant` enum but the value for this
pattern can only be either `None`, `True` or `False`. With the coming PR
to remove the `Constant`, this node required a new type to fill in.
This also has the benefit of narrowing the type down to only the
possible values for the node as evidenced by the removal of `unreachable`.
## Test Plan
Update the AST snapshots and run `cargo test`.
## Summary
Fixes https://github.com/astral-sh/ruff/issues/7448
Fixes https://github.com/astral-sh/ruff/issues/7892
I've removed automatic dangling comment formatting. We're doing manual
dangling comment formatting everywhere anyway (the
assert-all-comments-formatted check ensures this), and dangling comments
would break the formatting there.
## Test Plan
New test file.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
**Summary** Insert a newline after nested function and class
definitions, unless there is a trailing own line comment.
We need to e.g. format
```python
if platform.system() == "Linux":
    if sys.version > (3, 10):
        def f():
            print("old")
    else:
        def f():
            print("new")
    f()
```
as
```python
if platform.system() == "Linux":
    if sys.version > (3, 10):

        def f():
            print("old")

    else:

        def f():
            print("new")

    f()
```
even though `f()` is directly preceded by an if statement, not a
function or class definition. See the comments and fixtures for trailing
own line comment handling.
**Test Plan** I checked that the new content of `newlines.py` matches
black's formatting.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Implement
[`no-single-item-in`](https://github.com/dosisod/refurb/blob/master/refurb/checks/iterable/no_single_item_in.py)
as `single-item-membership-test` (`FURB171`).
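As a small illustration of the pattern (my own example, not one of the fixtures): a membership test against a single-item container can be written as a plain equality comparison, which gives the same result for ordinary values.

```python
value = 1

# Membership test against a single-item tuple...
membership = value in (1,)

# ...reads more directly as an equality comparison.
equality = value == 1
```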
Uses the helper function `generate_comparison` from the `pycodestyle`
implementations; this function should probably be moved, but I am not
sure where at the moment.
Update: moved it to `ruff_python_ast::helpers`.
Related to #1348.
## Test Plan
`cargo test`
## Summary
We now list each changed file when running with `--check`.
Closes https://github.com/astral-sh/ruff/issues/7782.
## Test Plan
```
❯ cargo run -p ruff_cli -- format foo.py --check
   Compiling ruff_cli v0.0.292 (/Users/crmarsh/workspace/ruff/crates/ruff_cli)
    Finished dev [unoptimized + debuginfo] target(s) in 1.41s
     Running `target/debug/ruff format foo.py --check`
warning: `ruff format` is a work-in-progress, subject to change at any time, and intended only for experimentation.
Would reformat: foo.py
1 file would be reformatted
```
## Summary
When lexing a number like `0x995DC9BBDF1939FA` that exceeds our small
number representation, we were only storing the portion after the base
(in this case, `995DC9BBDF1939FA`). When using that representation in
code generation, this could lead to invalid syntax, since
`995DC9BBDF1939FA` on its own is not a valid integer.
This PR modifies the code to store the full span, including the radix
prefix.
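The failure mode can be reproduced with Python's own parser: the digits without the radix prefix are not a valid expression, while the full span is. `is_valid_expression` is a hypothetical helper for this illustration.

```python
import ast

FULL = "0x995DC9BBDF1939FA"
DIGITS_ONLY = FULL[2:]  # what was stored before this change


def is_valid_expression(source):
    """Return True if `source` parses as a Python expression."""
    try:
        ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return True


full_is_valid = is_valid_expression(FULL)
digits_are_valid = is_valid_expression(DIGITS_ONLY)

# The value does not fit in a signed 64-bit integer.
exceeds_i64 = ast.literal_eval(FULL) > 2**63 - 1
```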
See:
https://github.com/astral-sh/ruff/issues/7455#issuecomment-1739802958.
## Test Plan
`cargo test`
## Summary
This PR adds support for named expressions when analyzing `__all__`
assignments, as per https://github.com/astral-sh/ruff/issues/7672. It
also loosens the enforcement around assignments like: `__all__ =
list(some_other_expression)`. We shouldn't flag these as invalid, even
though we can't analyze the members, since we _know_ they evaluate to a
`list`.
Closes https://github.com/astral-sh/ruff/issues/7672.
## Test Plan
`cargo test`
## Summary
This is a follow-up to #7469 that attempts to achieve similar gains, but
without introducing malachite. Instead, this PR removes the `BigInt`
type altogether, instead opting for a simple enum that allows us to
store small integers directly and only allocate for values greater than
`i64`:
```rust
/// A Python integer literal. Represents both small (fits in an `i64`) and large integers.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct Int(Number);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Number {
    /// A "small" number that can be represented as an `i64`.
    Small(i64),
    /// A "large" number that cannot be represented as an `i64`.
    Big(Box<str>),
}

impl std::fmt::Display for Number {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Number::Small(value) => write!(f, "{value}"),
            Number::Big(value) => write!(f, "{value}"),
        }
    }
}
```
We typically don't care about numbers greater than `isize` -- our only
uses are comparisons against small constants (like `1`, `2`, `3`, etc.),
so there's no real loss of information, except in one or two rules where
we're now a little more conservative (with the worst-case being that we
don't flag, e.g., an `itertools.pairwise` that uses an extremely large
value for the slice start constant). For simplicity, a few diagnostics
now show a dedicated message when they see integers that are out of the
supported range (e.g., `outdated-version-block`).
An additional benefit here is that we get to remove a few dependencies,
especially `num-bigint`.
## Test Plan
`cargo test`
## Summary
This is only used for the `level` field in relative imports (e.g., `from
..foo import bar`). It seems unnecessary to use a wrapper here, so this
PR changes to a `u32` directly.
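For reference, CPython's `ast` module exposes the same field as a plain integer that counts the leading dots:

```python
import ast

node = ast.parse("from ..foo import bar").body[0]

# Two leading dots means two levels up from the current package.
level = node.level
module = node.module
```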
## Summary
If a function has no parameters (and no comments within the parameters'
`()`), we're supposed to wrap the return annotation _whenever_ it
breaks. However, our `empty_parameters` test didn't properly account for
the case in which the parameters include a newline (but no other
content), like:
```python
def get_dashboards_hierarchy(
) -> Dict[Type['BaseDashboard'], List[Type['BaseDashboard']]]:
    """Get hierarchy of dashboards classes.

    Returns:
        Dict of dashboards classes.
    """
    dashboards_hierarchy = {}
```
This PR fixes that detection. Instead of lexing, it now checks if the
parameters node itself is empty (or if it contains comments).
Closes https://github.com/astral-sh/ruff/issues/7457.
## Summary
The tokenizer was split into a forward and a backwards tokenizer. The
backwards tokenizer uses the same names as the forwards ones (e.g.
`next_token`). The backwards tokenizer gets the comment ranges that we
already built to skip comments.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
`ComparableExpr` includes the `ExprContext` field on an expression, so,
e.g., the two tuples in `(a, b) = (a, b)` won't be considered equal.
Similarly, the tuples in `[(a, b) for (a, b) in c]` _also_ wouldn't be
considered equal. I find this behavior surprising, since
`ComparableExpr` is intended to allow you to compare two ASTs, but
`ExprContext` is really encoding information about the broader context
for the expression.
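CPython's AST has the same `ctx` field, so the behavior can be sketched there. I am assuming the goal is a comparison that ignores the context; `DropContext` is a hypothetical helper, not the change made to `ComparableExpr`.

```python
import ast


class DropContext(ast.NodeTransformer):
    """Reset every `ctx` field to `Load` so context no longer affects equality."""

    def generic_visit(self, node):
        super().generic_visit(node)
        if hasattr(node, "ctx"):
            node.ctx = ast.Load()
        return node


assign = ast.parse("(a, b) = (a, b)").body[0]
target, value = assign.targets[0], assign.value

# The target uses `Store` while the value uses `Load`, so the dumps differ.
differs_with_context = ast.dump(target) != ast.dump(value)

# After dropping the context, both tuples compare equal.
equal_without_context = ast.dump(DropContext().visit(target)) == ast.dump(value)
```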
## Motivation
The `ast::Arguments` for call arguments are split into positional
arguments (`args`) and keyword arguments (`keywords`). We currently
assume that a call consists of args first and then keywords, which is
generally the case, but not always:
```python
f(*args, a=2, *args2, **kwargs)
class A(*args, a=2, *args2, **kwargs):
    pass
```
The consequence is accidentally reordering arguments
(https://github.com/astral-sh/ruff/pull/7268).
## Summary
`Arguments::args_and_keywords` returns an iterator of an `ArgOrKeyword`
enum that yields args and keywords in the correct order. I've fixed the
obvious `args` and `keywords` usages, but there might be some cases with
wrong assumptions remaining.
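The ordering problem and the merge can be sketched with CPython's `ast` module, which splits call arguments the same way. The `arguments_in_source_order` helper is hypothetical; it merges by source position, which may differ from how `args_and_keywords` is implemented, and it needs Python 3.9+ for `ast.unparse`.

```python
import ast


def arguments_in_source_order(call_source):
    """Return the call's arguments as source text, in the order they were written."""
    call = ast.parse(call_source, mode="eval").body
    # `args` and `keywords` are stored separately, so merge them by position.
    nodes = [*call.args, *call.keywords]
    nodes.sort(key=lambda node: (node.lineno, node.col_offset))
    return [ast.unparse(node) for node in nodes]


ordered = arguments_in_source_order("f(*args, a=2, *args2, **kwargs)")
```

Emitting all of `args` followed by all of `keywords` would instead move `*args2` ahead of `a=2`, which is the accidental reordering described above.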
## Test Plan
The generator got new test cases; otherwise, this is covered by the
stacked PR (https://github.com/astral-sh/ruff/pull/7268), which uncovered
the issue.