Commit graph

314 commits

Author SHA1 Message Date
Charlie Marsh
6435e4e4aa
Enable auto-return-type involving Optional and Union annotations (#8885)
## Summary

Previously, this was only supported for Python 3.10 and later, since we
always use the PEP 604-style unions.
2023-11-28 18:35:55 -08:00
Dhruv Manilawala
ec7456bac0
Rename as_str to to_str (#8886)
This PR renames the method on `StringLiteralValue` from `as_str` to
`to_str`. The main motivation is to follow the naming convention as
described in the [Rust API
Guidelines](https://rust-lang.github.io/api-guidelines/naming.html#ad-hoc-conversions-follow-as_-to_-into_-conventions-c-conv).
This method can perform a string allocation in case the string is
implicitly concatenated.
2023-11-28 18:50:42 -06:00
Dhruv Manilawala
b28556d739
Update E402 to work at cell level for notebooks (#8872)
## Summary

This PR updates the `E402` rule to work at cell level for Jupyter
notebooks. This is enabled only in preview to gather feedback.

The implementation basically resets the import boundary flag on the
semantic model when we encounter the first statement in a cell.

Another potential solution is to introduce `E403` rule that is
specifically for notebooks that works at cell level while `E402` will be
disabled for notebooks.

## Test Plan

Add a notebook with imports in multiple cells and verify that the rule
works as expected.

resolves: #8669
2023-11-29 00:32:35 +00:00
Dhruv Manilawala
501cca8b72
Remove #[allow(unused_variables)] from visitor methods (#8828)
Small follow-up to remove `#[allow(unused_variables)]` from visitor
methods and use underscore prefix for unused variables instead.
2023-11-25 00:09:46 +00:00
Dhruv Manilawala
626b0577cd
Explicit as_str (no deref), add no allocation methods (#8826)
## Summary

This PR is a follow-up to the AST refactor which does the following:
- Remove `Deref` implementation on `StringLiteralValue` and use explicit
`as_str` calls instead. The `Deref` implementation would implicitly
perform allocations in case of implicitly concatenated strings. This is
to make sure the allocation is explicit.
- Now, certain methods can be implemented to do zero allocations which
have been implemented in this PR. They are:
    - `is_empty`
    - `len`
    - `chars`
    - Custom `PartialEq` implementation to compare each character

## Test Plan

Run the linter test suite and make sure all tests pass.
2023-11-25 00:03:59 +00:00
Dhruv Manilawala
017e829115
Update string nodes for implicit concatenation (#7927)
## Summary

This PR updates the string nodes (`ExprStringLiteral`,
`ExprBytesLiteral`, and `ExprFString`) to account for implicit string
concatenation.

### Motivation

In Python, implicit string concatenation are joined while parsing
because the interpreter doesn't require the information for each part.
While that's feasible for an interpreter, it falls short for a static
analysis tool where having such information is more useful. Currently,
various parts of the code uses the lexer to get the individual string
parts.

One of the main challenge this solves is that of string formatting.
Currently, the formatter relies on the lexer to get the individual
string parts, and formats them including the comments accordingly. But,
with PEP 701, f-string can also contain comments. Without this change,
it becomes very difficult to add support for f-string formatting.

### Implementation

The initial proposal was made in this discussion:
https://github.com/astral-sh/ruff/discussions/6183#discussioncomment-6591993.
There were various AST designs which were explored for this task which
are available in the linked internal document[^1].

The selected variant was the one where the nodes were kept as it is
except that the `implicit_concatenated` field was removed and instead a
new struct was added to the `Expr*` struct. This would be a private
struct would contain the actual implementation of how the AST is
designed for both single and implicitly concatenated strings.

This implementation is achieved through an enum with two variants:
`Single` and `Concatenated` to avoid allocating a vector even for single
strings. There are various public methods available on the value struct
to query certain information regarding the node.

The nodes are structured in the following way:

```
ExprStringLiteral - "foo" "bar"
|- StringLiteral - "foo"
|- StringLiteral - "bar"

ExprBytesLiteral - b"foo" b"bar"
|- BytesLiteral - b"foo"
|- BytesLiteral - b"bar"

ExprFString - "foo" f"bar {x}"
|- FStringPart::Literal - "foo"
|- FStringPart::FString - f"bar {x}"
  |- StringLiteral - "bar "
  |- FormattedValue - "x"
```

[^1]: Internal document:
https://www.notion.so/astral-sh/Implicit-String-Concatenation-e036345dc48943f89e416c087bf6f6d9?pvs=4

#### Visitor

The way the nodes are structured is that the entire string, including
all the parts that are implicitly concatenation, is a single node
containing individual nodes for the parts. The previous section has a
representation of that tree for all the string nodes. This means that
new visitor methods are added to visit the individual parts of string,
bytes, and f-strings for `Visitor`, `PreorderVisitor`, and
`Transformer`.

## Test Plan

- `cargo insta test --workspace --all-features --unreferenced reject`
- Verify that the ecosystem results are unchanged
2023-11-24 17:55:41 -06:00
Samuel Cormier-Iijima
852a8f4a4f
[PIE796] don't report when using ellipses for enum values in stub files (#8825)
## Summary

Just ignores ellipses as enum values inside stub files.

Fixes #8818.
2023-11-24 15:24:57 +00:00
konsti
14e65afdc6
Update to Rust 1.74 and use new clippy lints table (#8722)
Update to [Rust
1.74](https://blog.rust-lang.org/2023/11/16/Rust-1.74.0.html) and use
the new clippy lints table.

The update itself introduced a new clippy lint about superfluous hashes
in raw strings, which got removed.

I moved our lint config from `rustflags` to the newly stabilized
[workspace.lints](https://doc.rust-lang.org/stable/cargo/reference/workspaces.html#the-lints-table).
One consequence is that we have to `unsafe_code = "warn"` instead of
"forbid" because the latter now actually bans unsafe code:

```
error[E0453]: allow(unsafe_code) incompatible with previous forbid
  --> crates/ruff_source_file/src/newlines.rs:62:17
   |
62 |         #[allow(unsafe_code)]
   |                 ^^^^^^^^^^^ overruled by previous forbid
   |
   = note: `forbid` lint level was set on command line
```

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2023-11-16 18:12:46 -05:00
Charlie Marsh
bf2cc3f520
Add autotyping-like return type inference for annotation rules (#8643)
## Summary

This PR adds (unsafe) fixes to the flake8-annotations rules that enforce
missing return types, offering to automatically insert type annotations
for functions with literal return values. The logic is smart enough to
generate simplified unions (e.g., `float` instead of `int | float`) and
deal with implicit returns (`return` without a value).

Closes https://github.com/astral-sh/ruff/issues/1640 (though we could
open a separate issue for referring parameter types).

Closes https://github.com/astral-sh/ruff/issues/8213.

## Test Plan

`cargo test`
2023-11-13 23:34:15 -05:00
Charlie Marsh
df9ade7fd9
Use AST transformer for relocate (#8660) 2023-11-13 13:24:27 -05:00
Charlie Marsh
345e1401cf
Treat class C: ... and class C(): ... equivalently (#8659)
## Summary

These should be seen as identical from the `ComparableAst` perspective.
2023-11-13 18:03:04 +00:00
Charlie Marsh
d574fcd1ac
Compare formatted and unformatted ASTs during formatter tests (#8624)
## Summary

This PR implements validation in the formatter tests to ensure that we
don't modify the AST during formatting. Black has similar logic.

In implementing this, I learned that Black actually _does_ modify the
AST, and their test infrastructure normalizes the AST to wipe away those
differences. Specifically, Black changes the indentation of docstrings,
which _does_ modify the AST; and it also inserts parentheses in `del`
statements, which changes the AST too.

Ruff also does both these things, so we _also_ implement the same
normalization using a new visitor that allows for modifying the AST.

Closes https://github.com/astral-sh/ruff/issues/8184.

## Test Plan

`cargo test`
2023-11-13 17:43:27 +00:00
Jesse Serrao
39728a1198
Add check for is comparison with mutable initialisers to rule F632 (#8607)
## Summary

Adds an extra check to F632 to check for any `is` comparisons to a
mutable initialisers.
Implements #8589 .

Example:
```Python
named_var = {}
if named_var is {}:  # F632 (fix)
    pass
```
The if condition will always evaluate to False because it checks on
identity and it's impossible to take the same identity as a hard coded
list/set/dict initializer.

## Test Plan

Multiple test cases were added to ensure the rule works + doesn't flag
false positives + the fix works correctly.
2023-11-11 00:29:23 +00:00
Kar Petrosyan
e2c7b1ece6
[TRIO] Add TRIO109 rule (#8534)
## Summary

Adds TRIO109 from the [flake8-trio
plugin](https://github.com/Zac-HD/flake8-trio).
Relates to: https://github.com/astral-sh/ruff/issues/8451
2023-11-07 17:13:01 -05:00
qdegraaf
4170ef0508
[TRIO] Add TRIO105: SyncTrioCall (#8490)
## Summary

Adds `TRIO105` from the [flake8-trio
plugin](https://github.com/Zac-HD/flake8-trio). The `MethodName` logic
mirrors that of `TRIO100` to stay consistent within the plugin.

It is at 95% parity with the exception of upstream also checking for a
slightly more complex scenario where a call to `start()` on a
`trio.Nursery` context should also be immediately awaited. Upstream
plugin appears to just check for anything named `nursery` judging from
[the relevant issue](https://github.com/Zac-HD/flake8-trio/issues/56).

Unsure if we want to do so something similar or, alternatively, if there
is some capability in ruff to check for calls made on this context some
other way

## Test Plan

Added a new fixture, based on [the one from upstream
plugin](https://github.com/Zac-HD/flake8-trio/blob/main/tests/eval_files/trio105.py)

## Issue link

Refers: https://github.com/astral-sh/ruff/issues/8451
2023-11-05 19:56:10 +00:00
Micha Reiser
f16505d885
Formatter: Remove unnecessary group (#8455) 2023-11-03 04:14:29 +00:00
Dhruv Manilawala
d350ede992
Remove unicode flag from comparable (#8440)
## Summary

This PR removes the `unicode` flag from the string literal in
`ComparableExpr`. This flag isn't required as all strings are unicode in
Python 3 so `"foo" == u"foo"`.
2023-11-02 13:21:45 +05:30
Dhruv Manilawala
97ae617fac
Introduce LiteralExpressionRef for all literals (#8339)
## Summary

This PR adds a new `LiteralExpressionRef` which wraps all of the literal
expression nodes in a single enum. This allows for a narrow type when
working exclusively with a literal node. Additionally, it also
implements a `Expr::as_literal_expr` method to return the new enum if
the expression is indeed a literal one.

A few rules have been updated to account for the new enum:
1. `redundant_literal_union`
2. `if_else_block_instead_of_dict_lookup`
3. `magic_value_comparison`

To account for the change in (2), a new `ComparableLiteral` has been
added which can be constructed from the new enum
(`ComparableLiteral::from(<LiteralExpressionRef>)`).

### Open Questions

1. The new `ComparableLiteral` can be exclusively used via the
`LiteralExpressionRef` enum. Should we remove all of the literal
variants from `ComparableExpr` and instead have a single
`ComparableExpr::Literal(ComparableLiteral)` variant instead?

## Test Plan

`cargo test`
2023-10-31 12:56:11 +00:00
Dhruv Manilawala
8977b6ae11
Inline AST helpers for new literal nodes (#8374)
A small refactor to inline the `is_const_none` now that there's a
dedicated `ExprNoneLiteral` node.
2023-10-31 11:06:54 +00:00
Charlie Marsh
161c093c06
Avoid including literal shell=True for truthy, non-True diagnostics (#8359)
## Summary

If the value of `shell` wasn't literally `True`, we now show a message
describing it as truthy, rather than the (misleading) `shell=True`
literal in the diagnostic.

Closes https://github.com/astral-sh/ruff/issues/8310.
2023-10-30 15:44:38 +00:00
Dhruv Manilawala
b0dc5a86a1
Impl Default for (String|Bytes|Boolean|None|Ellipsis)Literal (#8341)
## Summary

This PR adds `Default` for the following literal nodes:
* `StringLiteral`
* `BytesLiteral`
* `BooleanLiteral`
* `NoneLiteral`
* `EllipsisLiteral`

The implementation creates the zero value of the respective literal
nodes in terms of the Python language.

## Test Plan

`cargo test`
2023-10-30 08:47:44 +00:00
Dhruv Manilawala
230c9ce236
Split Constant to individual literal nodes (#8064)
## Summary

This PR splits the `Constant` enum as individual literal nodes. It
introduces the following new nodes for each variant:
* `ExprStringLiteral`
* `ExprBytesLiteral`
* `ExprNumberLiteral`
* `ExprBooleanLiteral`
* `ExprNoneLiteral`
* `ExprEllipsisLiteral`

The main motivation behind this refactor is to introduce the new AST
node for implicit string concatenation in the coming PR. The elements of
that node will be either a string literal, bytes literal or a f-string
which can be implemented using an enum. This means that a string or
bytes literal cannot be represented by `Constant::Str` /
`Constant::Bytes` which creates an inconsistency.

This PR avoids that inconsistency by splitting the constant nodes into
it's own literal nodes, literal being the more appropriate naming
convention from a static analysis tool perspective.

This also makes working with literals in the linter and formatter much
more ergonomic like, for example, if one would want to check if this is
a string literal, it can be done easily using
`Expr::is_string_literal_expr` or matching against `Expr::StringLiteral`
as oppose to matching against the `ExprConstant` and enum `Constant`. A
few AST helper methods can be simplified as well which will be done in a
follow-up PR.

This introduces a new `Expr::is_literal_expr` method which is the same
as `Expr::is_constant_expr`. There are also intermediary changes related
to implicit string concatenation which are quiet less. This is done so
as to avoid having a huge PR which this already is.

## Test Plan

1. Verify and update all of the existing snapshots (parser, visitor)
2. Verify that the ecosystem check output remains **unchanged** for both
the linter and formatter

### Formatter ecosystem check

#### `main`

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |

#### `dhruv/constant-to-literal`

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |
2023-10-30 12:13:23 +05:30
Dhruv Manilawala
78bbf6d403
New Singleton enum for PatternMatchSingleton node (#8063)
## Summary

This PR adds a new `Singleton` enum for the `PatternMatchSingleton`
node.

Earlier the node was using the `Constant` enum but the value for this
pattern can only be either `None`, `True` or `False`. With the coming PR
to remove the `Constant`, this node required a new type to fill in.

This also has the benefit of narrowing the type down to only the
possible values for the node as evident by the removal of `unreachable`.

## Test Plan

Update the AST snapshots and run `cargo test`.
2023-10-30 05:48:53 +00:00
Dhruv Manilawala
ec1be60dcb
Remove leftover constant tuple reference (#8062)
This PR removes the leftover reference to the tuple variant in
`Constant`.
2023-10-19 17:50:45 +00:00
konsti
8f9753f58e
Comments outside expression parentheses (#7873)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Fixes https://github.com/astral-sh/ruff/issues/7448
Fixes https://github.com/astral-sh/ruff/issues/7892

I've removed automatic dangling comment formatting, we're doing manual
dangling comment formatting everywhere anyway (the
assert-all-comments-formatted ensures this) and dangling comments would
break the formatting there.

## Test Plan

New test file.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-10-19 09:24:11 +00:00
konsti
0c3123e07e
Insert newline after nested function or class statements (#7946)
**Summary** Insert a newline after nested function and class
definitions, unless there is a trailing own line comment.

We need to e.g. format
```python
if platform.system() == "Linux":
    if sys.version > (3, 10):
        def f():
            print("old")
    else:
        def f():
            print("new")
    f()
```
as
```python
if platform.system() == "Linux":
    if sys.version > (3, 10):

        def f():
            print("old")

    else:

        def f():
            print("new")

    f()
```
even though `f()` is directly preceded by an if statement, not a
function or class definition. See the comments and fixtures for trailing
own line comment handling.

**Test Plan** I checked that the new content of `newlines.py` matches
black's formatting.

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2023-10-18 09:45:58 +00:00
Charlie Marsh
d685107638
Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030)
This is a do-over of https://github.com/astral-sh/ruff/pull/8011, which
I accidentally merged into a non-`main` branch. Sorry!
2023-10-18 00:01:18 +00:00
Tom Kuson
62f1ee08e7
[refurb] Implement single-item-membership-test (FURB171) (#7815)
## Summary

Implement
[`no-single-item-in`](https://github.com/dosisod/refurb/blob/master/refurb/checks/iterable/no_single_item_in.py)
as `single-item-membership-test` (`FURB171`).

Uses the helper function `generate_comparison` from the `pycodestyle`
implementations; this function should probably be moved, but I am not
sure where at the moment.

Update: moved it to `ruff_python_ast::helpers`.

Related to #1348.

## Test Plan

`cargo test`
2023-10-08 14:08:47 +00:00
Charlie Marsh
f71c80af68
Show changed files when running under --check (#7788)
## Summary

We now list each changed file when running with `--check`.

Closes https://github.com/astral-sh/ruff/issues/7782.

## Test Plan

```
❯ cargo run -p ruff_cli -- format foo.py --check
   Compiling ruff_cli v0.0.292 (/Users/crmarsh/workspace/ruff/crates/ruff_cli)
rgo +    Finished dev [unoptimized + debuginfo] target(s) in 1.41s
     Running `target/debug/ruff format foo.py --check`
warning: `ruff format` is a work-in-progress, subject to change at any time, and intended only for experimentation.
Would reformat: foo.py
1 file would be reformatted
```
2023-10-03 18:50:06 +00:00
Tom Kuson
e129f77bcf
Extend reimplemented-starmap (FURB140) to catch calls with a single and starred argument (#7768) 2023-10-02 21:38:05 -04:00
Dhruv Manilawala
e62e245c61
Add support for PEP 701 (#7376)
## Summary

This PR adds support for PEP 701 in Ruff. This is a rollup PR of all the
other individual PRs. The separate PRs were created for logic separation
and code reviews. Refer to each pull request for a detail description on
the change.

Refer to the PR description for the list of pull requests within this PR.

## Test Plan

### Formatter ecosystem checks

Explanation for the change in ecosystem check:
https://github.com/astral-sh/ruff/pull/7597#issue-1908878183

#### `main`

```
| project      | similarity index  | total files       | changed files     |
|--------------|------------------:|------------------:|------------------:|
| cpython      |           0.76083 |              1789 |              1631 |
| django       |           0.99983 |              2760 |                36 |
| transformers |           0.99963 |              2587 |               319 |
| twine        |           1.00000 |                33 |                 0 |
| typeshed     |           0.99983 |              3496 |                18 |
| warehouse    |           0.99967 |               648 |                15 |
| zulip        |           0.99972 |              1437 |                21 |
```

#### `dhruv/pep-701`

```
| project      | similarity index  | total files       | changed files     |
|--------------|------------------:|------------------:|------------------:|
| cpython      |           0.76051 |              1789 |              1632 |
| django       |           0.99983 |              2760 |                36 |
| transformers |           0.99963 |              2587 |               319 |
| twine        |           1.00000 |                33 |                 0 |
| typeshed     |           0.99983 |              3496 |                18 |
| warehouse    |           0.99967 |               648 |                15 |
| zulip        |           0.99972 |              1437 |                21 |
```
2023-09-29 02:55:39 +00:00
Charlie Marsh
f45281345d
Include radix base prefix in large number representation (#7700)
## Summary

When lexing a number like `0x995DC9BBDF1939FA` that exceeds our small
number representation, we were only storing the portion after the base
(in this case, `995DC9BBDF1939FA`). When using that representation in
code generation, this could lead to invalid syntax, since
`995DC9BBDF1939FA)` on its own is not a valid integer.

This PR modifies the code to store the full span, including the radix
prefix.

See:
https://github.com/astral-sh/ruff/issues/7455#issuecomment-1739802958.

## Test Plan

`cargo test`
2023-09-28 20:38:06 +00:00
Charlie Marsh
0a8cad2550
Allow named expressions in __all__ assignments (#7673)
## Summary

This PR adds support for named expressions when analyzing `__all__`
assignments, as per https://github.com/astral-sh/ruff/issues/7672. It
also loosens the enforcement around assignments like: `__all__ =
list(some_other_expression)`. We shouldn't flag these as invalid, even
though we can't analyze the members, since we _know_ they evaluate to a
`list`.

Closes https://github.com/astral-sh/ruff/issues/7672.

## Test Plan

`cargo test`
2023-09-27 00:36:55 -04:00
Charlie Marsh
93b5d8a0fb
Implement our own small-integer optimization (#7584)
## Summary

This is a follow-up to #7469 that attempts to achieve similar gains, but
without introducing malachite. Instead, this PR removes the `BigInt`
type altogether, instead opting for a simple enum that allows us to
store small integers directly and only allocate for values greater than
`i64`:

```rust
/// A Python integer literal. Represents both small (fits in an `i64`) and large integers.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct Int(Number);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Number {
    /// A "small" number that can be represented as an `i64`.
    Small(i64),
    /// A "large" number that cannot be represented as an `i64`.
    Big(Box<str>),
}

impl std::fmt::Display for Number {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Number::Small(value) => write!(f, "{value}"),
            Number::Big(value) => write!(f, "{value}"),
        }
    }
}
```

We typically don't care about numbers greater than `isize` -- our only
uses are comparisons against small constants (like `1`, `2`, `3`, etc.),
so there's no real loss of information, except in one or two rules where
we're now a little more conservative (with the worst-case being that we
don't flag, e.g., an `itertools.pairwise` that uses an extremely large
value for the slice start constant). For simplicity, a few diagnostics
now show a dedicated message when they see integers that are out of the
supported range (e.g., `outdated-version-block`).

An additional benefit here is that we get to remove a few dependencies,
especially `num-bigint`.

## Test Plan

`cargo test`
2023-09-25 15:13:21 +00:00
Charlie Marsh
4d6f5ff0a7
Remove Int wrapper type from parser (#7577)
## Summary

This is only used for the `level` field in relative imports (e.g., `from
..foo import bar`). It seems unnecessary to use a wrapper here, so this
PR changes to a `u32` directly.
2023-09-21 17:01:44 +00:00
Charlie Marsh
5df0326bc8
Treat parameters-with-newline as empty in function formatting (#7550)
## Summary

If a function has no parameters (and no comments within the parameters'
`()`), we're supposed to wrap the return annotation _whenever_ it
breaks. However, our `empty_parameters` test didn't properly account for
the case in which the parameters include a newline (but no other
content), like:

```python
def get_dashboards_hierarchy(
) -> Dict[Type['BaseDashboard'], List[Type['BaseDashboard']]]:
    """Get hierarchy of dashboards classes.

    Returns:
        Dict of dashboards classes.
    """
    dashboards_hierarchy = {}
```

This PR fixes that detection. Instead of lexing, it now checks if the
parameters itself is empty (or if it contains comments).

Closes https://github.com/astral-sh/ruff/issues/7457.
2023-09-20 16:20:22 -04:00
konsti
2cbe1733c8
Use CommentRanges in backwards lexing (#7360)
## Summary

The tokenizer was split into a forward and a backwards tokenizer. The
backwards tokenizer uses the same names as the forwards ones (e.g.
`next_token`). The backwards tokenizer gets the comment ranges that we
already built to skip comments.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-09-16 03:21:45 +00:00
Charlie Marsh
ec2f229a45
Remove ExprContext from ComparableExpr (#7362)
`ComparableExpr` includes the `ExprContext` field on an expression, so,
e.g., the two tuples in `(a, b) = (a, b)` won't be considered equal.
Similarly, the tuples in `[(a, b) for (a, b) in c]` _also_ wouldn't be
considered equal. I find this behavior surprising, since
`ComparableExpr` is intended to allow you to compare two ASTs, but
`ExprContext` is really encoding information about the broader context
for the expression.
2023-09-14 15:40:02 +00:00
konsti
f4c7bff36b
Don't reorder parameters in function calls (#7268)
## Summary

In `f(*args, a=b, *args2, **kwargs)` the args (`*args`, `*args2`) and
keywords (`a=b`, `**kwargs`) are interleaved, which we previously didn't
handle.

Fixes #6498

**main**

| project | similarity index | total files | changed files |

|--------------|------------------:|------------------:|------------------:|
| cpython | 0.76083 | 1789 | 1632 |
| **django** | 0.99966 | 2760 | 58 |
| transformers | 0.99930 | 2587 | 447 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99983 | 3496 | 18 |
| warehouse | 0.99825 | 648 | 22 |
| zulip | 0.99950 | 1437 | 27 |

**PR**

| project | similarity index | total files | changed files |

|--------------|------------------:|------------------:|------------------:|
| cpython | 0.76083 | 1789 | 1632 |
| **django** | 0.99967 | 2760 | 53 |
| transformers | 0.99930 | 2587 | 447 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99983 | 3496 | 18 |
| warehouse | 0.99825 | 648 | 22 |
| zulip | 0.99950 | 1437 | 27 |


## Test Plan

New fixtures
2023-09-13 09:01:49 +00:00
konsti
56440ad835
Introduce ArgOrKeyword to keep call parameter order (#7302)
## Motivation

The `ast::Arguments` for call argument are split into positional
arguments (args) and keywords arguments (keywords). We currently assume
that call consists of first args and then keywords, which is generally
the case, but not always:

```python
f(*args, a=2, *args2, **kwargs)

class A(*args, a=2, *args2, **kwargs):
    pass
```

The consequence is accidentally reordering arguments
(https://github.com/astral-sh/ruff/pull/7268).

## Summary

`Arguments::args_and_keywords` returns an iterator of an `ArgOrKeyword`
enum that yields args and keywords in the correct order. I've fixed the
obvious `args` and `keywords` usages, but there might be some cases with
wrong assumptions remaining.

## Test Plan

The generator got new test cases, otherwise the stacked PR
(https://github.com/astral-sh/ruff/pull/7268) which uncovered this.
2023-09-13 08:45:46 +00:00
Dhruv Manilawala
04f2842e4f
Move ExprConstant::kind to StringConstant::unicode (#7180) 2023-09-06 07:39:25 +00:00
Dhruv Manilawala
fa6bff0078
Add inline documentation for Ipy* AST nodes (#7178) 2023-09-06 12:07:34 +05:30
Charlie Marsh
b0d171ac19
Supported starred exceptions in length-one tuple detection (#7080) 2023-09-03 13:31:13 +00:00
Charlie Marsh
68f605e80a
Fix WithItem ranges for parenthesized, non-as items (#6782)
## Summary

This PR attempts to address a problem in the parser related to the
range's of `WithItem` nodes in certain contexts -- specifically,
`WithItem` nodes in parentheses that do not have an `as` token after
them.

For example,
[here](https://play.ruff.rs/71be2d0b-2a04-4c7e-9082-e72bff152679):

```python
with (a, b):
    pass
```

The range of the `WithItem` `a` is set to the range of `(a, b)`, as is
the range of the `WithItem` `b`. In other words, when we have this kind
of sequence, we use the range of the entire parenthesized context,
rather than the ranges of the items themselves.

Note that this also applies to cases
[like](https://play.ruff.rs/c551e8e9-c3db-4b74-8cc6-7c4e3bf3713a):

```python
with (a, b, c as d):
    pass
```

You can see the issue in the parser here:

```rust
#[inline]
WithItemsNoAs: Vec<ast::WithItem> = {
    <location:@L> <all:OneOrMore<Test<"all">>> <end_location:@R> => {
        all.into_iter().map(|context_expr| ast::WithItem { context_expr, optional_vars: None, range: (location..end_location).into() }).collect()
    },
}
```

Fixing this issue is... very tricky. The naive approach is to use the
range of the `context_expr` as the range for the `WithItem`, but that
range will be incorrect when the `context_expr` is itself parenthesized.
For example, _that_ solution would fail here, since the range of the
first `WithItem` would be that of `a`, rather than `(a)`:

```python
with ((a), b):
    pass
```

The `with` parsing in general is highly precarious due to ambiguities in
the grammar. Changing it in _any_ way seems to lead to an ambiguous
grammar that LALRPOP fails to translate. Consensus seems to be that we
don't really understand _why_ the current grammar works (i.e., _how_ it
avoids these ambiguities as-is).

The solution implemented here is to avoid changing the grammar itself,
and instead change the shape of the nodes returned by various rules in
the grammar. Specifically, everywhere that we return `Expr`, we instead
return `ParenthesizedExpr`, which includes a parenthesized range and the
underlying `Expr` itself. (If an `Expr` isn't parenthesized, the ranges
will be equivalent.) In `WithItemsNoAs`, we can then use the
parenthesized range as the range for the `WithItem`.
2023-08-31 16:21:29 +01:00
Valeriy Savchenko
26d53c56a2
[refurb] Implement repeated-append rule (FURB113) (#6702)
## Summary

As an initial effort with replicating `refurb` rules (#1348 ), this PR
adds support for
[FURB113](https://github.com/dosisod/refurb/blob/master/refurb/checks/builtin/list_extend.py)
and adds a new category of checks.

## Test Plan

I included a new test + checked that all other tests pass.
2023-08-28 22:51:59 +00:00
Charlie Marsh
58f5f27dc3
Add TOML files to SourceType (#6929)
## Summary

This PR adds a higher-level enum (`SourceType`) around `PySourceType` to
allow us to use the same detection path to handle TOML files. Right now,
we have ad hoc `is_pyproject_toml` checks littered around, and some
codepaths are omitting that logic altogether (like `add_noqa`). Instead,
we should always be required to check the source type and handle TOML
files as appropriate.

This PR will also help with our pre-commit capabilities. If we add
`toml` to pre-commit (to support `pyproject.toml`), pre-commit will
start to pass _other_ files to Ruff (along with `poetry.lock` and
`Pipfile` -- see
[identify](b59996304f/identify/extensions.py (L355))).
By detecting those files and handling those cases, we avoid attempting
to parse them as Python files, which would lead to pre-commit errors.
(We tried to add `toml` to pre-commit here
(https://github.com/astral-sh/ruff-pre-commit/pull/44), but had to
revert here (https://github.com/astral-sh/ruff-pre-commit/pull/45) as it
led to the pre-commit hook attempting to parse `poetry.lock` files as
Python files.)
2023-08-28 15:01:48 +00:00
Charlie Marsh
fc89976c24
Move Ranged into ruff_text_size (#6919)
## Summary

The motivation here is that this enables us to implement `Ranged` in
crates that don't depend on `ruff_python_ast`.

Largely a mechanical refactor with a lot of regex, Clippy help, and
manual fixups.

## Test Plan

`cargo test`
2023-08-27 14:12:51 -04:00
Micha Reiser
7c480236e0
Use dyn dispatch for any_over_* (#6912) 2023-08-27 15:54:01 +02:00
Charlie Marsh
15b73bdb8a
Introduce AST nodes for PatternMatchClass arguments (#6881)
## Summary

This PR introduces two new AST nodes to improve the representation of
`PatternMatchClass`. As a reminder, `PatternMatchClass` looks like this:

```python
case Point2D(0, 0, x=1, y=2):
  ...
```

Historically, this was represented as a vector of patterns (for the `0,
0` portion) and parallel vectors of keyword names (for `x` and `y`) and
values (for `1` and `2`). This introduces a bunch of challenges for the
formatter, but importantly, it's also really different from how we
represent similar nodes, like arguments (`func(0, 0, x=1, y=2)`) or
parameters (`def func(x, y)`).

So, firstly, we now use a single node (`PatternArguments`) for the
entire parenthesized region, making it much more consistent with our
other nodes. So, above, `PatternArguments` would be `(0, 0, x=1, y=2)`.

Secondly, we now have a `PatternKeyword` node for `x=1` and `y=2`. This
is much more similar to the how `Keyword` is represented within
`Arguments` for call expressions.

Closes https://github.com/astral-sh/ruff/issues/6866.

Closes https://github.com/astral-sh/ruff/issues/6880.
2023-08-26 14:45:44 +00:00
Dhruv Manilawala
d1f07008f7
Rename Notebook related symbols (#6862)
This PR renames the following symbols:

* `PySourceType::Jupyter` -> `PySourceType::Ipynb`
* `SourceKind::Jupyter` -> `SourceKind::IpyNotebook`
* `JupyterIndex` -> `NotebookIndex`
2023-08-25 11:40:54 +05:30