Commit graph

432 commits

Author SHA1 Message Date
Charlie Marsh
35ba3c91ce
Use u64 instead of i64 in Int type (#11356)
## Summary

I believe the value here is always unsigned, since we represent `-42` as
a unary operator on `42`.
2024-05-10 13:35:15 +00:00
Micha Reiser
22639c5a2a
Move all module from the AST to the semantic crate (#11330) 2024-05-08 08:56:50 +00:00
Alex Waygood
6774f27f4b
Refactor the ExprDict node (#11267)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-05-07 11:46:10 +00:00
Dhruv Manilawala
28cc71fb6b
Remove cyclic dev dependency with the parser crate (#11261)
## Summary

This PR removes the cyclic dev dependency some of the crates had with
the parser crate.

The cyclic dependencies are:
* `ruff_python_ast` has a **dev dependency** on `ruff_python_parser` and
`ruff_python_parser` directly depends on `ruff_python_ast`
* `ruff_python_trivia` has a **dev dependency** on `ruff_python_parser`
and `ruff_python_parser` has an indirect dependency on
`ruff_python_trivia` (`ruff_python_parser` - `ruff_python_ast` -
`ruff_python_trivia`)

Specifically, this PR does the following:
* Introduce two new crates
* `ruff_python_ast_integration_tests` and move the tests from the
`ruff_python_ast` crate which uses the parser in this crate
* `ruff_python_trivia_integration_tests` and move the tests from the
`ruff_python_trivia` crate which uses the parser in this crate

### Motivation

The main motivation for this PR is to help development. Before this PR,
`rust-analyzer` wouldn't provide any intellisense in the
`ruff_python_parser` crate regarding the symbols in `ruff_python_ast`
crate.

```
[ERROR][2024-05-03 13:47:06] .../vim/lsp/rpc.lua:770	"rpc"	"/Users/dhruv/.cargo/bin/rust-analyzer"	"stderr"	"[ERROR project_model::workspace] cyclic deps: ruff_python_parser(Idx::<CrateData>(50)) -> ruff_python_ast(Idx::<CrateData>(37)), alternative path: ruff_python_ast(Idx::<CrateData>(37)) -> ruff_python_parser(Idx::<CrateData>(50))\n"
```

## Test Plan

Check the logs of `rust-analyzer` to not see any signs of cyclic
dependency.
2024-05-07 09:24:57 +00:00
Micha Reiser
6a1e555537
Upgrade to Rust 1.78 (#11260) 2024-05-03 12:46:21 +00:00
Charlie Marsh
9a1f6f6762
Avoid allocations for isort module names (#11251)
## Summary

Random refactor I noticed when investigating the F401 changes. We don't
need to allocate in most cases here.
2024-05-02 19:17:56 +00:00
Micha Reiser
64700d296f
Remove ImportMap (#11234)
## Summary

This PR removes the `ImportMap` implementation and all its routing
through ruff.

The import map was added in https://github.com/astral-sh/ruff/pull/3243
but we then never ended up using it to do cross file analysis.

We are now working on adding multifile analysis to ruff, and revisit
import resolution as part of it.


```
hyperfine --warmup 10 --runs 20 --setup "./target/release/ruff clean" \
              "./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I" \
              "./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I" 
Benchmark 1: ./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
  Time (mean ± σ):      37.6 ms ±   0.9 ms    [User: 52.2 ms, System: 63.7 ms]
  Range (min … max):    35.8 ms …  39.8 ms    20 runs
 
Benchmark 2: ./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
  Time (mean ± σ):      36.0 ms ±   0.7 ms    [User: 50.3 ms, System: 58.4 ms]
  Range (min … max):    34.5 ms …  37.6 ms    20 runs
 
Summary
  ./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I ran
    1.04 ± 0.03 times faster than ./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
```

I suspect that the performance improvement should even be more
significant for users that otherwise don't have any diagnostics.


```
hyperfine --warmup 10 --runs 20 --setup "cd ../ecosystem/airflow && ../../ruff/target/release/ruff clean" \
              "./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I" \
              "./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I" 
Benchmark 1: ./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I
  Time (mean ± σ):      53.7 ms ±   1.8 ms    [User: 68.4 ms, System: 63.0 ms]
  Range (min … max):    51.1 ms …  58.7 ms    20 runs
 
Benchmark 2: ./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I
  Time (mean ± σ):      50.8 ms ±   1.4 ms    [User: 50.7 ms, System: 60.9 ms]
  Range (min … max):    48.5 ms …  55.3 ms    20 runs
 
Summary
  ./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I ran
    1.06 ± 0.05 times faster than ./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I

```

## Test Plan

`cargo test`
2024-05-02 11:26:02 -07:00
Dhruv Manilawala
04a922866a
Add basic docs for the parser crate (#11199)
## Summary

This PR adds a basic README for the `ruff_python_parser` crate and
updates the CONTRIBUTING docs with the fuzzer and benchmark section.

Additionally, it also updates some inline documentation within the
parser crate and splits the `parse_program` function into
`parse_single_expression` and `parse_module` which will be called by
matching against the `Mode`.

This PR doesn't go into too much internal detail around the parser logic
due to the following reasons:
1. Where should the docs go? Should it be as a module docs in `lib.rs`
or in README?
2. The parser is still evolving and could include a lot of refactors
with the future work (feedback loop and improved error recovery and
resilience)

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-04-29 17:08:07 +00:00
Alex Waygood
87929ad5f1
Add convenience methods for iterating over all parameter nodes in a function (#11174) 2024-04-29 10:36:15 +00:00
Micha Reiser
7cd065e4a2
Kick off Red-knot (#10849)
Co-authored-by: Carl Meyer <carl@oddbird.net>
Co-authored-by: Carl Meyer <carl@astral.sh>
2024-04-27 08:34:00 +00:00
Carl Meyer
845ba7cf5f
Make ImportFrom level just a u32 (#11170) 2024-04-26 20:38:35 -06:00
Jelle Zijlstra
cd3e319538
Add support for PEP 696 syntax (#11120) 2024-04-26 09:47:29 +02:00
Alex Waygood
269014a539
Delete unused methods from Parameters (#11150) 2024-04-25 22:11:24 +01:00
Carl Meyer
c80b9a4a90
Reduce size of Stmt from 144 to 120 bytes (#11051)
## Summary

I happened to notice that we box `TypeParams` on `StmtClassDef` but not
on `StmtFunctionDef` and wondered why, since `StmtFunctionDef` is bigger
and sets the size of `Stmt`.

@charliermarsh found that at the time we started boxing type params on
classes, classes were the largest statement type (see #6275), but that's
no longer true.

So boxing type-params also on functions reduces the overall size of
`Stmt`.

## Test Plan

The `<=` size tests are a bit irritating (since their failure doesn't
tell you the actual size), but I manually confirmed that the size is
actually 120 now.
2024-04-19 17:02:17 -06:00
Dhruv Manilawala
13ffb5bc19
Replace LALRPOP parser with hand-written parser (#10036)
(Supersedes #9152, authored by @LaBatata101)

## Summary

This PR replaces the current parser generated from LALRPOP to a
hand-written recursive descent parser.

It also updates the grammar for [PEP
646](https://peps.python.org/pep-0646/) so that the parser outputs the
correct AST. For example, in `data[*x]`, the index expression is now a
tuple with a single starred expression instead of just a starred
expression.

Beyond the performance improvements, the parser is also error resilient
and can provide better error messages. The behavior as seen by any
downstream tools isn't changed. That is, the linter and formatter can
still assume that the parser will _stop_ at the first syntax error. This
will be updated in the following months.

For more details about the change here, refer to the PR corresponding to
the individual commits and the release blog post.

## Test Plan

Write _lots_ and _lots_ of tests for both valid and invalid syntax and
verify the output.

## Acknowledgements

- @MichaReiser for reviewing 100+ parser PRs and continuously providing
guidance throughout the project
- @LaBatata101 for initiating the transition to a hand-written parser in
#9152
- @addisoncrump for implementing the fuzzer which helped
[catch](https://github.com/astral-sh/ruff/pull/10903)
[a](https://github.com/astral-sh/ruff/pull/10910)
[lot](https://github.com/astral-sh/ruff/pull/10966)
[of](https://github.com/astral-sh/ruff/pull/10896)
[bugs](https://github.com/astral-sh/ruff/pull/10877)

---------

Co-authored-by: Victor Hugo Gomes <labatata101@linuxmail.org>
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-04-18 17:57:39 +05:30
Alex Waygood
f779babc5f
Improve handling of builtin symbols in linter rules (#10919)
Add a new method to the semantic model to simplify and improve the correctness of a common pattern
2024-04-16 11:37:31 +01:00
renovate[bot]
388658efdb
Update pre-commit dependencies (#10698)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Zanie Blue <contact@zanie.dev>
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
2024-04-06 23:00:41 +00:00
Charlie Marsh
7fb5f47efe
Respect # noqa directives on __all__ openers (#10798)
## Summary

Historically, given:

```python
__all__ = [  # noqa: F822
    "Bernoulli",
    "Beta",
    "Binomial",
]
```

The F822 violations would be attached to the `__all__`, so this `# noqa`
would be enforced for _all_ definitions in the list. This changed in
https://github.com/astral-sh/ruff/pull/10525 for the better, in that we
now use the range of each string. But these `# noqa` directives stopped
working.

This PR sets the `__all__` as a parent range in the diagnostic, so that
these directives are respected once again.

Closes https://github.com/astral-sh/ruff/issues/10795.

## Test Plan

`cargo test`
2024-04-06 14:51:17 +00:00
Dhruv Manilawala
99dd3a8ab0
Implement as_str & Display for all operator enums (#10691)
## Summary

This PR adds the `as_str` implementation for all the operator methods.
It already exists for `CmpOp` which is being [used in the
linter](ffcd77860c/crates/ruff_linter/src/rules/flake8_simplify/rules/key_in_dict.rs (L117))
and it makes sense to implement it for the rest as well. This will also
be utilized in error messages for the new parser.
2024-04-02 10:34:36 +00:00
Dhruv Manilawala
eee2d5b915
Remove unused operator methods and impl (#10690)
## Summary

This PR removes unused operator methods and impl traits. There is
already the `is_macro::Is` implementation for all the operators and this
seems unnecessary.
2024-04-02 15:53:20 +05:30
Alex Waygood
a06ffeb54e
Track ranges of names inside __all__ definitions (#10525) 2024-03-22 18:38:40 +00:00
Charlie Marsh
60fd98eb2f
Update Rust to v1.77 (#10510) 2024-03-21 12:10:33 -04:00
Alex Waygood
7caf0d064a
Simplify formatting of strings by using flags from the AST nodes (#10489) 2024-03-20 16:16:54 +00:00
Alex Waygood
162d2eb723
Track casing of r-string prefixes in the tokenizer and AST (#10314)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-03-18 17:18:04 +00:00
Dhruv Manilawala
5f40371ffc
Use ExprFString for StringLike::FString variant (#10311)
## Summary

This PR updates the `StringLike::FString` variant to use `ExprFString`
instead of `FStringLiteralElement`.

For context, the reason it used `FStringLiteralElement` is that the node
is actually the string part of an f-string ("foo" in `f"foo{x}"`). But,
this is inconsistent with other variants where the captured value is the
_entire_ string.

This is also problematic w.r.t. implicitly concatenated strings. Any
rules which work with `StringLike::FString` doesn't account for the
string part in an implicitly concatenated f-strings. For example, we
don't flag confusable character in the first part of `"𝐁ad" f"𝐁ad
string"`, but only the second part
(https://play.ruff.rs/16071c4c-a1dd-4920-b56f-e2ce2f69c843).

### Update `PYI053`

_This is included in this PR because otherwise it requires a temporary
workaround to be compatible with the old logic._

This PR also updates the `PYI053` (`string-or-bytes-too-long`) rule for
f-string to consider _all_ the visible characters in a f-string,
including the ones which are implicitly concatenated. This is consistent
with implicitly concatenated strings and bytes.

For example,

```python
def foo(
	# We count all the characters here
    arg1: str = '51 character ' 'stringgggggggggggggggggggggggggggggggg',
	# But not here because of the `{x}` replacement field which _breaks_ them up into two chunks
    arg2: str = f'51 character {x} stringgggggggggggggggggggggggggggggggggggggggggggg',
) -> None: ...
```

This PR fixes it to consider all _visible_ characters inside an f-string
which includes expressions as well.

fixes: #10310 
fixes: #10307 

## Test Plan

Add new test cases and update the snapshots.

## Review

To facilitate the review process, the change have been split into two
commits: one which has the code change while the other has the test
cases and updated snapshots.
2024-03-14 13:30:22 +05:30
Alex Waygood
c2e15f38ee
Unify enums used for internal representation of quoting style (#10383) 2024-03-13 17:19:17 +00:00
Dhruv Manilawala
32d6f84e3d
Add methods to iter over f-string elements (#10309)
## Summary

This PR adds methods on `FString` to iterate over the two different kind
of elements it can have - literals and expressions. This is similar to
the methods we have on `ExprFString`.

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-03-13 08:46:55 +00:00
Charlie Marsh
dbf82233b8
Gate f-string struct size test for Rustc < 1.76 (#10371)
Closes https://github.com/astral-sh/ruff/issues/10319.
2024-03-12 15:46:36 -04:00
Alex Waygood
1d97f27335
Start tracking quoting style in the AST (#10298)
This PR modifies our AST so that nodes for string literals, bytes literals and f-strings all retain the following information:
- The quoting style used (double or single quotes)
- Whether the string is triple-quoted or not
- Whether the string is raw or not

This PR is a followup to #10256. Like with that PR, this PR does not, in itself, fix any bugs. However, it means that we will have the necessary information to preserve quoting style and rawness of strings in the `ExprGenerator` in a followup PR, which will allow us to provide a fix for https://github.com/astral-sh/ruff/issues/7799.

The information is recorded on the AST nodes using a bitflag field on each node, similarly to how we recorded the information on `Tok::String`, `Tok::FStringStart` and `Tok::FStringMiddle` tokens in #10298. Rather than reusing the bitflag I used for the tokens, however, I decided to create a custom bitflag for each AST node.

Using different bitflags for each node allows us to make invalid states unrepresentable: it is valid to set a `u` prefix on a string literal, but not on a bytes literal or an f-string. It also allows us to have better debug representations for each AST node modified in this PR.
2024-03-08 19:11:47 +00:00
Micha Reiser
8ea5b08700
refactor: Use QualifiedName for Imported::call_path (#10214)
## Summary

When you try to remove an internal representation leaking into another
type and end up rewriting a simple version of `smallvec`.

The goal of this PR is to replace the `Box<[&'a str]>` with
`Box<QualifiedName>` to avoid that the internal `QualifiedName`
representation leaks (and it gives us a nicer API too). However, doing
this when `QualifiedName` uses `SmallVec` internally gives us all sort
of funny lifetime errors. I was lost but @BurntSushi came to rescue me.
He figured out that `smallvec` has a variance problem which is already
tracked in https://github.com/servo/rust-smallvec/issues/146

To fix the variants problem, I could use the smallvec-2-alpha-4 or
implement our own smallvec. I went with implementing our own small vec
for this specific problem. It obviously isn't as sophisticated as
smallvec (only uses safe code), e.g. it doesn't perform any size
optimizations, but it does its job.

Other changes:

* Removed `Imported::qualified_name` (the version that returns a
`String`). This can be replaced by calling `ToString` on the qualified
name.
* Renamed `Imported::call_path` to `qualified_name` and changed its
return type to `&QualifiedName`.
* Renamed `QualifiedName::imported` to `user_defined` which is the more
common term when talking about builtins vs the rest/user defined
functions.


## Test plan

`cargo test`
2024-03-06 09:55:59 +01:00
Micha Reiser
184241f99a
Remove Expr postfix from ExprNamed, ExprIf, and ExprGenerator (#10229)
The expression types in our AST are called `ExprYield`, `ExprAwait`,
`ExprStringLiteral` etc, except `ExprNamedExpr`, `ExprIfExpr` and
`ExprGenratorExpr`. This seems to align with [Python AST's
naming](https://docs.python.org/3/library/ast.html) but feels
inconsistent and excessive.

This PR removes the `Expr` postfix from `ExprNamedExpr`, `ExprIfExpr`,
and `ExprGeneratorExpr`.
2024-03-04 12:55:01 +01:00
Micha Reiser
a6d892b1f4
Split CallPath into QualifiedName and UnqualifiedName (#10210)
## Summary

Charlie can probably explain this better than I but it turns out,
`CallPath` is used for two different things:

* To represent unqualified names like `version` where `version` can be a
local variable or imported (e.g. `from sys import version` where the
full qualified name is `sys.version`)
* To represent resolved, full qualified names

This PR splits `CallPath` into two types to make this destinction clear.

> Note: I haven't renamed all `call_path` variables to `qualified_name`
or `unqualified_name`. I can do that if that's welcomed but I first want
to get feedback on the approach and naming overall.

## Test Plan

`cargo test`
2024-03-04 09:06:51 +00:00
Micha Reiser
db25a563f7
Remove unneeded lifetime bounds (#10213)
## Summary

This PR removes the unneeded lifetime `'b` from many of our `Visitor`
implementations.

The lifetime is unneeded because it is only constraint by `'a`, so we
can use `'a` directly.

## Test Plan

`cargo build`
2024-03-03 18:12:11 +00:00
Micha Reiser
e725b6fdaf
CallPath newtype wrapper (#10201)
## Summary

This PR changes the `CallPath` type alias to a newtype wrapper. 

A newtype wrapper allows us to limit the API and to experiment with
alternative ways to implement matching on `CallPath`s.



## Test Plan

`cargo test`
2024-03-03 16:54:24 +01:00
Jane Lewis
0293908b71
Implement RUF028 to detect useless formatter suppression comments (#9899)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
Fixes #6611

## Summary

This lint rule spots comments that are _intended_ to suppress or enable
the formatter, but will be ignored by the Ruff formatter.

We borrow some functions the formatter uses for determining comment
placement / putting them in context within an AST.

The analysis function uses an AST visitor to visit each comment and
attach it to the AST. It then uses that context to check:
1. Is this comment in an expression?
2. Does this comment have bad placement? (e.g. a `# fmt: skip` above a
function instead of at the end of a line)
3. Is this comment redundant?
4. Does this comment actually suppress any code?
5. Does this comment have ambiguous placement? (e.g. a `# fmt: off`
above an `else:` block)

If any of these are true, a violation is thrown. The reported reason
depends on the order of the above check-list: in other words, a `# fmt:
skip` comment on its own line within a list expression will be reported
as being in an expression, since that reason takes priority.

The lint suggests removing the comment as an unsafe fix, regardless of
the reason.

## Test Plan

A snapshot test has been created.
2024-02-28 19:21:06 +00:00
Micha Reiser
77c5561646
Add parenthesized flag to ExprTuple and ExprGenerator (#9614) 2024-02-26 15:35:20 +00:00
Charlie Marsh
0304623878
[perflint] Catch a wider range of mutations in PERF101 (#9955)
## Summary

This PR ensures that if a list `x` is modified within a `for` loop, we
avoid flagging `list(x)` as unnecessary. Previously, we only detected
calls to exactly `.append`, and they couldn't be nested within other
statements.

Closes https://github.com/astral-sh/ruff/issues/9925.
2024-02-12 12:17:55 -05:00
Charlie Marsh
6f0e4ad332
Remove unnecessary string cloning from the parser (#9884)
Closes https://github.com/astral-sh/ruff/issues/9869.
2024-02-09 16:03:27 -05:00
Charlie Marsh
49fe1b85f2
Reduce size of Expr from 80 to 64 bytes (#9900)
## Summary

This PR reduces the size of `Expr` from 80 to 64 bytes, by reducing the
sizes of...

- `ExprCall` from 72 to 56 bytes, by using boxed slices for `Arguments`.
- `ExprCompare` from 64 to 48 bytes, by using boxed slices for its
various vectors.

In testing, the parser gets a bit faster, and the linter benchmarks
improve quite a bit.
2024-02-09 02:53:13 +00:00
Micha Reiser
fe7d965334
Reduce Result<Tok, LexicalError> size by using Box<str> instead of String (#9885) 2024-02-08 20:36:22 +00:00
Micha Reiser
688177ff6a
Use Rust 1.76 (#9897) 2024-02-08 18:20:08 +00:00
Charlie Marsh
daae28efc7
Respect async with in timeout-without-await (#9859)
Closes https://github.com/astral-sh/ruff/issues/9855.
2024-02-06 12:04:24 -05:00
Dhruv Manilawala
36b752876e
Implement AnyNode/AnyNodeRef for FStringFormatSpec (#9836)
## Summary

This PR adds the `AnyNode` and `AnyNodeRef` implementation for
`FStringFormatSpec` node which will be required in the f-string
formatting.

The main usage for this is so that we can pass in the node directly to
`suppressed_node` in case debug expression is used to format is as
verbatim text.
2024-02-05 19:23:43 +00:00
Charlie Marsh
ea1c089652
Use AhoCorasick to speed up quote match (#9773)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

When I was looking at the v0.2.0 release, this method showed up in a
CodSpeed regression (we were calling it more), so I decided to quickly
look at speeding it up. @BurntSushi suggested using Aho-Corasick, and it
looks like it's about 7 or 8x faster:

```text
Parser/AhoCorasick      time:   [8.5646 ns 8.5914 ns 8.6191 ns]
Parser/Iterator         time:   [64.992 ns 65.124 ns 65.271 ns]
```

## Test Plan

`cargo test`
2024-02-02 09:57:39 -05:00
Micha Reiser
ce14f4dea5
Range formatting API (#9635) 2024-01-31 11:13:37 +01:00
Micha Reiser
5fe0fdd0a8
Delete is_node_with_body method (#9643) 2024-01-25 14:41:13 +00:00
Alex Waygood
a1e65a92bd
Move is_tuple_parenthesized from the formatter to ruff_python_ast (#9533)
This allows it to be used in the linter as well as the formatter. It
will be useful in #9474
2024-01-15 16:10:40 +00:00
Chammika Mannakkara
0003c730e0
[flake8-simplify] Implement enumerate-for-loop (SIM113) (#7777)
Implements SIM113 from #998

Added tests
Limitations 
   - No fix yet
   - Only flag cases where index variable immediately precede `for` loop

@charliermarsh please review and let me know any improvements

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-01-14 11:00:59 -05:00
Charlie Marsh
009430e034
[ruff] Avoid treating named expressions as static keys (RUF011) (#9494)
Closes https://github.com/astral-sh/ruff/issues/9487.
2024-01-12 14:33:45 -05:00
Charlie Marsh
a31a314b2b
Account for possibly-empty f-string values in truthiness logic (#9484)
Closes https://github.com/astral-sh/ruff/issues/9479.
2024-01-11 21:16:19 -05:00