70 commits

Author SHA1 Message Date
Max Mynter
b4a1ebdfe3
[semantic-syntax-tests] IrrefutableCasePattern, SingleStarredAssignment, WriteToDebug, InvalidExpression (#17748)
Re: #17526 

## Summary

Add integration test for semantic syntax for `IrrefutableCasePattern`,
`SingleStarredAssignment`, `WriteToDebug`, and `InvalidExpression`.

## Notes
- Following @ntBre's suggestion, I will keep the tests coming in batches
like this over the next few days, in separate PRs, to keep the review
load per PR manageable while also not opening too many PRs.

- I did not add a test for `del __debug__`, which is one of the examples
in `crates/ruff_python_parser/src/semantic_errors.rs:1051`.
For Python versions `<= 3.8` there is no error, and for `>= 3.9` the
error is not `WriteToDebug` but `SyntaxError: cannot delete __debug__ on
Python 3.9 (syntax was removed in 3.9)`.

- The `blacken-docs` bypass is necessary because otherwise the test does
not pass pre-commit checks; but we want to check for this faulty syntax.
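
For reference, a minimal sketch of how CPython itself reacts to this kind of input, using only the built-in `compile`. The helper name `syntax_error_message` is made up for this example, and the mapping of each snippet to a Ruff error name is an assumption.

```python
def syntax_error_message(source):
    """Return CPython's SyntaxError message for `source`, or None if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err.msg
    return None


# Presumably the kind of input `WriteToDebug` is about: assigning to `__debug__`.
print(syntax_error_message("__debug__ = 1"))

# Presumably the kind of input `SingleStarredAssignment` is about.
print(syntax_error_message("*a = [1, 2]"))

# Ordinary code compiles, so the helper returns None.
print(syntax_error_message("x = 1"))
```

The exact wording of the messages depends on the interpreter version, so the sketch only relies on whether an error is raised at all.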


## Test Plan
This is a test.
2025-05-09 14:54:05 -04:00
Brent Westbrook
4510a236d3
Default to latest supported Python version for version-related syntax errors (#17529)
## Summary

This PR partially addresses #16418 via the following:

- `LinterSettings::unresolved_python_version` is now a `TargetVersion`,
which is a thin wrapper around an `Option<PythonVersion>`
- `Checker::target_version` now calls `TargetVersion::linter_version`
internally, which in turn uses `unwrap_or_default` to preserve the
current default behavior
- Calls to the parser now call `TargetVersion::parser_version`, which
calls `unwrap_or_else(PythonVersion::latest)`
- The `Checker`'s implementation of
`SemanticSyntaxContext::python_version` also uses
`TargetVersion::parser_version` to use `PythonVersion::latest` for
semantic errors

In short, all lint rule behavior should be unchanged, but we default to
the latest Python version for the new syntax errors, which should
minimize confusing version-related syntax errors for users without a
version configured.
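
The fallback behavior described above can be sketched in Python (the real code is Rust). The field name `configured` and the two version constants are placeholders chosen for illustration, not values taken from the Ruff source.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Version = Tuple[int, int]

# Placeholder values for illustration only.
DEFAULT_VERSION: Version = (3, 9)
LATEST_VERSION: Version = (3, 14)


@dataclass(frozen=True)
class TargetVersion:
    """Thin wrapper around an optional, user-configured Python version."""

    configured: Optional[Version] = None

    def linter_version(self) -> Version:
        # Mirrors `unwrap_or_default`: lint rules keep the existing default.
        return self.configured if self.configured is not None else DEFAULT_VERSION

    def parser_version(self) -> Version:
        # Mirrors `unwrap_or_else(PythonVersion::latest)`: syntax errors
        # assume the newest supported version when nothing is configured.
        return self.configured if self.configured is not None else LATEST_VERSION


unset = TargetVersion()
print(unset.linter_version(), unset.parser_version())
pinned = TargetVersion((3, 8))
print(pinned.linter_version(), pinned.parser_version())
```

With no version configured the two accessors disagree on purpose; once a version is configured they return the same value.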

## Test Plan

Existing tests, which showed no changes (except for printing default
settings).
2025-05-06 10:19:13 -04:00
Max Mynter
178c882740
[semantic-syntax-tests] Add test fixtures for AwaitOutsideAsyncFunction (#17785)
Re: #17526 
## Summary
Add test fixtures for `AwaitOutsideAsync` and
`AsyncComprehensionOutsideAsyncFunction` errors.


## Test Plan
This is a test. 

2025-05-05 14:02:06 -04:00
Max Mynter
101e1a5ddd
[semantic-syntax-tests] for InvalidStarExpression, DuplicateMatchKey, and DuplicateMatchClassAttribute (#17754)
Re: #17526 

## Summary
Add integration tests for Python Semantic Syntax for
`InvalidStarExpression`, `DuplicateMatchKey`, and
`DuplicateMatchClassAttribute`.

## Note
- Red knot integration tests for `DuplicateMatchKey` already exist in
lines 89-101.

## Test Plan
This is a test.
2025-05-05 17:30:16 +00:00
Micha Reiser
fa628018b2
Use #[expect(lint)] over #[allow(lint)] where possible (#17822) 2025-05-03 21:20:31 +02:00
Max Mynter
f584b66824
Expand Semantic Syntax Coverage (#17725)
Re: #17526 

## Summary
Adds tests to red knot and `linter.rs` for the semantic syntax. 

Specifically add tests for `ReboundComprehensionVariable`,
`DuplicateTypeParameter`, and `MultipleCaseAssignment`.

Refactor the `test_async_comprehension_in_sync_comprehension` →
`test_semantic_error` to be more general for all semantic syntax test
cases.

## Test Plan
This is a test.

## Question
I'm happy to contribute more tests in the coming days. 

Should that happen here or should we merge this PR such that the
refactor `test_async_comprehension_in_sync_comprehension` →
`test_semantic_error` is available on main and others can chime in, too?
2025-04-30 10:14:08 -04:00
Dylan
ae7691b026
Add Python 3.14 to configuration options (#17647)
A small PR that just updates the various settings/configurations to
allow Python 3.14. At the moment selecting that target version will
have no impact compared to Python 3.13 - except that a warning
is emitted if the user does so with `preview` disabled.
2025-04-28 16:29:00 -05:00
Brent Westbrook
01a31c08f5
Add config option to disable typing_extensions imports (#17611)
Summary
--

This PR resolves https://github.com/astral-sh/ruff/issues/9761 by adding
a linter configuration option to disable
`typing_extensions` imports. As mentioned [here], it would be ideal if
we could
detect whether or not `typing_extensions` is available as a dependency
automatically, but this seems like a much easier fix in the meantime.

The default for the new option, `typing-extensions`, is `true`,
preserving the current behavior. Setting it to `false` will bail out of
the new `Checker::typing_importer` method (refactored from the
`Checker::import_from_typing` method in
https://github.com/astral-sh/ruff/pull/17340) with `None`, which is then
handled specially by each rule that calls it.
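
A rough Python sketch of that decision, under the assumption that the method picks `typing` when the target version already has the member and only falls back to `typing_extensions` when the option allows it. The function signature here is hypothetical and much simpler than the real Rust method.

```python
from typing import Optional, Tuple


def typing_importer(
    added_in: Tuple[int, int],
    target_version: Tuple[int, int],
    typing_extensions_enabled: bool = True,
) -> Optional[str]:
    """Return the module to import a typing member from, or None to bail out."""
    if target_version >= added_in:
        return "typing"
    if typing_extensions_enabled:
        return "typing_extensions"
    # Option disabled and the member is too new: the calling rule must cope.
    return None


# `typing.Self` was added in Python 3.11.
print(typing_importer((3, 11), (3, 12)))
print(typing_importer((3, 11), (3, 9)))
print(typing_importer((3, 11), (3, 9), typing_extensions_enabled=False))
```

A rule that receives `None` can then skip its fix or its diagnostic, which is the "handled specially" part.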

I considered some alternatives to a config option, such as checking if
`typing_extensions` has been imported or checking for a `TYPE_CHECKING`
block we could use, but I think defaulting to allowing
`typing_extensions` imports and allowing the user to disable this with
an option is both simple to implement and pretty intuitive.

[here]:
https://github.com/astral-sh/ruff/issues/9761#issuecomment-2790492853

Test Plan
--

New linter tests exercising several combinations of Python versions and
the new config option for PYI019. I also added tests for the other
affected rules, but only in the case where the new config option is
enabled. The rules' existing tests also cover the default case.
2025-04-28 14:57:36 -04:00
Dylan
152a0b6585
Collect preview lint behaviors in separate module (#17646)
This PR collects all behavior gated under preview into a new module
`ruff_linter::preview` that exposes functions like
`is_my_new_feature_enabled` - just as is done in the formatter crate.
2025-04-28 09:12:24 -05:00
Max Mynter
3f84e75e20
Add Semantic Error Test for LateFutureImport (#17612)
Addresses a question in #17526.

## Summary
Adds a syntax error test for `__future__` import not at top of file. 

## Question: 
Is this redundant with
8d2c79276d/crates/ruff_linter/resources/test/fixtures/pyflakes/F404_0.py (L1-L8)
and
8d2c79276d/crates/ruff_linter/resources/test/fixtures/pyflakes/F404_1.py (L1-L5)

which test pyflakes `F404`?

## Test Plan
This is a test
2025-04-25 08:32:57 -04:00
Vasco Schiavo
4eecc40110
[semantic-syntax-errors] test for LoadBeforeGlobalDeclaration - ruff linter (#17592)
Hey @ntBre 

just one easy case to see if I understood the issue #17526 

Let me know if this is what you had in mind.
2025-04-24 16:14:33 -04:00
Brent Westbrook
92ecfc908b
[syntax-errors] Make async-comprehension-in-sync-comprehension more specific (#17460)
## Summary

While adding semantic error support to red-knot, I noticed duplicate
diagnostics for code like this:

```py
# error: [invalid-syntax] "cannot use an asynchronous comprehension outside of an asynchronous function on Python 3.9 (syntax was added in 3.11)"
# error: [invalid-syntax] "`asynchronous comprehension` outside of an asynchronous function"
 [reveal_type(x) async for x in AsyncIterable()]
```

Beyond the duplication, the first error message doesn't make much sense
because this syntax is _not_ allowed on Python 3.11 either.

To fix this, this PR renames the
`async-comprehension-outside-async-function` semantic syntax error to
`async-comprehension-in-sync-comprehension` and fixes the rule to avoid
applying outside of sync comprehensions at all.

## Test Plan

New linter test demonstrating the false positive. The mdtests from my red-knot 
PR also reflect this change.
2025-04-24 15:45:54 -04:00
Brent Westbrook
da32a83c9f
[syntax-errors] return outside function (#17300)
Summary
--

This PR reimplements [return-outside-function
(F706)](https://docs.astral.sh/ruff/rules/return-outside-function/) as a
semantic syntax error.

These changes are very similar to those in
https://github.com/astral-sh/ruff/pull/17298.

Test Plan
--

New linter tests, plus existing F706 tests.
2025-04-11 17:05:54 +00:00
Brent Westbrook
ffef71d106
[syntax-errors] yield, yield from, and await outside functions (#17298)
Summary
--

This PR reimplements [yield-outside-function
(F704)](https://docs.astral.sh/ruff/rules/yield-outside-function/) as a
semantic syntax error. Despite the name, this rule covers `yield from`
and `await` in addition to `yield`.
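
For comparison, CPython's own compiler rejects all three keywords outside a function. A minimal check with the built-in `compile` (the helper name `compiles` is made up for this example):

```python
def compiles(source):
    """Return True if CPython compiles `source` as a module."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Module level: all three are rejected.
for snippet in ("yield 1", "yield from []", "await x"):
    print(repr(snippet), compiles(snippet))

# A class body is not a function either.
print(compiles("class C:\n    yield 1\n"))

# Inside a function, `yield` is fine.
print(compiles("def f():\n    yield 1\n"))
```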

Test Plan
--

New linter tests, along with the existing F704 test.

---------

Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2025-04-11 10:16:23 -04:00
Brent Westbrook
058439d5d3
[syntax-errors] Async comprehension in sync comprehension (#17177)
Summary
--

Detect async comprehensions nested in sync comprehensions in async
functions before Python 3.11, when this was [changed].
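
A small sketch of the two cases, run against whatever interpreter executes it. The helper name is made up for this example; the snippets are never executed, only compiled, so `g` does not need to exist.

```python
import sys


def syntax_error_message(source):
    """Return CPython's SyntaxError message for `source`, or None if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err.msg
    return None


# Async list comprehension in a plain function: rejected by CPython.
SYNC_FUNCTION = "def f():\n    return [x async for x in g()]\n"

# Async comprehension nested in a sync comprehension inside an async
# function: accepted from Python 3.11 on.
NESTED = "async def f():\n    return [[x async for x in g()] for _ in range(3)]\n"

print(sys.version_info[:2])
print("sync function:", syntax_error_message(SYNC_FUNCTION))
print("nested in async function:", syntax_error_message(NESTED))
```

Only the second snippet is version-dependent, which is why the rule needs to know the target version.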

The actual logic of this rule is very straightforward, but properly
tracking the async scopes took a bit of work. An alternative to the
current approach is to offload the `in_async_context` check into the
`SemanticSyntaxContext` trait, but that actually required much more
extensive changes to the `TestContext` and also to ruff's semantic
model, as you can see in the changes up to
31554b473507034735bd410760fde6341d54a050. This version has the benefit
of mostly centralizing the state tracking in `SemanticSyntaxChecker`,
although there was some subtlety around deferred function body traversal
that made the changes to `Checker` more intrusive too (hence the new
linter test).

The `Checkpoint` struct/system is obviously overkill for now since it's
only tracking a single `bool`, but I thought it might be more useful
later.

[changed]: https://github.com/python/cpython/issues/77527

Test Plan
--

New inline tests and a new linter integration test.
2025-04-08 12:50:52 -04:00
Brent Westbrook
2baaedda6c
[syntax-errors] Start detecting compile-time syntax errors (#16106)
## Summary

This PR implements the "greeter" approach for checking the AST for
syntax errors emitted by the CPython compiler. It introduces two main
infrastructural changes to support all of the compile-time errors:
1. Adds a new `semantic_errors` module to the parser crate with public
`SemanticSyntaxChecker` and `SemanticSyntaxError` types
2. Embeds a `SemanticSyntaxChecker` in the `ruff_linter::Checker` for
checking these errors in ruff

As a proof of concept, it also implements detection of two syntax
errors:
1. A reimplementation of
[`late-future-import`](https://docs.astral.sh/ruff/rules/late-future-import/)
(`F404`)
2. Detection of rebound comprehension iteration variables
(https://github.com/astral-sh/ruff/issues/14395)
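
Both errors come from CPython's compiler, not from its parser, and can be reproduced with the built-in `compile`. This is a minimal sketch; the helper and constant names are made up for the example.

```python
def syntax_error_message(source):
    """Return CPython's SyntaxError message for `source`, or None if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err.msg
    return None


# A `__future__` import after another statement is rejected ...
LATE_FUTURE_IMPORT = "import os\nfrom __future__ import annotations\n"
# ... but a module docstring may come first.
DOCSTRING_THEN_FUTURE = '"""Module docstring."""\nfrom __future__ import annotations\n'
# A walrus target may not rebind the comprehension's iteration variable.
REBOUND_VARIABLE = "[(x := 0) for x in range(3)]\n"

for source in (LATE_FUTURE_IMPORT, DOCSTRING_THEN_FUTURE, REBOUND_VARIABLE):
    print(syntax_error_message(source))
```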

## Test plan
Existing F404 tests, new inline tests in the `ruff_python_parser` crate,
and a linter CLI test showing an example of the `Message` output.

I also tested in VS Code, where `preview = false` and turning off syntax
errors both disable the new errors:


![image](https://github.com/user-attachments/assets/cf453d95-04f7-484b-8440-cb812f29d45e)

And on the playground, where `preview = false` also disables the errors:


![image](https://github.com/user-attachments/assets/a97570c4-1efa-439f-9d99-a54487dd6064)


Fixes #14395

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2025-03-21 14:45:25 -04:00
Dylan
74f64d3f96
Server: Allow FixAll action in presence of version-specific syntax errors (#16848)
The single flag `has_syntax_error` on `LinterResult` is replaced with
two (private) flags: `has_valid_syntax` and
`has_no_unsupported_syntax_errors`, which record whether there are
`ParseError`s or `UnsupportedSyntaxError`s, respectively. Only the
former is used to prevent a `FixAll` action.
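
The split can be pictured with a short Python sketch (the real type is a Rust struct). The two field names follow the description above; the method names `has_no_syntax_errors` and `allows_fix_all` are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinterResult:
    # False when the parser reported at least one `ParseError`.
    has_valid_syntax: bool
    # False when there is at least one `UnsupportedSyntaxError`.
    has_no_unsupported_syntax_errors: bool

    def has_no_syntax_errors(self) -> bool:
        """'Syntax error' covers both parser and version-specific errors."""
        return self.has_valid_syntax and self.has_no_unsupported_syntax_errors

    def allows_fix_all(self) -> bool:
        """Only parser errors block a fix-all action."""
        return self.has_valid_syntax


# E.g. a `match` statement while targeting an older Python version.
version_only = LinterResult(has_valid_syntax=True, has_no_unsupported_syntax_errors=False)
print(version_only.has_no_syntax_errors(), version_only.allows_fix_all())
```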

An attempt has been made to make consistent the usage of the phrases
"valid syntax" (which seems to be used to refer only to _parser_ errors)
and "syntax error" (which refers to both _parser_ errors and
version-specific syntax errors).

Closes #16841
2025-03-20 05:09:14 -05:00
Brent Westbrook
22de00de16
[internal] Return Messages from check_path (#16837)
Summary
--

This PR updates `check_path` in the `ruff_linter` crate to return a
`Vec<Message>` instead of a `Vec<Diagnostic>`. The main motivation for
this is to make it easier to convert semantic syntax errors directly
into `Message`s rather than `Diagnostic`s in #16106. However, this also
has the benefit of keeping the preview check on unsupported syntax
errors in `check_path`, as suggested in
https://github.com/astral-sh/ruff/pull/16429#discussion_r1974748024.

All of the interesting changes are in the first commit. The second
commit just renames variables like `diagnostics` to `messages`, and the
third commit is a tiny import fix.

I also updated the `ExpandedMessage::location` field name, which caused
a few extra commits tidying up the playground code. I thought it was
nicely symmetric with `end_location`, but I'm happy to revert that too.

Test Plan
--

Existing tests. I also tested the playground and server manually.
2025-03-19 10:08:07 -04:00
Brent Westbrook
37fbe58b13
Document LinterResult::has_syntax_error and add Parsed::has_no_syntax_errors (#16443)
Summary
--

This is a follow up addressing the comments on #16425. As @dhruvmanila
pointed out, the naming is a bit tricky. I went with `has_no_errors` to
try to differentiate it from `is_valid`. It actually ends up negated in
most uses, so it would be more convenient to have `has_any_errors` or
`has_errors`, but I thought it would sound too much like the opposite of
`is_valid` in that case. I'm definitely open to suggestions here.

Test Plan
--

Existing tests.
2025-03-04 08:35:38 -05:00
Brent Westbrook
4a23756024
Avoid caching files with unsupported syntax errors (#16425) 2025-02-28 09:58:11 +01:00
Brent Westbrook
78806361fd
Start detecting version-related syntax errors in the parser (#16090)
## Summary

This PR builds on the changes in #16220 to pass a target Python version
to the parser. It also adds the `Parser::unsupported_syntax_errors` field, which
collects version-related syntax errors while parsing. These syntax
errors are then turned into `Message`s in ruff (in preview mode).

This PR only detects one syntax error (`match` statement before Python
3.10), but it has been pretty quick to extend to several other simple
errors (see #16308 for example).

## Test Plan

The current tests are CLI tests in the linter crate, but these could be
supplemented with inline parser tests after #16357.

I also tested the display of these syntax errors in VS Code:


![image](https://github.com/user-attachments/assets/062b4441-740e-46c3-887c-a954049ef26e)

![image](https://github.com/user-attachments/assets/101f55b8-146c-4d59-b6b0-922f19bcd0fa)

---------

Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
2025-02-25 23:03:48 -05:00
Brent Westbrook
e7a6c19e3a
Add per-file-target-version option (#16257)
## Summary

This PR is another step in preparing to detect syntax errors in the
parser. It introduces the new `per-file-target-version` top-level
configuration option, which holds a mapping of compiled glob patterns to
Python versions. I intend to use the
`LinterSettings::resolve_target_version` method here to pass to the
parser:


f50849aeef/crates/ruff_linter/src/linter.rs (L491-L493)
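
To make the idea of a glob-to-version mapping concrete, here is a simplified Python sketch. It uses `fnmatch` from the standard library, whose matching rules are an assumption and may differ from the compiled globs Ruff uses; the function only borrows its name from the method mentioned above.

```python
from fnmatch import fnmatch
from typing import Dict, Tuple

Version = Tuple[int, int]


def resolve_target_version(
    path: str,
    per_file_target_version: Dict[str, Version],
    default: Version,
) -> Version:
    """Return the version of the first matching pattern, else the default."""
    for pattern, version in per_file_target_version.items():
        if fnmatch(path, pattern):
            return version
    return default


overrides = {"scripts/*.py": (3, 8)}
print(resolve_target_version("scripts/tool.py", overrides, (3, 12)))
print(resolve_target_version("src/app.py", overrides, (3, 12)))
```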

## Test Plan

I added two new CLI tests to show that the `per-file-target-version` is
respected in both the formatter and the linter.
2025-02-24 08:47:13 -05:00
Dylan
f29c7b03ec
Warn on invalid noqa even when there are no diagnostics (#16178)
On `main` we warn the user if there is an invalid noqa comment[^1] and
at least one of the following holds:

- There is at least one diagnostic
- A lint rule related to `noqa`s is enabled (e.g. `RUF100`)

This is probably strange behavior from the point of view of the user, so
we now show invalid `noqa`s even when there are no diagnostics.

Closes #12831

[^1]: For the current definition of "invalid noqa comment", which may be
expanded in #12811 . This PR is independent of loc. cit. in the sense
that the CLI warnings should be consistent, regardless of which `noqa`
comments are considered invalid.
2025-02-16 13:58:18 -06:00
Charlie Marsh
c7d48e10e6
Detect empty implicit namespace packages (#14236)
## Summary

The implicit namespace package rule currently fails to detect cases like
the following:

```text
foo/
├── __init__.py
└── bar/
    └── baz/
        └── __init__.py
```

The problem is that we detect a root at `foo`, and then an independent
root at `baz`. We _would_ detect that `bar` is an implicit namespace
package, but it doesn't contain any files! So we never check it, and
have no place to raise the diagnostic.
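
One way to picture the missing check is to look at the directories lying between a package and its nearest ancestor package. This is a sketch of the idea only, not the approach taken in the PR; the function name is made up.

```python
from pathlib import PurePosixPath


def missing_init_dirs(init_files):
    """Find directories without `__init__.py` between two packages."""
    package_dirs = {PurePosixPath(f).parent for f in init_files}
    missing = set()
    for package in package_dirs:
        gap = []
        for ancestor in package.parents:
            if ancestor in package_dirs:
                # Everything collected so far sits between two packages.
                missing.update(gap)
                break
            gap.append(ancestor)
    return missing


layout = ["foo/__init__.py", "foo/bar/baz/__init__.py"]
print(missing_init_dirs(layout))
```

For the layout above this reports `foo/bar`, the directory that contains no files of its own and would otherwise never be visited.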

This PR adds detection for these kinds of nested packages, and augments
the `INP` rule to flag the `__init__.py` file above with a specialized
message. As a side effect, I've introduced a dedicated `PackageRoot`
struct which we can pass around in lieu of Yet Another `Path`.

For now, I'm only enabling this in preview (and the approach doesn't
affect any other rules). It's a bug fix, but it may end up expanding the
rule.

Closes https://github.com/astral-sh/ruff/issues/13519.
2024-11-09 22:03:34 -05:00
Micha Reiser
9f3a38d408
Extract LineIndex independent methods from Locator (#13938) 2024-10-28 07:53:41 +00:00
Micha Reiser
27c50bebec
Bump MSRV to Rust 1.80 (#13826) 2024-10-20 10:55:36 +02:00
Dhruv Manilawala
ff53db3d99
Consider VS Code cell metadata to determine valid code cells (#12864)
## Summary

This PR adds support for VS Code specific cell metadata to consider when
collecting valid code cells.

For context, Ruff only runs on valid code cells. These are the code
cells that don't contain cell magics. Previously, Ruff only used the
notebook's metadata to determine whether it's a Python notebook. But, in
VS Code, a notebook's preferred language might be Python but it could
still contain code cells for other languages. This can be determined
with the `metadata.vscode.languageId` field.
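
A simplified Python sketch of such a check on a notebook cell loaded as a dictionary. The function name is made up, and treating any line that starts with `%%` as a cell magic is an assumption that is cruder than Ruff's actual detection.

```python
def looks_like_python_code_cell(cell: dict) -> bool:
    """Heuristic: code cell, not marked as another language, no cell magic."""
    if cell.get("cell_type") != "code":
        return False
    # VS Code records a per-cell language under `metadata.vscode.languageId`.
    language = cell.get("metadata", {}).get("vscode", {}).get("languageId")
    if language is not None and language != "python":
        return False
    source = cell.get("source", "")
    lines = source.splitlines() if isinstance(source, str) else source
    return not any(line.lstrip().startswith("%%") for line in lines)


python_cell = {"cell_type": "code", "metadata": {}, "source": ["import os\n"]}
js_cell = {
    "cell_type": "code",
    "metadata": {"vscode": {"languageId": "javascript"}},
    "source": ["console.log(1)\n"],
}
print(looks_like_python_code_cell(python_cell), looks_like_python_code_cell(js_cell))
```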

### References:
* https://code.visualstudio.com/docs/languages/identifiers
* e6c009a3d4/extensions/ipynb/src/serializers.ts (L104-L107)
*
e6c009a3d4/extensions/ipynb/src/serializers.ts (L117-L122)

This brings us one step closer to fixing #12281.

## Test Plan

Add test cases for `is_valid_python_code_cell` and an integration test
case which showcase running it end to end. The test notebook contains a
JavaScript code cell and a Python code cell.
2024-08-13 22:09:56 +05:30
Dhruv Manilawala
88a4cc41f7
Disable auto-fix when source has syntax errors (#12134)
## Summary

This PR updates Ruff to **not** generate auto-fixes if the source code
contains syntax errors as determined by the parser.

The main motivation behind this is to avoid an infinite autofix loop
when the token-based rules are run over any source with syntax errors in
#11950.

Although even after this, it's not certain that there won't be an
infinite autofix loop because the logic might be incorrect. For example,
https://github.com/astral-sh/ruff/issues/12094 and
https://github.com/astral-sh/ruff/pull/12136.

This requires updating the test infrastructure to not validate for fix
availability status when the source contained syntax errors. This is
required because otherwise the fuzzer might fail as it uses the test
function to run the linter and validate the source code.

resolves: #11455 

## Test Plan

`cargo insta test`
2024-07-02 14:22:51 +05:30
Dhruv Manilawala
72b6c26101
Simplify LinterResult, avoid cloning ParseError (#11903)
## Summary

Follow-up to #11902

This PR simplifies the `LinterResult` struct by removing the generic and
no longer storing the `ParseError`.

This is possible because the callers already have access to the
`ParseError` via the `Parsed` output. This also means that we can
simplify the return type of `check_path` and avoid the generic `T` on
`LinterResult`.

## Test Plan

`cargo insta test`
2024-06-27 13:44:11 +02:00
Dhruv Manilawala
73851e73ab
Avoid displaying syntax error as log message (#11902)
## Summary

Follow-up to #11901 

This PR avoids displaying syntax errors as a log message now that the
`E999` diagnostic cannot be disabled.

For context on why this was added, refer to
https://github.com/astral-sh/ruff/pull/2505. Basically, we would allow
ignoring the syntax error diagnostic because certain syntax features,
like the `match` statement, weren't supported back then. And, if a user
ignored `E999`, Ruff would give no feedback if the source code contained
any syntax error. So, this log message was a way to indicate the error to
the user even if `E999` was disabled.

The current state of the parser is such that (a) it matches with the
latest grammar and (b) it's easy to add support for any new syntax.

**Note:** This PR doesn't remove the `DisplayParseError` struct because
it's still being used by the formatter.

## Test Plan

Update existing snapshots from the integration tests.
2024-06-27 13:44:11 +02:00
Dhruv Manilawala
e7b49694a7
Remove E999 as a rule, disallow any disablement methods for syntax error (#11901)
## Summary

This PR updates the way syntax errors are handled throughout the linter.

The main change is that a syntax error is no longer considered a rule,
which involves the following changes:
* Update `Message` to be an enum with two variants - one for diagnostic
message and the other for syntax error message
* Provide methods on the new message enum to query information required
by downstream usages
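
The two-variant shape can be sketched in Python as a union of two small classes (the real type is a Rust enum). All class, field, and function names below are hypothetical; the point is only that a syntax error has no rule code, so a serializer can emit `null` for it.

```python
import json
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class DiagnosticMessage:
    code: str
    body: str


@dataclass(frozen=True)
class SyntaxErrorMessage:
    body: str


Message = Union[DiagnosticMessage, SyntaxErrorMessage]


def rule_code(message: Message) -> Optional[str]:
    """Query helper for downstream users: only diagnostics carry a code."""
    if isinstance(message, DiagnosticMessage):
        return message.code
    return None


def to_json(message: Message) -> str:
    return json.dumps({"code": rule_code(message), "message": message.body})


print(to_json(DiagnosticMessage("F401", "`abc` imported but unused")))
print(to_json(SyntaxErrorMessage("Expected an expression")))
```

Because there is no code to select or ignore, nothing in the rule-selection machinery can hide the second kind of message.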

This means that the syntax errors cannot be hidden / disabled via any
disablement methods. These are:
1. Configuration via `select`, `ignore`, `per-file-ignores`, and their
`extend-*` variants
	```console
$ cargo run -- check ~/playground/ruff/src/lsp.py --extend-select=E999
--no-preview --no-cache
	    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.10s
Running `target/debug/ruff check /Users/dhruv/playground/ruff/src/lsp.py
--extend-select=E999 --no-preview --no-cache`
warning: Rule `E999` is deprecated and will be removed in a future
release. Syntax errors will always be shown regardless of whether this
rule is selected or not.
/Users/dhruv/playground/ruff/src/lsp.py:1:8: F401 [*] `abc` imported but
unused
	  |
	1 | import abc
	  |        ^^^ F401
	2 | from pathlib import Path
	3 | import os
	  |
	  = help: Remove unused import: `abc`
	```
3. Command-line flags via `--select`, `--ignore`, `--per-file-ignores`,
and their `--extend-*` variants
	```console
$ cargo run -- check ~/playground/ruff/src/lsp.py --no-cache
--config=~/playground/ruff/pyproject.toml
	    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.11s
Running `target/debug/ruff check /Users/dhruv/playground/ruff/src/lsp.py
--no-cache --config=/Users/dhruv/playground/ruff/pyproject.toml`
warning: Rule `E999` is deprecated and will be removed in a future
release. Syntax errors will always be shown regardless of whether this
rule is selected or not.
/Users/dhruv/playground/ruff/src/lsp.py:1:8: F401 [*] `abc` imported but
unused
	  |
	1 | import abc
	  |        ^^^ F401
	2 | from pathlib import Path
	3 | import os
	  |
	  = help: Remove unused import: `abc`
	```

This also means that the **output format** needs to be updated:
1. The `code`, `noqa_row`, and `url` fields in the JSON output are optional
(`null` for syntax errors)
2. Other formats are changed accordingly

For each format, a new test case specific to syntax errors has been
added. Please refer to the snapshot output for the exact format of the
syntax error message.

The output of the `--statistics` flag will have a blank entry for syntax
errors:
```
315     F821    [ ] undefined-name
119             [ ] syntax-error
103     F811    [ ] redefined-while-unused
```

The **language server** is updated to consider the syntax errors by
converting them into the LSP diagnostic format separately.

### Preview

There are no quick fixes provided to disable syntax errors. This will
automatically work for `ruff-lsp` because the `noqa_row` field will be
`null` in that case.
<img width="772" alt="Screenshot 2024-06-26 at 14 57 08"
src="aaac827e-4777-4ac8-8c68-eaf9f2c36774">

Even with `noqa` comment, the syntax error is displayed:
<img width="763" alt="Screenshot 2024-06-26 at 14 59 51"
src="ba1afb68-7eaf-4b44-91af-6d93246475e2">

Rule documentation page:
<img width="1371" alt="Screenshot 2024-06-26 at 16 48 07"
src="524f01df-d91f-4ac0-86cc-40e76b318b24">


## Test Plan

- [x] Disablement methods via config shows a warning
	- [x] `select`, `extend-select`
	- [ ] ~`ignore`~ _doesn't show any message_
- [ ] ~`per-file-ignores`, `extend-per-file-ignores`~ _doesn't show any
message_
- [x] Disablement methods via command-line flag shows a warning
	- [x] `--select`, `--extend-select`
	- [ ] ~`--ignore`~ _doesn't show any message_
- [ ] ~`--per-file-ignores`, `--extend-per-file-ignores`~ _doesn't show
any message_
- [x] File with syntax errors should exit with code 1
- [x] Language server
	- [x] Should show diagnostics for syntax errors
	- [x] Should not recommend a quick fix edit for adding `noqa` comment
	- [x] Same for `ruff-lsp`

resolves: #8447
2024-06-27 13:44:11 +02:00
T-256
d6a2cad9c2
Drop deprecated nursery rule group (#10172)
Co-authored-by: Micha Reiser <micha@reiser.io>
Resolves https://github.com/astral-sh/ruff/issues/7992
2024-06-27 13:44:11 +02:00
Dhruv Manilawala
b617d90651
Update E999 to show all syntax errors (#11900)
## Summary

This PR updates the linter to show all the parse errors as diagnostics
instead of just the first one.

Note that this doesn't affect the parse error displayed as error log
message. This will be removed in a follow-up PR.

### Breaking?

I don't think this is a breaking change even though this might give more
diagnostics. The main reason is that this shouldn't affect any users
because it'll only give additional diagnostics in the case of multiple
syntax errors.

## Test Plan

Add an integration test case which would raise more than one parse
error.
2024-06-19 13:09:54 +05:30
Dhruv Manilawala
549cc1e437
Build CommentRanges outside the parser (#11792)
## Summary

This PR updates the parser to remove building the `CommentRanges` and
instead it'll be built by the linter and the formatter when it's
required.

For the linter, it'll be built and owned by the `Indexer` while for the
formatter it'll be built from the `Tokens` struct and passed as an
argument.

## Test Plan

`cargo insta test`
2024-06-09 09:55:17 +00:00
Dhruv Manilawala
d22f3402e1
Remove result_like dependency (#11793)
## Summary

This PR removes the `result-like` dependency and instead implements the
required functionality. The motivation is that `noqa.is_enabled()` is
easier to read than `noqa.into()`.

For context, I was just trying to understand the syntax error workflow
and I saw these flags which were being converted via `into`. I always
find `into` confusing because you never know what it's being converted
into unless you know the type. I later realized that it's just a boolean
flag. After removing the usages from these two flags, it turned out that
the dependency was only being used in one rule, so I decided to remove
that as well.

## Test Plan

`cargo insta test`
2024-06-07 11:53:22 +05:30
Dhruv Manilawala
bf5b62edac
Maintain synchronicity between the lexer and the parser (#11457)
## Summary

This PR updates the entire parser stack in multiple ways:

### Make the lexer lazy

* https://github.com/astral-sh/ruff/pull/11244
* https://github.com/astral-sh/ruff/pull/11473

Previously, Ruff's lexer would act as an iterator. The parser would
collect all the tokens in a vector first and then process the tokens to
create the syntax tree.

The first task in this project is to update the entire parsing flow to
make the lexer lazy. This includes the `Lexer`, `TokenSource`, and
`Parser`. For context, the `TokenSource` is a wrapper around the `Lexer`
to filter out the trivia tokens[^1]. Now, the parser will ask the token
source to get the next token and only then the lexer will continue and
emit the token. This means that the lexer needs to be aware of the
"current" token. When the `next_token` is called, the current token will
be updated with the newly lexed token.
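
The pull-based flow can be sketched in Python with a toy lexer. The token kinds, the regular expression, and the `lexed` counter are all invented for this example, and only comments are treated as trivia here.

```python
import re
from typing import Tuple

Token = Tuple[str, str]  # (kind, text)

# Toy grammar: comments, word-like tokens, or any single non-space character.
TOKEN_RE = re.compile(r"\s*(?:(#[^\n]*)|(\w+)|(\S))")


class LazyLexer:
    """Lexes one token per `next_token` call and remembers the current one."""

    def __init__(self, source: str) -> None:
        self.source = source
        self.offset = 0
        self.current: Token = ("Start", "")
        self.lexed = 0  # how many tokens have been produced so far

    def next_token(self) -> Token:
        match = TOKEN_RE.match(self.source, self.offset)
        if match is None:
            self.current = ("EndOfFile", "")
            return self.current
        self.offset = match.end()
        comment, word, other = match.groups()
        if comment is not None:
            self.current = ("Comment", comment)
        elif word is not None:
            self.current = ("Name", word)
        else:
            self.current = ("Other", other)
        self.lexed += 1
        return self.current


class TokenSource:
    """Wraps the lexer and hides trivia tokens from the parser."""

    def __init__(self, lexer: LazyLexer) -> None:
        self.lexer = lexer

    def next_token(self) -> Token:
        token = self.lexer.next_token()
        while token[0] == "Comment":
            token = self.lexer.next_token()
        return token


lexer = LazyLexer("x = 1 # note\ny")
tokens = TokenSource(lexer)
print(tokens.next_token(), lexer.lexed)
```

After the first request only one token has been lexed; the rest of the source is untouched until the parser asks again.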

The main motivation to make the lexer lazy is to allow re-lexing a token
in a different context. This is going to be really useful for making the
parser error resilient. For example, currently the emitted tokens
remain the same even if the parser can recover from an unclosed
parenthesis. This is important because the lexer emits a
`NonLogicalNewline` in a parenthesized context but a normal `Newline` in
a non-parenthesized context. These different kinds of newlines are also
used to emit the indentation tokens, which are important for the parser
as they're used to determine the start and end of a block.

Additionally, this allows us to implement the following functionalities:
1. Checkpoint - rewind infrastructure: The idea here is to create a
checkpoint and continue lexing. At a later point, this checkpoint can be
used to rewind the lexer back to the provided checkpoint.
2. Remove the `SoftKeywordTransformer` and instead use lookahead or
speculative parsing to determine whether a soft keyword is a keyword or
an identifier
3. Remove the `Tok` enum. The `Tok` enum represents the tokens emitted
by the lexer but it contains owned data which makes it expensive to
clone. The new `TokenKind` enum just represents the type of token which
is very cheap.

This raises the question of how the parser will get the owned value
that was stored on `Tok`. This will be solved by introducing a new
`TokenValue` enum which only covers the subset of token kinds that carry
an owned value. This is stored on the lexer and is requested by the
parser when it wants to process the data. For example:
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L1260-L1262)

[^1]: Trivia tokens are `NonLogicalNewline` and `Comment`

### Remove `SoftKeywordTransformer`

* https://github.com/astral-sh/ruff/pull/11441
* https://github.com/astral-sh/ruff/pull/11459
* https://github.com/astral-sh/ruff/pull/11442
* https://github.com/astral-sh/ruff/pull/11443
* https://github.com/astral-sh/ruff/pull/11474

For context,
https://github.com/RustPython/RustPython/pull/4519/files#diff-5de40045e78e794aa5ab0b8aacf531aa477daf826d31ca129467703855408220
added support for soft keywords in the parser which uses infinite
lookahead to classify a soft keyword as a keyword or an identifier. This
is a brilliant idea as it basically wraps the existing Lexer and works
on top of it which means that the logic for lexing and re-lexing a soft
keyword remains separate. The change here is to remove
`SoftKeywordTransformer` and let the parser determine this based on
context, lookahead and speculative parsing.

* **Context:** The transformer needs to know whether the lexer is at a
statement position or a simple statement position.
This is because a `match` token starts a compound statement while a
`type` token starts a simple statement. **The parser already knows
this.**
* **Lookahead:** Now that the parser knows the context it can perform
lookahead of up to two tokens to classify the soft keyword. The logic
for this is mentioned in the PR implementing it for the `type` and
`match` soft keywords.
* **Speculative parsing:** This is where the checkpoint - rewind
infrastructure helps. For `match` soft keyword, there are certain cases
for which we can't classify based on lookahead. The idea here is to
create a checkpoint and keep parsing. Based on whether the parsing was
successful and what tokens are ahead we can classify the remaining
cases. Refer to #11443 for more details.
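
The three ingredients above can be sketched with a toy parser. This is
a heavily simplified illustration under stated assumptions, not the
logic from #11443: the token set is tiny, and the names `Checkpoint`,
`classify_match` and `skip_parenthesized` are hypothetical. It only
shows how a single-token lookahead settles the easy cases and how a
checkpoint plus rewind settles `match (x):` versus `match(x)`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Match,
    Name,
    Equal,
    Lpar,
    Rpar,
    Colon,
    Newline,
    EndOfFile,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Classification {
    Keyword,
    Identifier,
}

/// Saved parser state; here just the token position.
pub struct Checkpoint {
    position: usize,
}

pub struct Parser<'a> {
    tokens: &'a [TokenKind],
    position: usize,
}

impl<'a> Parser<'a> {
    pub fn new(tokens: &'a [TokenKind]) -> Self {
        Self { tokens, position: 0 }
    }

    pub fn position(&self) -> usize {
        self.position
    }

    /// Returns the token `n` positions ahead without consuming it.
    fn peek(&self, n: usize) -> TokenKind {
        self.tokens
            .get(self.position + n)
            .copied()
            .unwrap_or(TokenKind::EndOfFile)
    }

    fn bump(&mut self) {
        self.position += 1;
    }

    fn checkpoint(&self) -> Checkpoint {
        Checkpoint { position: self.position }
    }

    fn rewind(&mut self, checkpoint: Checkpoint) {
        self.position = checkpoint.position;
    }

    /// Consumes a balanced `( ... )` group; returns false if unbalanced.
    fn skip_parenthesized(&mut self) -> bool {
        if self.peek(0) != TokenKind::Lpar {
            return false;
        }
        let mut depth = 0usize;
        loop {
            match self.peek(0) {
                TokenKind::Lpar => depth += 1,
                TokenKind::Rpar => depth -= 1,
                TokenKind::EndOfFile => return false,
                _ => {}
            }
            self.bump();
            if depth == 0 {
                return true;
            }
        }
    }

    /// Classifies the `match` token at the current position.
    pub fn classify_match(&mut self) -> Classification {
        match self.peek(1) {
            // Lookahead is enough: `match x:` starts a match statement.
            TokenKind::Name | TokenKind::Match => Classification::Keyword,
            // Ambiguous: `match (x):` is a statement, `match(x)` is a call.
            TokenKind::Lpar => {
                let checkpoint = self.checkpoint();
                self.bump(); // consume `match`
                let balanced = self.skip_parenthesized();
                let is_keyword = balanced && self.peek(0) == TokenKind::Colon;
                self.rewind(checkpoint); // speculative: restore the position
                if is_keyword {
                    Classification::Keyword
                } else {
                    Classification::Identifier
                }
            }
            // e.g. `match = 1`: only valid as an identifier.
            _ => Classification::Identifier,
        }
    }
}
```

Because `classify_match` always rewinds, the caller can re-parse from
the same position once the classification is known.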

If the soft keyword is being parsed in an identifier context, it'll be
converted to an identifier and the emitted token will be updated as
well. Refer to
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L487-L491).

The `case` soft keyword doesn't require any special handling because
it'll be a keyword only in the context of a match statement.

### Update the parser API

* https://github.com/astral-sh/ruff/pull/11494
* https://github.com/astral-sh/ruff/pull/11505

Now that the lexer is in sync with the parser, and the parser helps to
determine whether a soft keyword is a keyword or an identifier, the
lexer cannot be used on its own. The reason is that the lexer is not
sensitive to the context (which is the correct design). This means that
the parser API needs to be updated to not allow any access to the lexer.

Previously, there were multiple ways to parse the source code:
1. Passing the source code itself
2. Or, passing the tokens

Now that the lexer and parser are working together, the API
corresponding to (2) cannot exist. The final API is mentioned in this
PR description: https://github.com/astral-sh/ruff/pull/11494.

### Refactor the downstream tools (linter and formatter)

* https://github.com/astral-sh/ruff/pull/11511
* https://github.com/astral-sh/ruff/pull/11515
* https://github.com/astral-sh/ruff/pull/11529
* https://github.com/astral-sh/ruff/pull/11562
* https://github.com/astral-sh/ruff/pull/11592

The final set of changes involves updating all references to the lexer
and the `Tok` enum. This was done in two parts:
1. Update all the references in a way that doesn't require any changes
from this PR i.e., it can be done independently
	* https://github.com/astral-sh/ruff/pull/11402
	* https://github.com/astral-sh/ruff/pull/11406
	* https://github.com/astral-sh/ruff/pull/11418
	* https://github.com/astral-sh/ruff/pull/11419
	* https://github.com/astral-sh/ruff/pull/11420
	* https://github.com/astral-sh/ruff/pull/11424
2. Update all the remaining references to use the changes made in this
PR

For (2), there were various strategies used:
1. Introduce a new `Tokens` struct which wraps the token vector and adds
methods to query a certain subset of tokens. These include:
	1. `up_to_first_unknown` which replaces the `tokenize` function
	2. `in_range` and `after` which replace the `lex_starts_at` function
	where the former returns the tokens within the given range while the
	latter returns all the tokens after the given offset
2. Introduce a new `TokenFlags` which is a set of flags to query certain
information from a token. Currently, this information is only limited to
any string type token but can be expanded to include other information
in the future as needed. https://github.com/astral-sh/ruff/pull/11578
3. Move the `CommentRanges` to the parsed output because this
information is common to both the linter and the formatter. This removes
the need for the `tokens_and_ranges` function.
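
To make the query methods from strategy (1) concrete, here is a minimal
sketch of such a wrapper. Only the method names come from the
description above. The field names, the use of `std::ops::Range<usize>`
for offsets, and the lenient handling of offsets that fall inside a
token are assumptions for illustration, and the real implementation may
behave differently.

```rust
use std::ops::Range;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Name,
    Equal,
    Int,
    Unknown,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Token {
    pub kind: TokenKind,
    pub range: Range<usize>,
}

/// Wraps the token vector; tokens are sorted and non-overlapping.
pub struct Tokens {
    raw: Vec<Token>,
}

impl Tokens {
    pub fn new(raw: Vec<Token>) -> Self {
        Self { raw }
    }

    /// Tokens before the first `Unknown` token (all tokens if none).
    pub fn up_to_first_unknown(&self) -> &[Token] {
        let end = self
            .raw
            .iter()
            .position(|token| token.kind == TokenKind::Unknown)
            .unwrap_or(self.raw.len());
        &self.raw[..end]
    }

    /// Tokens starting at or after `offset`, found by binary search.
    pub fn after(&self, offset: usize) -> &[Token] {
        let start = self.raw.partition_point(|token| token.range.start < offset);
        &self.raw[start..]
    }

    /// Tokens fully contained in `range`, found by binary search.
    pub fn in_range(&self, range: Range<usize>) -> &[Token] {
        let start = self.raw.partition_point(|token| token.range.start < range.start);
        let end = self.raw.partition_point(|token| token.range.end <= range.end);
        // `max` avoids a panic when `range` lies inside a single token.
        &self.raw[start..end.max(start)]
    }
}
```

Since the tokens are ordered by offset, both range queries are binary
searches that return slices, so no re-lexing or copying is needed.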

## Test Plan

- [x] Update and verify the test snapshots
- [x] Make sure the entire test suite is passing
- [x] Make sure there are no changes in the ecosystem checks
- [x] Run the fuzzer on the parser
- [x] Run this change on dozens of open-source projects

### Running this change on dozens of open-source projects

Refer to the PR description to get the list of open source projects used
for testing.

Now, the following tests were done between `main` and this branch:
1. Compare the output of `--select=E999` (syntax errors)
2. Compare the output of default rule selection
3. Compare the output of `--select=ALL`

**Conclusion: all outputs were the same**

## What's next?

The next step is to introduce re-lexing logic and update the parser to
feed the recovery information to the lexer so that it can emit the
correct token. This moves us one step closer to having error resilience
in the parser and gives Ruff the ability to lint even if the source code
contains syntax errors.
2024-06-03 18:23:50 +05:30
Micha Reiser
921bc15542
use owned ast and tokens in bench (#11598) 2024-05-29 18:10:32 +02:00
Dhruv Manilawala
a33763170e
Use TokenKind in doc_lines_from_tokens (#11418)
## Summary

This PR updates the `doc_lines_from_tokens` function to use `TokenKind`
instead of `Tok`.

This is part of #11401 

## Test Plan

`cargo test`
2024-05-14 16:56:14 +00:00
Dhruv Manilawala
025768d303
Add Tokens newtype wrapper, TokenKind iterator (#11361)
## Summary

Alternative to #11237 

This PR adds a new `Tokens` struct which is a newtype wrapper around a
vector of lexer output. This allows us to add a `kinds` method which
returns an iterator over the corresponding `TokenKind`. This iterator is
implemented as a separate `TokenKindIter` struct to allow using the type
and provide additional methods like `peek` directly on the iterator.

This gives the linter access to a stream of `TokenKind` instead of
`Tok`.
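
A minimal sketch of this shape, assuming a simplified lexer output of
`(Tok, Range<usize>)` pairs rather than the real result type, could look
like the following. The `Tok` variants and the `kind` helper are
hypothetical.

```rust
use std::iter::Peekable;
use std::ops::Range;
use std::slice::Iter;

/// Simplified token with owned data (hypothetical variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Tok {
    Name(String),
    Int(i64),
    Plus,
}

/// Data-free counterpart of `Tok`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Name,
    Int,
    Plus,
}

impl Tok {
    pub fn kind(&self) -> TokenKind {
        match self {
            Tok::Name(_) => TokenKind::Name,
            Tok::Int(_) => TokenKind::Int,
            Tok::Plus => TokenKind::Plus,
        }
    }
}

/// Newtype wrapper around the lexer output.
pub struct Tokens(Vec<(Tok, Range<usize>)>);

impl Tokens {
    pub fn new(tokens: Vec<(Tok, Range<usize>)>) -> Self {
        Self(tokens)
    }

    /// Iterates over kinds and ranges without exposing `Tok`.
    pub fn kinds(&self) -> TokenKindIter<'_> {
        TokenKindIter { inner: self.0.iter().peekable() }
    }
}

/// A named iterator type, so callers can store it and call `peek`.
pub struct TokenKindIter<'a> {
    inner: Peekable<Iter<'a, (Tok, Range<usize>)>>,
}

impl TokenKindIter<'_> {
    /// Looks at the next item without advancing the iterator.
    pub fn peek(&mut self) -> Option<(TokenKind, Range<usize>)> {
        self.inner.peek().map(|item| (item.0.kind(), item.1.clone()))
    }
}

impl Iterator for TokenKindIter<'_> {
    type Item = (TokenKind, Range<usize>);

    fn next(&mut self) -> Option<Self::Item> {
        self.inner.next().map(|item| (item.0.kind(), item.1.clone()))
    }
}
```

A dedicated struct, as opposed to returning `impl Iterator`, is what
makes it possible to name the type in signatures and to hang extra
methods such as `peek` on it.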

Edit: I've made the necessary downstream changes and plan to merge the
entire stack at once.
2024-05-14 16:45:04 +00:00
Micha Reiser
64700d296f
Remove ImportMap (#11234)
## Summary

This PR removes the `ImportMap` implementation and all its routing
through ruff.

The import map was added in https://github.com/astral-sh/ruff/pull/3243
but we then never ended up using it to do cross file analysis.

We are now working on adding multifile analysis to ruff, and will
revisit import resolution as part of it.


```
hyperfine --warmup 10 --runs 20 --setup "./target/release/ruff clean" \
              "./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I" \
              "./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I" 
Benchmark 1: ./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
  Time (mean ± σ):      37.6 ms ±   0.9 ms    [User: 52.2 ms, System: 63.7 ms]
  Range (min … max):    35.8 ms …  39.8 ms    20 runs
 
Benchmark 2: ./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
  Time (mean ± σ):      36.0 ms ±   0.7 ms    [User: 50.3 ms, System: 58.4 ms]
  Range (min … max):    34.5 ms …  37.6 ms    20 runs
 
Summary
  ./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I ran
    1.04 ± 0.03 times faster than ./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
```

I suspect that the performance improvement should be even more
significant for users that otherwise don't have any diagnostics.


```
hyperfine --warmup 10 --runs 20 --setup "cd ../ecosystem/airflow && ../../ruff/target/release/ruff clean" \
              "./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I" \
              "./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I" 
Benchmark 1: ./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I
  Time (mean ± σ):      53.7 ms ±   1.8 ms    [User: 68.4 ms, System: 63.0 ms]
  Range (min … max):    51.1 ms …  58.7 ms    20 runs
 
Benchmark 2: ./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I
  Time (mean ± σ):      50.8 ms ±   1.4 ms    [User: 50.7 ms, System: 60.9 ms]
  Range (min … max):    48.5 ms …  55.3 ms    20 runs
 
Summary
  ./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I ran
    1.06 ± 0.05 times faster than ./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I

```

## Test Plan

`cargo test`
2024-05-02 11:26:02 -07:00
Micha Reiser
5561d445d7
linter: Enable test-rules for test build (#11201) 2024-04-30 08:06:47 +02:00
Charlie Marsh
d544199272
Respect per-file-ignores for RUF100 with no other diagnostics (#11058)
## Summary

The existing test didn't cover the case in which there are _no_ other
diagnostics in the file.

Closes https://github.com/astral-sh/ruff/issues/10906.
2024-04-20 15:33:22 +00:00
Auguste Lalande
c746912b9e
[pycodestyle] Implement redundant-backslash (E502) (#10292)
## Summary

Implements the
[redundant-backslash](https://pycodestyle.pycqa.org/en/latest/intro.html#error-codes)
rule (E502) from pycodestyle.

## Test Plan

New fixture has been added

Part of #2402
2024-03-11 21:15:06 -04:00
Charlie Marsh
7515196245
Respect external codes in file-level exemptions (#10203)
We shouldn't warn when an "external" code is used in a file-level
exemption.

Closes https://github.com/astral-sh/ruff/issues/10202.
2024-03-03 00:20:36 +00:00
Hoël Bagard
9027169125
[pycodestyle] Add blank line(s) rules (E301, E302, E303, E304, E305, E306) (#9266)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-02-08 18:35:08 +00:00
Zanie Blue
0d752e56cd
Add tests for redirected rules (#9754)
Extends https://github.com/astral-sh/ruff/pull/9752 adding internal test
rules for redirection

Fixes a bug where we did not see warnings for exact codes that are
redirected (just prefixes)
2024-02-01 13:35:02 -06:00
Zanie Blue
46c0937bfa
Use fake rules for testing deprecation and removal infrastructure (#9752)
Updates #9689 and #9691 to use rule testing infrastructure from #9747
2024-02-01 13:35:02 -06:00
Zanie Blue
f18e7d40ac
Add internal hidden rules for testing (#9747)
Updated implementation of https://github.com/astral-sh/ruff/pull/7369
which was left out in the cold.

This was motivated again following changes in #9691 and #9689 where we
could not test the changes without actually deprecating or removing
rules.

---

Follow-up to discussion in https://github.com/astral-sh/ruff/pull/7210

Moves integration tests from using rules that are transitively in
nursery / preview groups to dedicated test rules that only exist during
development. These rules always raise violations (they do not require
specific file behavior). The rules are not available in production or in
the documentation.

Uses features instead of `cfg(test)` for cross-crate support per
https://github.com/rust-lang/cargo/issues/8379
2024-02-01 08:44:51 -06:00
Charlie Marsh
bea8f2ee3a
Detect automagic-like assignments in notebooks (#9653)
## Summary

Given a statement like `colors = 6`, we currently treat the cell as an
automagic (since `colors` is an automagic) -- i.e., we assume it's
equivalent to `%colors = 6`. This PR adds some additional detection
whereby if the statement is an _assignment_, we avoid treating it as
such. I audited the list of automagics, and I believe this is safe for
all of them.
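
The check can be pictured roughly as follows. This is a sketch under
assumptions, not the code from this PR: the `AUTOMAGICS` list is a small
illustrative subset, `is_automagic_line` is a hypothetical name, and the
whitespace-based splitting is much cruder than real tokenization.

```rust
/// Illustrative subset of IPython automagic names (an assumption).
const AUTOMAGICS: &[&str] = &["cd", "colors", "ls", "pwd"];

/// Returns true if `line` should be treated as an automagic command.
pub fn is_automagic_line(line: &str) -> bool {
    let mut words = line.split_whitespace();
    let Some(first) = words.next() else {
        return false; // empty line
    };
    if !AUTOMAGICS.contains(&first) {
        return false;
    }
    // `colors = 6` is an assignment to a variable named `colors`,
    // so it must not be treated as `%colors = 6`.
    match words.next() {
        Some(next) if next.starts_with('=') && !next.starts_with("==") => false,
        _ => true,
    }
}
```

For example, `is_automagic_line("colors linux")` is true while
`is_automagic_line("colors = 6")` is false.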

Closes https://github.com/astral-sh/ruff/issues/8526.

Closes https://github.com/astral-sh/ruff/issues/9648.

## Test Plan

`cargo test`
2024-01-29 12:55:44 +00:00
Charlie Marsh
328262bfac
Add cell indexes to all diagnostics (#9387)
## Summary

Ensures that any lint rules that include line locations render them as
relative to the cell (and include the cell number) when inside a Jupyter
notebook.

Closes https://github.com/astral-sh/ruff/issues/6672.

## Test Plan

`cargo test`
2024-01-04 14:02:23 +00:00