Commit graph

6669 commits

Author SHA1 Message Date
Charlie Marsh
6fffde72e7
Use memchr for string lexing (#9888)
## Summary

On `main`, string lexing consists of walking through the string
character-by-character to search for the closing quote (with some
nuance: we also need to skip escaped characters, and error if we see
newlines in non-triple-quoted strings). This PR rewrites `lex_string` to
instead use `memchr` to search for the closing quote, which is
significantly faster. On my machine, at least, the `globals.py`
benchmark (which contains a lot of docstrings) gets 40% faster...

```text
lexer/numpy/globals.py  time:   [3.6410 µs 3.6496 µs 3.6585 µs]
                        thrpt:  [806.53 MiB/s 808.49 MiB/s 810.41 MiB/s]
                 change:
                        time:   [-40.413% -40.185% -39.984%] (p = 0.00 < 0.05)
                        thrpt:  [+66.623% +67.181% +67.822%]
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
lexer/unicode/pypinyin.py
                        time:   [12.422 µs 12.445 µs 12.467 µs]
                        thrpt:  [337.03 MiB/s 337.65 MiB/s 338.27 MiB/s]
                 change:
                        time:   [-9.4213% -9.1930% -8.9586%] (p = 0.00 < 0.05)
                        thrpt:  [+9.8401% +10.124% +10.401%]
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe
lexer/pydantic/types.py time:   [107.45 µs 107.50 µs 107.56 µs]
                        thrpt:  [237.11 MiB/s 237.24 MiB/s 237.35 MiB/s]
                 change:
                        time:   [-4.0108% -3.7005% -3.3787%] (p = 0.00 < 0.05)
                        thrpt:  [+3.4968% +3.8427% +4.1784%]
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe
lexer/numpy/ctypeslib.py
                        time:   [46.123 µs 46.165 µs 46.208 µs]
                        thrpt:  [360.36 MiB/s 360.69 MiB/s 361.01 MiB/s]
                 change:
                        time:   [-19.313% -18.996% -18.710%] (p = 0.00 < 0.05)
                        thrpt:  [+23.016% +23.451% +23.935%]
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) low mild
  1 (1.00%) high mild
  4 (4.00%) high severe
lexer/large/dataset.py  time:   [231.07 µs 231.19 µs 231.33 µs]
                        thrpt:  [175.87 MiB/s 175.97 MiB/s 176.06 MiB/s]
                 change:
                        time:   [-2.0437% -1.7663% -1.4922%] (p = 0.00 < 0.05)
                        thrpt:  [+1.5148% +1.7981% +2.0864%]
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  5 (5.00%) high mild
  5 (5.00%) high severe
```
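
As a rough illustration of the approach described above, here is a minimal Python sketch. It is not Ruff's `lex_string`: the function name `find_closing_quote` is hypothetical, and `str.find` stands in for `memchr`. The idea is to jump straight to the next byte of interest (quote, backslash, or newline) instead of inspecting every character.

```python
def find_closing_quote(source, start, quote='"', triple=False):
    """Return the index of the closing quote, or -1 if the string is unterminated.

    Hypothetical helper: `start` is the index just past the opening quote(s).
    """
    closing = quote * 3 if triple else quote
    pos = start
    while True:
        # Jump to the next quote or backslash instead of walking char-by-char.
        hits = [i for i in (source.find(quote, pos), source.find("\\", pos)) if i != -1]
        if not triple:
            # Newlines are only an error in non-triple-quoted strings.
            newline = source.find("\n", pos)
            if newline != -1:
                hits.append(newline)
        if not hits:
            return -1
        i = min(hits)
        if source[i] == "\\":
            pos = i + 2  # skip the escaped character
        elif source[i] == "\n":
            return -1  # newline inside a single-line string
        elif source.startswith(closing, i):
            return i
        else:
            pos = i + 1  # a lone quote inside a triple-quoted string
```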
2024-02-08 17:23:06 +00:00
Jane Lewis
ad313b9089
RUF027 no longer has false negatives with string literals inside of method calls (#9865)
Fixes #9857.

## Summary

Statements like `logging.info("Today it is: {day}")` will no longer be
ignored by RUF027. As before, statements like `"Today it is:
{day}".format(day="Tuesday")` will continue to be ignored.

## Test Plan

The snapshot tests were expanded to include new cases. Additionally, the
snapshot tests have been split in two to separate positive cases from
negative cases.
2024-02-08 10:00:20 -05:00
Charlie Marsh
f76a3e8502
Detect mark_safe usages in decorators (#9887)
## Summary

Django's `mark_safe` can also be used as a decorator, so we should
detect usages of `@mark_safe` for the purpose of the relevant Bandit
rule.
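
A minimal sketch of what detecting both forms could look like, using Python's `ast` module rather than Ruff's own checker. The helper names (`mark_safe_usages`, `_is_mark_safe`) are hypothetical, and the name match is deliberately naive (it does not resolve imports).

```python
import ast


def _is_mark_safe(node):
    # Naive match on `mark_safe` or `something.mark_safe`; no import resolution.
    if isinstance(node, ast.Name):
        return node.id == "mark_safe"
    if isinstance(node, ast.Attribute):
        return node.attr == "mark_safe"
    return False


def mark_safe_usages(source):
    """Return sorted line numbers of `mark_safe(...)` calls and `@mark_safe` decorators."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and _is_mark_safe(node.func):
            lines.append(node.lineno)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # A bare decorator is not a Call node, so it needs its own check.
            lines.extend(d.lineno for d in node.decorator_list if _is_mark_safe(d))
    return sorted(lines)
```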

Closes https://github.com/astral-sh/ruff/issues/9780.
2024-02-07 23:10:46 -05:00
Tom Kuson
ed07fa08bd
Fix list formatting in documentation (#9886)
## Summary

Adds a blank line to render the list correctly.

## Test Plan

Ocular inspection
2024-02-07 20:01:21 -05:00
Charlie Marsh
45937426c7
Fix blank-line docstring rules for module-level docstrings (#9878)
## Summary

Given:

```python
"""Make a summary line.

Note:
----
  Per the code comment the next two lines are blank. "// The first blank line is the line containing the closing
      triple quotes, so we need at least two."

"""
```

It turns out we excluded the line ending in `"""`, because it's empty
(unlike for functions, where it consists of the indent). This PR changes
the `following_lines` iterator to always include the trailing newline,
which gives us correct and consistent handling between function and
module-level docstrings.

Closes https://github.com/astral-sh/ruff/issues/9877.
2024-02-07 16:48:28 -05:00
Charlie Marsh
533dcfb114
Add a note regarding ignore-without-code (#9879)
Closes https://github.com/astral-sh/ruff/issues/9863.
2024-02-07 21:20:18 +00:00
Hugo van Kemenade
bc023f47a1
Fix typo in option name: output_format -> output-format (#9874) 2024-02-07 16:17:58 +00:00
Jack McIvor
aa38307415
Add more NPY002 violations (#9862) 2024-02-07 09:54:11 -05:00
Charlie Marsh
e9ddd4819a
Make show-settings filters directory-agnostic (#9866)
Closes https://github.com/astral-sh/ruff/issues/9864.
2024-02-07 03:20:27 +00:00
Micha Reiser
fdb5eefb33
Improve trailing comma rule performance (#9867) 2024-02-06 23:04:36 +00:00
Charlie Marsh
daae28efc7
Respect async with in timeout-without-await (#9859)
Closes https://github.com/astral-sh/ruff/issues/9855.
2024-02-06 12:04:24 -05:00
Charlie Marsh
75553ab1c0
Remove ecosystem failures (#9854)
## Summary

These are kinda disruptive, I'd prefer to TODO unless someone is
interested in solving them ASAP.
2024-02-06 09:45:13 -05:00
Charlie Marsh
c34908f5ad
Use memchr for tab-indentation detection (#9853)
## Summary

The benchmarks show a pretty consistent 1% speedup here for all-rules,
though not enough to trigger our threshold of course.
2024-02-06 09:44:56 -05:00
Charlie Marsh
a662c2447c
Ignore builtins when detecting missing f-strings (#9849)
## Summary

Reported on Discord: if the name maps to a builtin, it's not bound
locally, so is very unlikely to be intended as an f-string expression.
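
A small sketch of that reasoning, assuming a hypothetical helper `is_plausible_fstring_name` and a plain set of locally bound names in place of Ruff's semantic model:

```python
import builtins


def is_plausible_fstring_name(name, local_bindings):
    """Return True if `{name}` in a string plausibly refers to a local variable."""
    if name in local_bindings:
        # A local binding shadows any builtin of the same name.
        return True
    if hasattr(builtins, name):
        # Resolves to a builtin only (e.g. `id`, `type`): unlikely to be intended.
        return False
    return False  # not bound at all
```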
2024-02-05 23:49:56 -05:00
Seo Sanghyeon
df7fb95cbc
Index multiline f-strings (#9837)
Fix #9777.
2024-02-05 21:25:33 -05:00
Adrian
83195a6030
ruff-ecosystem: Add indico/indico repo (#9850)
It's a pretty big codebase using lots of different stuff, so a good
candidate for finding obscure problems.

I didn't look more closely at which options are used (I have the feeling
`--select ALL` is not implied, since I see you adding it via
`check_options` for certain entries but not for others), the repo itself
has a pretty large ruff.toml - but assuming ecosystem just cares about
differences between base and head of a PR, `ALL` most likely makes
sense.
2024-02-06 00:37:58 +00:00
Daniël van Noord
d31d09d7cd
Add `--preview` to instruction for running newly added tests (#9846)
## Summary

This surprised me while working on adding a test. I thought about adding
an additional `note`, but how often is this incorrect? In general,
people reading the contributing guidelines probably want to enable this
flag and those who don't will know enough about the testing setup to
have their own commands/aliases.

## Test Plan

Ran CI on local fork and got an all green.
2024-02-05 19:33:22 -05:00
Tyler C Laprade, CFA
0f436b71f3
Typo in 0.2.1 changelog (#9847)
`refurn` -> `refurb`
2024-02-05 17:51:27 -05:00
Eero Vaher
cd5bcd815d
Mention a related setting in C408 description (#9839)
#2977 added the `allow-dict-calls-with-keyword-arguments` configuration
option for the `unnecessary-collection-call (C408)` rule, but it did not
update the rule description.
2024-02-06 03:57:53 +05:30
Charlie Marsh
0ccca4083a
Bump version to v0.2.1 (#9843) 2024-02-05 15:31:05 -05:00
Charlie Marsh
041ce1e166
Respect generic Protocol in ellipsis removal (#9841)
Closes https://github.com/astral-sh/ruff/issues/9840.
2024-02-05 19:36:16 +00:00
Dhruv Manilawala
36b752876e
Implement AnyNode/AnyNodeRef for FStringFormatSpec (#9836)
## Summary

This PR adds the `AnyNode` and `AnyNodeRef` implementation for the
`FStringFormatSpec` node, which will be required for f-string
formatting.

The main use for this is so that we can pass the node directly to
`suppressed_node` in case a debug expression is used, to format it as
verbatim text.
2024-02-05 19:23:43 +00:00
Micha Reiser
b3dc565473
Add --range option to ruff format (#9733)
Co-authored-by: T-256 <132141463+T-256@users.noreply.github.com>
2024-02-05 19:21:45 +00:00
Thomas M Kehrenberg
e708c08b64
Fix default for max-positional-args (#9838)
## Summary
`max-positional-args` defaults to `max-args` if it's not specified, and
the default for `max-args` is 5, so saying that the default is 3 is
definitely wrong. Ideally, we wouldn't specify a default at all for this
config option, but I don't think that's possible?
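
The fallback described here can be sketched as follows. This is an illustration only: `resolve_limits` and the dictionary-based settings are hypothetical, not Ruff's configuration code; the only values taken from the description are the option names and the `max-args` default of 5.

```python
DEFAULT_MAX_ARGS = 5  # default for `max-args`, per the description above


def resolve_limits(settings):
    """Return (max_args, max_positional_args) for a hypothetical settings dict."""
    max_args = settings.get("max-args", DEFAULT_MAX_ARGS)
    # `max-positional-args` has no fixed default of its own: it follows `max-args`.
    max_positional_args = settings.get("max-positional-args", max_args)
    return max_args, max_positional_args
```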

## Test Plan

Not sure.
2024-02-05 16:58:14 +00:00
Charlie Marsh
73902323d5
Revert "Use publicly available Apple Silicon runners (#9726)" (#9834)
## Summary

Sadly, the Apple Silicon runners use macOS 14 and produce binaries that
segfault when run on macOS 11 (at least), and possibly on macOS 12
and/or macOS 13.

macOS 11 is EOL, but it doesn't seem like a good tradeoff to speed up
our release builds at the expense of user support and compatibility.

This reverts commit f0066e1b89.

Closes https://github.com/astral-sh/ruff/issues/9823.
2024-02-05 11:24:51 -05:00
Charlie Marsh
9781563ef6
Add fast-path for comment detection (#9808)
## Summary

When we fall through to parsing, the comment-detection rule is a
significant portion of lint time. This PR adds an additional fast
heuristic whereby we abort if a comment contains two consecutive name
tokens (via the zero-allocation lexer). For the `ctypeslib.py`, which
has a few cases that are now caught by this, it's a 2.5x speedup for the
rule (and a 20% speedup for token-based rules).
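
To make the heuristic concrete, here is a rough Python sketch. It uses the standard-library `tokenize` module as a stand-in for Ruff's zero-allocation lexer, and the function name `looks_like_prose` is hypothetical. Keywords are excluded because `tokenize` reports them as `NAME` tokens, which would otherwise misclassify code like `import os`.

```python
import io
import keyword
import tokenize


def looks_like_prose(comment_body):
    """Return True if the text contains two consecutive (non-keyword) name tokens."""
    previous_was_name = False
    try:
        for tok in tokenize.generate_tokens(io.StringIO(comment_body).readline):
            is_name = tok.type == tokenize.NAME and not keyword.iskeyword(tok.string)
            if is_name and previous_was_name:
                return True  # e.g. "fix this": not valid code, skip parsing
            previous_was_name = is_name
    except (tokenize.TokenError, SyntaxError):
        # Untokenizable text: make no claim and let the slower path decide.
        return False
    return False
```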
2024-02-05 11:00:18 -05:00
Zanie Blue
84aea7f0c8
Drop __get__ and __set__ from unnecessary-dunder-call (#9791)
These are for descriptors, which affect the behavior of the object _as a
property_; I do not think they should be called directly, but there is no
alternative when working with the object directly.
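
For illustration, a small example of the situation described: once you hold the descriptor object itself (rather than accessing it through an instance), calling `__get__` and `__set__` is the only way to drive it. The `Celsius` and `Thermometer` classes are made up for this sketch.

```python
class Celsius:
    """Hypothetical data descriptor storing a float on the owning instance."""

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj._celsius

    def __set__(self, obj, value):
        obj._celsius = float(value)


class Thermometer:
    temperature = Celsius()

    def __init__(self):
        self._celsius = 0.0


thermometer = Thermometer()
# Fetch the descriptor itself, bypassing the attribute protocol.
descriptor = vars(Thermometer)["temperature"]
# With the descriptor in hand, the dunder methods are the only interface.
descriptor.__set__(thermometer, 21)
reading = descriptor.__get__(thermometer, Thermometer)
```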

Closes https://github.com/astral-sh/ruff/issues/9789
2024-02-05 10:54:29 -05:00
Shaygan Hooshyari
b47f85eb69
Preview Style: Format module level docstring (#9725)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-02-05 15:03:34 +00:00
Micha Reiser
80fc02e7d5
Don't trim last empty line in docstrings (#9813) 2024-02-05 13:29:24 +00:00
dependabot[bot]
55d0e1148c
Bump memchr from 2.6.4 to 2.7.1 (#9827)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-05 13:24:23 +00:00
dependabot[bot]
1de945e3eb
Bump is-macro from 0.3.4 to 0.3.5 (#9829) 2024-02-05 13:11:15 +00:00
dependabot[bot]
e277ba20da
Bump pyproject-toml from 0.8.1 to 0.8.2 (#9826) 2024-02-05 14:05:52 +01:00
dependabot[bot]
2e836a4cbe
Bump toml from 0.8.8 to 0.8.9 (#9828) 2024-02-05 14:05:27 +01:00
dependabot[bot]
57d6cdb8d3
Bump itertools from 0.12.0 to 0.12.1 (#9830) 2024-02-05 14:03:06 +01:00
Charlie Marsh
602f8b8250
Remove CST-based fixer for C408 (#9822)
## Summary

We have to keep the fixer for a specific case: `dict` calls that include
keyword-argument members.
2024-02-04 22:26:51 -05:00
Charlie Marsh
a6bc4b2e48
Remove CST-based fixers for C405 and C409 (#9821) 2024-02-05 02:17:34 +00:00
Charlie Marsh
c5fa0ccffb
Remove CST-based fixers for C400, C401, C410, and C418 (#9819) 2024-02-04 21:00:11 -05:00
Charlie Marsh
dd77d29d0e
Remove LibCST-based fixer for C403 (#9818)
## Summary

Experimenting with rewriting one of the comprehension fixes _without_
LibCST.
2024-02-04 20:08:19 -05:00
Charlie Marsh
ad0121660e
Run dunder method rule on methods directly (#9815)
This stood out in the flamegraph and I realized it requires us to
traverse over all statements in the class (unnecessarily).
2024-02-04 14:24:57 -05:00
Charlie Marsh
5c99967c4d
Short-circuit typing matches based on imports (#9800) 2024-02-04 14:06:44 -05:00
Charlie Marsh
c53aae0b6f
Add our own ignored-names abstractions (#9802)
## Summary

These run over nearly every identifier. It's rare to override them, so
when not provided, we can just use a match against the hardcoded default
set.
2024-02-03 09:48:07 -05:00
Charlie Marsh
2352de2277
Slight speed-up for lowercase and uppercase identifier checks (#9798)
It turns out that for ASCII identifiers, this is nearly 2x faster:

```text
Parser/before     time:   [15.388 ns 15.395 ns 15.406 ns]
Parser/after      time:   [8.3786 ns 8.5821 ns 8.7715 ns]
```
2024-02-03 14:40:41 +00:00
Jane Lewis
e0a6034cbb
Implement RUF027: Missing F-String Syntax lint (#9728)
## Summary

Fixes #8151

This PR implements a new rule, `RUF027`.

## What it does
Checks for strings that contain f-string syntax but are not f-strings.

### Why is this bad?
An f-string missing an `f` at the beginning won't format anything, and
will instead treat the interpolation syntax as literal text.

### Example

```python
name = "Sarah"
dayofweek = "Tuesday"
msg = "Hello {name}! It is {dayofweek} today!"
```

It should instead be:
```python
name = "Sarah"
dayofweek = "Tuesday"
msg = f"Hello {name}! It is {dayofweek} today!"
```

## Heuristics
Since there are many possible string literals which contain syntax
similar to f-strings yet are not intended to be,
this lint will disqualify any literal that satisfies any of the
following conditions:
1. The string literal is a standalone expression. For example, a
docstring.
2. The literal is part of a function call with keyword arguments that
match at least one variable (for example: `format("Message: {value}",
value = "Hello World")`)
3. The literal (or a parent expression of the literal) has a direct
method call on it (for example: `"{value}".format(...)`)
4. The string has no `{...}` expression sections, or uses invalid
f-string syntax.
5. The string references variables that are not in scope, or it doesn't
capture variables at all.
6. Any format specifiers in the potential f-string are invalid.
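
A simplified sketch of a few of these heuristics (1, 3, 4, and 5) using Python's `ast` and `string.Formatter`. This is an approximation for illustration, not the rule's implementation: the function name is hypothetical, and "in scope" is reduced to "assigned somewhere in the module".

```python
import ast
import string


def missing_fstring_candidates(source):
    """Return string literals that look like f-strings missing their `f` prefix."""
    tree = ast.parse(source)
    # Crude stand-in for scope analysis: every name assigned anywhere in the module.
    bound = {
        n.id for n in ast.walk(tree)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
    }
    skipped = set()
    for node in ast.walk(tree):
        # Heuristic 1: standalone expression statements, such as docstrings.
        if isinstance(node, ast.Expr) and isinstance(node.value, ast.Constant):
            skipped.add(id(node.value))
        # Heuristic 3: a direct method call on the literal, e.g. "...".format(...).
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Constant)
        ):
            skipped.add(id(node.func.value))
    found = []
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Constant) and isinstance(node.value, str)):
            continue
        if id(node) in skipped:
            continue
        try:
            fields = [name for _, name, _, _ in string.Formatter().parse(node.value) if name]
        except ValueError:
            continue  # Heuristic 4: invalid replacement-field syntax.
        # Heuristic 5: must reference at least one variable, all of them bound.
        if fields and all(field in bound for field in fields):
            found.append(node.value)
    return found
```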

## Test Plan

I created a new test file, `RUF027.py`, which is both an example of what
the lint should catch and a way to test edge cases that may trigger
false positives.
2024-02-03 00:21:03 +00:00
Emil Telstad
25d93053da
Update max-pos-args example to max-positional-args. (#9797) 2024-02-02 20:29:13 +00:00
Charlie Marsh
ee5b07d4ca
Skip empty lines when determining base indentation (#9795)
## Summary

It turns out we saw a panic in cases when dedenting blocks like the `def
wrapper` here:

```python
def instrument_url(f: UrlFuncT) -> UrlFuncT:
    # TODO: Type this with ParamSpec to preserve the function signature.
    if not INSTRUMENTING:  # nocoverage -- option is always enabled; should we remove?
        return f
    else:

        def wrapper(
            self: "ZulipTestCase", url: str, info: object = {}, **kwargs: Union[bool, str]
        ) -> HttpResponseBase:
```

This happened because we relied on the first line to determine the
indentation, instead of the first non-empty line.
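
The fix amounts to something like the following sketch (a hypothetical Python helper, not the actual Rust code): scan for the first line with non-whitespace content and take its leading whitespace.

```python
def base_indentation(block):
    """Return the leading whitespace of the first non-empty line in `block`."""
    for line in block.splitlines():
        if line.strip():
            # Skip blank lines such as the one after `else:` in the example above.
            return line[: len(line) - len(line.lstrip())]
    return ""
```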

## Test Plan

`cargo test`
2024-02-02 19:42:47 +00:00
Charlie Marsh
e50603caf6
Track top-level module imports in the semantic model (#9775)
## Summary

This is a simple idea to avoid unnecessary work in the linter,
especially for rules that run on all name and/or all attribute nodes.
Imagine a rule like the NumPy deprecation check. If the user never
imported `numpy`, we should be able to skip that rule entirely --
whereas today, we do a `resolve_call_path` check on _every_ name in the
file. It turns out that there's basically a finite set of modules that
we care about, so we now track imports on those modules as explicit
flags on the semantic model. In rules that can _only_ ever trigger if
those modules were imported, we add a dedicated and extremely cheap
check to the top of the rule.
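
A rough Python sketch of the idea, assuming a hypothetical `SeenModules` flag set and `ast` in place of Ruff's semantic model. The tracked module names below are examples only.

```python
import ast
from enum import Flag, auto


class SeenModules(Flag):
    NONE = 0
    NUMPY = auto()
    PANDAS = auto()


# Hypothetical finite set of modules that some rule cares about.
TRACKED = {"numpy": SeenModules.NUMPY, "pandas": SeenModules.PANDAS}


def seen_modules(source):
    """Collect flags for tracked modules imported anywhere in `source`."""
    seen = SeenModules.NONE
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            roots = [node.module.split(".")[0]]
        else:
            continue
        for root in roots:
            seen |= TRACKED.get(root, SeenModules.NONE)
    return seen


def should_run_numpy_rule(source):
    # Cheap gate at the top of a rule: skip it if `numpy` was never imported.
    return SeenModules.NUMPY in seen_modules(source)
```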

We could consider generalizing this to all modules, but I would expect
that not to be much faster than `resolve_call_path`, which is just a
hash map lookup on `TextSize` anyway.

It would also be nice to make this declarative, such that rules could
declare the modules they care about, and the analyzers could call the
rules as appropriate. But I don't think such a design should block
merging this.
2024-02-02 14:37:20 -05:00
Charlie Marsh
c3ca34543f
Skip LibCST parsing for standard dedent adjustments (#9769)
## Summary

Often, when fixing, we need to dedent a block of code (e.g., if we
remove an `if` and dedent its body). Today, we use LibCST to parse and
adjust the indentation, which is really expensive -- but this is only
really necessary if the block contains a multiline string, since naively
adjusting the indentation for such a string can change the whitespace
_within_ the string.

This PR uses a simple dedent implementation for cases in which the block
doesn't intersect with a multi-line string (or an f-string, since we
don't support tracking multi-line strings for f-strings right now).
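
As a sketch of the decision described above, assuming the caller already knows the byte ranges of multi-line strings (and f-strings, treated conservatively): the function name and the `None` fallback signal are hypothetical, and `textwrap.dedent` stands in for the simple dedent implementation.

```python
import textwrap


def dedent_block(source, start, end, multiline_string_ranges):
    """Dedent source[start:end], or return None to signal the slower fallback.

    `multiline_string_ranges` is a list of (start, end) offsets, assumed to be
    computed elsewhere (for example, by an indexer over the token stream).
    """
    for string_start, string_end in multiline_string_ranges:
        if string_start < end and start < string_end:
            # Naive dedenting could change whitespace inside the string.
            return None
    return textwrap.dedent(source[start:end])
```

A `None` result here would correspond to falling back to the more expensive LibCST-based adjustment.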

We could improve this even further by using the ranges to guide the
dedent function, such that we don't apply the dedent if the line starts
within a multiline string. But that would also need to take f-strings
into account, which is a little tricky.

## Test Plan

`cargo test`
2024-02-02 18:13:46 +00:00
Micha Reiser
4f7fb566f0
Range formatting: Fix invalid syntax after parenthesizing expression (#9751) 2024-02-02 17:56:25 +01:00
Jordan Danford
50bfbcf568
README.md: add missing "your" in support section, add alt text to Astral logo (#9787) 2024-02-02 09:09:19 -06:00
Charlie Marsh
ea1c089652
Use AhoCorasick to speed up quote match (#9773)
## Summary

When I was looking at the v0.2.0 release, this method showed up in a
CodSpeed regression (we were calling it more), so I decided to quickly
look at speeding it up. @BurntSushi suggested using Aho-Corasick, and it
looks like it's about 7 or 8x faster:

```text
Parser/AhoCorasick      time:   [8.5646 ns 8.5914 ns 8.6191 ns]
Parser/Iterator         time:   [64.992 ns 65.124 ns 65.271 ns]
```

## Test Plan

`cargo test`
2024-02-02 09:57:39 -05:00